Misclassification-Driven Sample Relabeling for Supervised Kernel Principal Component Analysis

Maciej Adamiak; Krzysztof Ślot

Publication date: 24.03.2017

Schedae Informaticae, 2016, Volume 25, pp. 25-35

https://doi.org/10.4467/20838476SI.16.002.6183

Abstract

Supervised kernel Principal Component Analysis (S-kPCA) is a method for producing discriminative feature spaces that provide nonlinear decision regions, well suited for handling real-world problems. This paper proposes a modification to the original S-kPCA concept, aimed at improving class separation in the resulting feature spaces. This is accomplished by identifying outliers (understood here as misclassified samples) and by an appropriate reformulation of the original S-kPCA problem. The proposed idea is to replace the binary class labels used in the original method with real-valued ones, derived using a sample-relabeling scheme aimed at preventing potential data classification problems. The concept has been tested on three standard pattern recognition datasets. Classification performance in feature spaces derived using the introduced methodology improves by 4–16% with respect to the original S-kPCA method, depending on the dataset.
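
The idea of feeding real-valued labels into S-kPCA can be illustrated with a minimal sketch. It assumes the commonly used kernel formulation of supervised PCA, in which the projection coefficients solve the generalized eigenproblem K H L H K b = lambda K b with a label kernel L = y y^T. The relabeling rule shown here (damping the labels of samples that a nearest-class-mean classifier gets wrong) is a hypothetical stand-in, since the abstract does not specify the actual scheme; the function names, the `damping` and `reg` parameters, and the synthetic data are likewise assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def skpca_fit(K, y, n_components=1, reg=1e-6):
    """Solve K H L H K b = lambda (K + reg I) b, with label kernel L = y y^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    L = np.outer(y, y)                             # labels may be real-valued
    Q = K @ H @ L @ H @ K
    C = np.linalg.cholesky(K + reg * np.eye(n))    # K + reg I = C C^T
    # Reduce to an ordinary symmetric eigenproblem: M = C^{-1} Q C^{-T}
    M = np.linalg.solve(C, np.linalg.solve(C, Q).T)
    M = 0.5 * (M + M.T)                            # remove round-off asymmetry
    eigvals, V = np.linalg.eigh(M)
    top = np.argsort(eigvals)[::-1][:n_components]
    return np.linalg.solve(C.T, V[:, top])         # b = C^{-T} v

def relabel_misclassified(z, y, damping=0.5):
    """Hypothetical rule (NOT the paper's scheme): shrink the labels of samples
    that a nearest-class-mean classifier on the projection z misclassifies."""
    m_pos, m_neg = z[y > 0].mean(), z[y < 0].mean()
    pred = np.where(np.abs(z - m_pos) <= np.abs(z - m_neg), 1.0, -1.0)
    y_new = y.astype(float).copy()
    y_new[pred != np.sign(y)] *= damping
    return y_new

# Synthetic two-class data (an assumption; not one of the paper's datasets).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (20, 2)), rng.normal(1.0, 1.0, (20, 2))])
y = np.array([-1.0] * 20 + [1.0] * 20)

K = rbf_kernel(X, X)
beta = skpca_fit(K, y)                   # original S-kPCA with binary labels
z = (K @ beta)[:, 0]                     # training samples in the feature space
y_real = relabel_misclassified(z, y)     # real-valued labels after relabeling
beta_relabeled = skpca_fit(K, y_real)    # S-kPCA re-solved with the new labels
z_relabeled = (K @ beta_relabeled)[:, 0]
```

New samples would be projected the same way, by multiplying the kernel matrix between the new and the training samples by the fitted coefficients. The small `reg` term is added only to keep the Cholesky factorization numerically stable.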

Keywords

pattern recognition, feature extraction, kernel methods, supervised kernel PCA.

Information

Article type: Original article

Authors

Maciej Adamiak

Faculty of Geographical Sciences, University of Lodz
ul. Narutowicza 65, 90-131 Łódź, Poland

Krzysztof Ślot

Article status: Open

Licence: None

Percentage share of authors:

Maciej Adamiak (Author) - 50%

Krzysztof Ślot (Author) - 50%

Publication languages:

English
