<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xml:lang="en"
    xmlns:mml="http://www.w3.org/1998/Math/MathML"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <processing-meta tagset-family="jats" base-tagset="publishing" mathml-version="2.0" table-model="xhtml"/>
    <front>
                        
                        <journal-meta>
            <issn>1732-3916</issn>
                                </journal-meta>
        <article-meta>
            <title-group>
                                    <article-title>Feature Selection and Classification Pairwise Combinations for High-dimensional Tumour Biomedical Datasets</article-title>
                            </title-group>

                        <contrib-group>
                                                            <contrib contrib-type="author" corresp="yes">
                            <name>
                                <surname>Wosiak</surname>
                                <given-names>Agnieszka</given-names>
                            </name>
                            <role>author</role>
                                                                                                                                    <xref ref-type="aff" rid="aff-1"/>
                                                                                        <xref ref-type="corresp" rid="cor-1"/>
                        </contrib>
                                            <contrib contrib-type="author" corresp="no">
                            <name>
                                <surname>Dziomdziora</surname>
                                <given-names>Agata</given-names>
                            </name>
                            <role>author</role>
                                                                                                                                    <xref ref-type="aff" rid="aff-2"/>
                                                                                        <xref ref-type="corresp" rid="cor-2"/>
                        </contrib>
                                                </contrib-group>

                                                                                        <aff id="aff-1">
                    <institution-wrap>
                        <institution>Institute of Information Technology, Lodz University of Technology</institution>
                                            </institution-wrap>
                </aff>
                                                                        
            <author-notes>
                                    <corresp id="cor-1">Correspondence to: Agnieszka Wosiak <email>agnieszka.wosiak@p.lodz.pl</email></corresp>
                                    <corresp id="cor-2">Correspondence to: Agata Dziomdziora</corresp>
                            </author-notes>

                            <pub-date date-type="pub" publication-format="electronic" iso-8601-date="2016-04-11">
                    <day>11</day>
                    <month>04</month>
                    <year>2016</year>
                </pub-date>
            
            <volume>24</volume>
            <issue>2015</issue>
                        <fpage>53</fpage>
                                    <lpage>62</lpage>
            
            <permissions>
                <copyright-statement>Copyright &#x00A9; 2016</copyright-statement>
                                    <copyright-year>2016</copyright-year>
                            </permissions>

            <funding-group specific-use="Crossref">
                <funding-statement></funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <p>This paper concerns the classification of high-dimensional, small-sample-size biomedical data and feature selection aimed at reducing the dimensionality of microarray data. The research compares pairwise combinations of six classification strategies (decision trees, logistic model trees, Bayes network, Naïve Bayes, k-nearest neighbours and the sequential minimal optimization (SMO) algorithm for training support vector machines) with seven attribute selection methods: Correlation-based Feature Selection, chi-squared, information gain, gain ratio, symmetrical uncertainty, ReliefF and SVM-RFE (Support Vector Machine-Recursive Feature Elimination). In this study, the SVM-RFE feature selection technique combined with the SMO classifier demonstrated the potential to classify both binary and multiclass high-dimensional sets of tumour specimens accurately and efficiently.</p>
    </body>
    <back>
                    <ref-list>
                                                                                <ref id="B1">
                            <label>1</label>
                            <article-title>[1] Chang C.-W., Cheng W.-C., Chen C.-R., Shu W.-Y., Tsai M.-L., et al., Identification of Human Housekeeping Genes and Tissue-Selective Genes by Microarray Meta-Analysis. PLoS ONE, 2011, 6(7): e22859, doi:10.1371/journal.pone.0022859.</article-title>
                        </ref>
                                                                                                    <ref id="B2">
                            <label>2</label>
                            <article-title>[2] Dougherty E.R., Hua J., Sima C., Performance of Feature Selection Methods. Curr. Genomics. 2009, 10, pp. 365–374.</article-title>
                        </ref>
                                                                                                    <ref id="B3">
                            <label>3</label>
                            <article-title>[3] Eisenberg E., Levanon E.Y., Human housekeeping genes, revisited. Trends in Genetics, October 2013, 29(10), pp. 569–574, doi:10.1016/j.tig.2013.05.010.</article-title>
                        </ref>
                                                                                                    <ref id="B4">
                            <label>4</label>
                            <article-title>[4] Guyon I., Weston J., Barnhill S., Vapnik V., Gene selection for cancer classification using support vector machines. Machine Learning, 2002, 46, pp. 389–422.</article-title>
                        </ref>
                                                                                                    <ref id="B5">
                            <label>5</label>
                            <article-title>[5] Janecek A., Gansterer W., Demel W., Ecker G., On the relationship between feature selection and classification accuracy. Journal of Machine Learning and Research, 2008, 4, pp. 90–105.</article-title>
                        </ref>
                                                                                                    <ref id="B6">
                            <label>6</label>
                            <article-title>[6] Kumar A.P., Valsala P., Feature Selection for high Dimensional DNA Microarray data using hybrid approaches. Bioinformation, 2013, 9(16), pp. 824–828.</article-title>
                        </ref>
                                                                                                    <ref id="B7">
                            <label>7</label>
                            <article-title>[7] Li X., Lu H., Wang M., A Hybrid Gene Selection Method for Multi-category Tumor Classification using Microarray Data. Int. J. Bioautomation, 2013, 17(4), pp. 249–258.</article-title>
                        </ref>
                                                                                                    <ref id="B8">
                            <label>8</label>
                            <article-title>[8] Li X., Peng S., Zhan X., Zhang J., Xu Y., Comparison of feature selection methods for multiclass cancer classification based on microarray data. Proceedings of the 4th International Conference on Biomedical Engineering and Informatics (BMEI), 2011, 3, pp. 1692–1696.</article-title>
                        </ref>
                                                                                                    <ref id="B9">
                            <label>9</label>
                            <article-title>[9] Liu G., Kong L., Gopalakrishnan V., A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets. AMIA Summits on Translational Science Proceedings, 2012, pp. 52–61.</article-title>
                        </ref>
                                                                                                    <ref id="B10">
                            <label>10</label>
                            <article-title>[10] Podolak I. T., Roman A., CORES: fusion of supervised and unsupervised training methods for a multi-class classification problem. Pattern Analysis and Applications, 2011, 14, pp. 395–413.</article-title>
                        </ref>
                                                                                                    <ref id="B11">
                            <label>11</label>
                            <article-title>[11] Saeys Y., Inza I., Larrañaga P., A review of feature selection techniques in bioinformatics. Bioinformatics, 2007, 23(19), pp. 2507–2517.</article-title>
                        </ref>
                                                                                                    <ref id="B12">
                            <label>12</label>
                            <article-title>[12] Sáez J.A., Luengo J., Stefanowski J., Herrera F., SMOTE-IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a resampling method with filtering. Information Sciences, 10 January 2015, 291, pp. 184–203, http://dx.doi.org/10.1016/j.ins.2014.08.051.</article-title>
                        </ref>
                                                                                                    <ref id="B13">
                            <label>13</label>
                            <article-title>[13] Trevino V., Falciani F., Barrera-Saldana H.A., DNA Microarrays: a Powerful Genomic Tool for Biomedical and Clinical Research. Molecular Medicine, 2007, 13(9–10), pp. 527–541.</article-title>
                        </ref>
                                                                                                    <ref id="B14">
                            <label>14</label>
                            <article-title>[14] Wang X., Gotoh O., A Robust Gene Selection Method for Microarray-based Cancer Classification. Cancer Informatics, 2010, 9, pp. 15–30.</article-title>
                        </ref>
                                                                                                    <ref id="B15">
                            <label>15</label>
                            <article-title>[15] Wang Y., Tetko I.V., Hall M.A., Frank E., Facius A., Mayer K.F., Gene selection from microarray data for cancer classification–a machine learning approach. Comput. Biol. Chem., 2005, 29, pp. 37–46.</article-title>
                        </ref>
                                                                                                    <ref id="B16">
                            <label>16</label>
                            <article-title>[16] Woźniak M., Graña M., Corchado E., A survey of multiple classifier systems as hybrid systems. Information Fusion, 2014, 16, pp. 3–17.</article-title>
                        </ref>
                                                                                                    <ref id="B17">
                            <label>17</label>
                            <article-title>[17] Zhang H., Wang H., Dai Z., Chen M.S., Yuan Z., Improving accuracy for cancer classification with a new algorithm for genes selection. BMC Bioinformatics, 2012, 13 (298), pp. 1.</article-title>
                        </ref>
                                                </ref-list>
            </back>
</article>
