alexa Feature selection and nearest centroid classification for protein mass spectrometry.
Bioinformatics & Systems Biology

Bioinformatics & Systems Biology

Journal of Proteomics & Bioinformatics

Author(s): Levner I

Abstract Share this page

Abstract BACKGROUND: The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry. RESULTS: This study examines the performance of the nearest centroid classifier coupled with the following feature selection algorithms. Student-t test, Kolmogorov-Smirnov test, and the P-test are univariate statistics used for filter-based feature ranking. From the wrapper approaches we tested sequential forward selection and a modified version of sequential backward selection. Embedded approaches included shrunken nearest centroid and a novel version of boosting based feature selection we developed. In addition, we tested several dimensionality reduction approaches, namely principal component analysis and principal component analysis coupled with linear discriminant analysis. To fairly assess each algorithm, evaluation was done using stratified cross validation with an internal leave-one-out cross-validation loop for automated feature selection. Comprehensive experiments, conducted on five popular cancer data sets, revealed that the less advocated sequential forward selection and boosted feature selection algorithms produce the most consistent results across all data sets. In contrast, the state-of-the-art performance reported on isolated data sets for several of the studied algorithms, does not hold across all data sets. CONCLUSION: This study tested a number of popular feature selection methods using the nearest centroid classifier and found that several reportedly state-of-the-art algorithms in fact perform rather poorly when tested via stratified cross-validation. The revealed inconsistencies provide clear evidence that algorithm evaluation should be performed on several data sets using a consistent (i.e., non-randomized, stratified) cross-validation procedure in order for the conclusions to be statistically sound.
This article was published in BMC Bioinformatics and referenced in Journal of Proteomics & Bioinformatics

Relevant Expert PPTs

Relevant Speaker PPTs

Recommended Conferences

  • 9th International Conference on Bioinformatics
    October 23-24, 2017 Paris, France
  • 9th International Conference and Expo on Proteomics
    October 23-25, 2017 Paris, France

Relevant Topics

Peer Reviewed Journals
 
Make the best use of Scientific Research and information from our 700 + peer reviewed, Open Access Journals
International Conferences 2017-18
 
Meet Inspiring Speakers and Experts at our 3000+ Global Annual Meetings

Contact Us

Agri, Food, Aqua and Veterinary Science Journals

Dr. Krish

[email protected]

1-702-714-7001 Extn: 9040

Clinical and Biochemistry Journals

Datta A

[email protected]

1-702-714-7001Extn: 9037

Business & Management Journals

Ronald

[email protected]

1-702-714-7001Extn: 9042

Chemical Engineering and Chemistry Journals

Gabriel Shaw

[email protected]

1-702-714-7001 Extn: 9040

Earth & Environmental Sciences

Katie Wilson

[email protected]

1-702-714-7001Extn: 9042

Engineering Journals

James Franklin

[email protected]

1-702-714-7001Extn: 9042

General Science and Health care Journals

Andrea Jason

[email protected]

1-702-714-7001Extn: 9043

Genetics and Molecular Biology Journals

Anna Melissa

[email protected]

1-702-714-7001 Extn: 9006

Immunology & Microbiology Journals

David Gorantl

[email protected]

1-702-714-7001Extn: 9014

Informatics Journals

Stephanie Skinner

[email protected]

1-702-714-7001Extn: 9039

Material Sciences Journals

Rachle Green

[email protected]

1-702-714-7001Extn: 9039

Mathematics and Physics Journals

Jim Willison

[email protected]

1-702-714-7001 Extn: 9042

Medical Journals

Nimmi Anna

[email protected]

1-702-714-7001 Extn: 9038

Neuroscience & Psychology Journals

Nathan T

[email protected]

1-702-714-7001Extn: 9041

Pharmaceutical Sciences Journals

John Behannon

[email protected]

1-702-714-7001Extn: 9007

Social & Political Science Journals

Steve Harry

[email protected]

1-702-714-7001 Extn: 9042

 
© 2008-2017 OMICS International - Open Access Publisher. Best viewed in Mozilla Firefox | Google Chrome | Above IE 7.0 version
adwords