Abstract

Extracting a subset of informative genes from microarray expression data is a critical data preparation step in cancer classification and other biological function analyses. Of the many algorithms developed for this task, the Support Vector Machine - Recursive Feature Elimination (SVM-RFE) algorithm is one of the best gene feature selection algorithms. It is assumed that a smaller "filter-out" factor in the SVM-RFE, which results in fewer gene features being eliminated in each recursion, should lead to extraction of a better gene subset. Our simulations have shown that this assumption is not always correct: the SVM-RFE is highly sensitive to the "filter-out" factor, which makes it an unstable algorithm. To select a set of key gene features for reliable prediction of cancer types or subtypes and for other applications, a new two-stage SVM-RFE algorithm has been developed. The first stage is designed to eliminate most of the irrelevant, redundant and noisy genes while keeping information loss small. The second stage then performs a fine selection to produce the final gene subset. The two-stage SVM-RFE overcomes the instability problem of the SVM-RFE and thereby achieves better algorithm utility. Based on our analysis of three publicly available microarray expression datasets, we have demonstrated that the two-stage SVM-RFE is significantly more accurate and more reliable than the SVM-RFE and three correlation-based methods. Furthermore, the two-stage SVM-RFE is computationally efficient because its time complexity is O(d log₂ d), where d is the size of the original gene set.
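
To make the role of the "filter-out" factor concrete, the following is a minimal sketch of how it shapes the sequence of gene-set sizes in a recursive elimination loop. It is not the authors' implementation. The function names (`elimination_schedule`, `two_stage_schedule`), the rounding rule, the `intermediate` cut-off and the one-gene-per-step second stage are all assumptions made for illustration, and no SVM is trained here; only the sizes are tracked.

```python
import math


def elimination_schedule(d, filter_out, target=1):
    """Return the gene-set sizes visited by an RFE-style loop.

    Assumption: each recursion drops ceil(filter_out * current_size)
    of the lowest-ranked genes (at least one) and never goes below
    `target`. The ranking itself (SVM weights) is not modelled.
    """
    if not 0 < filter_out < 1:
        raise ValueError("filter_out must lie strictly between 0 and 1")
    if target < 1 or target > d:
        raise ValueError("target must satisfy 1 <= target <= d")
    sizes = [d]
    n = d
    while n > target:
        drop = max(1, math.ceil(filter_out * n))
        n = max(target, n - drop)
        sizes.append(n)
    return sizes


def two_stage_schedule(d, coarse_factor, intermediate, target=1):
    """Hypothetical two-stage schedule.

    Stage 1: coarse elimination from d genes down to `intermediate`.
    Stage 2: fine elimination, assumed here to remove one gene per step.
    """
    stage1 = elimination_schedule(d, coarse_factor, target=intermediate)
    stage2 = list(range(stage1[-1] - 1, target - 1, -1))
    return stage1, stage2


if __name__ == "__main__":
    print(elimination_schedule(16, 0.5))
    print(two_stage_schedule(16, 0.5, 4))
```

Under these assumptions, a fixed fractional factor gives a number of recursions that grows roughly with the logarithm of d, while a smaller factor visits many more intermediate sizes and so requires more retraining rounds.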

Author and article information

PubMed ID: 17666757

DOI: 10.1109/TCBB.2007.70224

ScienceOpen disciplines: Chemistry

Keywords: Algorithms; Artificial Intelligence; Diagnosis, Computer-Assisted; methods; Gene Expression Profiling; Humans; Neoplasm Proteins; metabolism; Neoplasms; diagnosis; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; Tumor Markers, Biological

Development of two-stage SVM-RFE gene selection strategy for microarray expression data analysis.
