Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine

Niloofar Yousefi Moteghaed, Keivan Maghooli, Masoud Garshasbi

DOI: 10.4103/jmss.JMSS_21_17

Abstract


Gene expression data are characteristically high dimensional, with a small sample size relative
to the number of features, and the variability inherent in biological processes adds to the
difficulty of analysis. Selecting highly discriminative features decreases the computational cost
and complexity of the classifier and improves its reliability when predicting the class of new
samples. The present study used hybrid particle swarm optimization and genetic algorithms for gene
selection and a fuzzy support vector machine (SVM) as the classifier. Fuzzy logic is used to infer
the importance of each sample in the training phase and to decrease the outlier sensitivity of the
system, which increases the classifier's ability to generalize. A decision-tree algorithm was
applied to the most frequently selected genes to develop a set of rules for each type of cancer.
The algorithm was further improved by finding the best parameters for the classifier during the
training phase, without the need for trial-and-error by the user. The proposed approach was tested
on four benchmark gene expression profiles. The classification accuracy was 100% for leukemia
data, 96.67% for colon cancer, and 98% for breast cancer. The radial basis function was the
best-performing kernel for training the SVM classifier. The experimental results show that the
proposed algorithm can decrease the dimensionality of the dataset, determine the most informative
gene subset, and improve classification accuracy using the optimal parameters of the classifier
without user intervention.
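
To make the idea of sample importance concrete, the following is a minimal sketch of one common way to assign fuzzy memberships in a fuzzy SVM: each training sample is weighted by its distance to its own class centroid, so that outliers contribute less to training. This distance-based scheme, the function name `fuzzy_memberships`, and the `delta` parameter are assumptions made for illustration; the abstract does not state which membership function the authors used.

```python
import math


def fuzzy_memberships(samples, labels, delta=1e-6):
    """Return one weight in (0, 1] per sample (illustrative scheme, not the paper's).

    A sample close to its class centroid gets a weight near 1; the sample
    farthest from the centroid gets a weight near 0. `delta` is a small
    positive constant that keeps every weight strictly positive.
    """
    # Group samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Compute the centroid (per-feature mean) of each class.
    centroids = {}
    for y, xs in by_class.items():
        dim = len(xs[0])
        centroids[y] = [sum(x[j] for x in xs) / len(xs) for j in range(dim)]

    # Euclidean distance of each sample to its own class centroid.
    dists = [math.dist(x, centroids[y]) for x, y in zip(samples, labels)]

    # Class radius: the largest distance observed within each class.
    radius = {}
    for d, y in zip(dists, labels):
        radius[y] = max(radius.get(y, 0.0), d)

    # Membership decreases linearly with distance from the centroid.
    return [1.0 - d / (radius[y] + delta) for d, y in zip(dists, labels)]


# Toy expression values for one class: the third sample is an outlier.
toy_samples = [[0.0, 0.0], [0.0, 2.0], [0.0, 10.0]]
toy_labels = ["tumor", "tumor", "tumor"]
toy_weights = fuzzy_memberships(toy_samples, toy_labels)
```

Weights of this kind could then be supplied to an SVM implementation that supports per-sample weighting, for example through the `sample_weight` argument of `fit` in scikit-learn's `SVC`.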

Keywords


Cancer classification, fuzzy support vector machine, gene expression, genetic algorithm, particle swarm optimization algorithm




ISSN: 2228-7477