An Optimized Framework for Cancer Prediction Using Immunosignature

Fatemeh Safaei Firouzabadi, Alireza Vard, Mohammadreza Sehhati, Mohammadreza Mohebian

DOI: 10.4103/jmss.JMSS_2_18

Abstract


Cancer is a complex disease that can engage the patient's immune system. In this regard,
determining distinct immunosignatures for various cancers has received increasing interest
recently. However, the prediction accuracy and reproducibility of existing computational methods
are limited. In this article, we introduce a robust method for predicting eight types of cancer:
astrocytoma, breast cancer, multiple myeloma, lung cancer, oligodendroglia, ovarian cancer, advanced
pancreatic cancer, and Ewing sarcoma. In the proposed scheme, the database is first normalized
using a dictionary of normalization methods combined with particle swarm optimization (PSO),
which selects the best normalization method for each feature. Then, statistical feature selection
methods are used to isolate discriminative features, which are further refined by PSO and given
appropriate weights as inputs to the classification system. Finally, support vector machines,
a decision tree, and a multilayer perceptron neural network are used as classifiers. The performance
of the hybrid predictor was assessed using the holdout method. Under this evaluation, the minimum
sensitivity, specificity, precision, and accuracy of the proposed algorithm across the three
classifiers were 92.4 ± 1.1, 99.1 ± 1.1, 90.6 ± 2.1, and 98.3 ± 1.0, respectively. The proposed
algorithm treats each feature individually, choosing a normalization method and a weight for it.
Thus, the proposed algorithm can be used as a promising framework for cancer prediction with
immunosignature.
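
As an illustration of the four reported measures, the following minimal sketch computes sensitivity, specificity, precision, and accuracy per class from a multiclass confusion matrix. It assumes a one-vs-rest treatment of each cancer type, which the abstract does not specify; the function name `one_vs_rest_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def one_vs_rest_metrics(confusion):
    """Per-class sensitivity, specificity, precision, and accuracy.

    confusion[i][j] is the number of holdout samples whose true class is i
    and whose predicted class is j. Each class is scored against all others
    (one-vs-rest), which is an assumption of this sketch.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]                                # true positives
        fn = sum(confusion[k]) - tp                         # missed cases of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp    # other classes called k
        tn = total - tp - fn - fp                           # everything else
        results.append({
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "accuracy": (tp + tn) / total if total else 0.0,
        })
    return results


# Hypothetical 3-class confusion matrix (illustrative counts only).
example = [
    [9, 1, 0],
    [0, 8, 2],
    [1, 0, 9],
]
metrics = one_vs_rest_metrics(example)
for label, m in enumerate(metrics):
    print(label, {name: round(value, 3) for name, value in m.items()})
```

Taking the minimum of each measure over the classes and classifiers would give a worst-case summary of the kind quoted in the abstract, though the exact aggregation used in the study is not stated here.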






ISSN : 2228-7477