Cancer Classification Utilizing Voting Classifier with Ensemble Feature Selection Method and Transcriptomic Data.
Rabea Khatun, Maksuda Akter, Md Manowarul Islam, Md Ashraf Uddin, Md Alamin Talukder, Joarder Kamruzzaman, A K M Azad, Bikash Kumar Paul, Muhammad Ali Abdulllah Almoyad, Sunil Aryal, Mohammad Ali Moni
Published in: Genes (2023)
Biomarker-based cancer identification and classification tools are widely used in bioinformatics and machine learning. However, the high dimensionality of microarray gene expression data makes it difficult to identify the genes that matter for cancer diagnosis. Many feature selection algorithms address this by selecting a small set of optimal features. This article proposes an ensemble rank-based feature selection method (EFSM) and an ensemble weighted average voting classifier (VT) to overcome this challenge. The EFSM uses a ranking method to aggregate the features chosen by individual selection methods and so efficiently discover the most relevant and useful features. The VT combines support vector machine, k-nearest neighbor, and decision tree algorithms into an ensemble model. The proposed method was tested on three benchmark datasets and compared to existing built-in ensemble models. Our model achieved higher accuracy: 100% for leukaemia, 94.74% for colon cancer, and 94.34% for the 11-tumor dataset. The study concludes by identifying a subset of the most important cancer-causing genes and demonstrating their significance compared to the original data. The proposed approach surpasses existing strategies in accuracy and stability and detects vital genes with higher precision than other existing methods, with significant implications for the development of ML-based gene analysis.
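The two ensemble steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the EFSM combines rankings or how the VT weights are chosen, so this sketch assumes mean-rank aggregation and fixed, hand-picked weights. The function names (`rank_features`, `ensemble_select`, `weighted_vote`) and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def rank_features(scores: Sequence[float]) -> List[int]:
    """Return the rank of each feature, where rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, feature_index in enumerate(order, start=1):
        ranks[feature_index] = position
    return ranks


def ensemble_select(score_table: Dict[str, Sequence[float]], k: int) -> List[int]:
    """Pick the k features with the lowest mean rank across all selectors.

    Assumption: mean rank is used as the aggregation rule.
    """
    rank_lists = [rank_features(scores) for scores in score_table.values()]
    n_features = len(rank_lists[0])
    mean_rank = [
        sum(ranks[i] for ranks in rank_lists) / len(rank_lists)
        for i in range(n_features)
    ]
    return sorted(range(n_features), key=lambda i: mean_rank[i])[:k]


def weighted_vote(probas: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Average per-classifier class probabilities by weight; return the winning class index."""
    total = sum(weights)
    n_classes = len(probas[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: averaged[c])


# Hypothetical relevance scores for four genes from three selectors.
scores = {
    "selector_a": [0.9, 0.1, 0.5, 0.3],
    "selector_b": [0.8, 0.2, 0.1, 0.7],
    "selector_c": [0.7, 0.3, 0.6, 0.2],
}
selected = ensemble_select(scores, k=2)

# Hypothetical class probabilities from an SVM, a k-NN, and a decision tree.
probas = [[0.6, 0.4], [0.3, 0.7], [0.4, 0.6]]
label = weighted_vote(probas, weights=[0.7, 0.2, 0.1])
```

In this toy run the first selector-agnostic pass keeps the genes with the best average rank, and the vote follows the most heavily weighted classifier unless the others disagree strongly enough. In practice the scores and probabilities would come from fitted models rather than literals.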
Keyphrases
- machine learning
- papillary thyroid
- deep learning
- gene expression
- squamous cell
- big data
- convolutional neural network
- genome wide
- artificial intelligence
- bioinformatics analysis
- dna methylation
- lymph node metastasis
- magnetic resonance imaging
- electronic health record
- squamous cell carcinoma
- magnetic resonance
- young adults
- childhood cancer
- single cell