RCCC_Pred: A Novel Method for Sequence-Based Identification of Renal Clear Cell Carcinoma Genes through DNA Mutations and a Blend of Features.
Arfa Hassan, Tamim Alkhalifah, Fahad Alturise, Yaser Daanial Khan
Published in: Diagnostics (Basel, Switzerland) (2022)
Diagnosing cancer at an early stage is crucial to saving lives. One route to early diagnosis is the identification of cancer driver genes and their mutations, which can substantially reduce the mortality rate of this disease. However, identifying cancer driver gene mutations through experimental methods is expensive, slow, and laborious. Computational strategies that predict cancer growth early, effectively, and accurately are therefore highly needed to support early diagnosis and reduce mortality. Herein, we aim to predict renal clear cell carcinoma (RCCC) at the gene level using genomic sequences. The dataset was taken from the IntOGen Cancer Mutations Browser, and the standard DNA sequences of all genes were taken from the NCBI database. Using the cancer-associated mutation information from IntOGen, the benchmark dataset was generated by introducing the mutations into the original sequences. After extensive feature extraction, the dataset was used to train an ANN + Hist Gradient Boosting model to classify RCCC genes, other cancer-associated genes, and non-cancerous/unknown (non-tumor driver) genes. On an independent dataset test, the observed accuracy was 83%, whereas 10-fold cross-validation and jackknife validation yielded accuracies of 98% and 100%, respectively. The proposed predictor, RCCC_Pred, identifies RCCC genes with high accuracy and efficiency and can help researchers predict and diagnose cancer at its early stages.
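The pipeline described in the abstract (introduce a known mutation into a reference sequence, extract features, then validate with k-fold or jackknife splits) can be illustrated with a minimal sketch. The abstract does not say which features were extracted, so the k-mer frequencies below are a stand-in assumption, and the function names `apply_substitution`, `kmer_frequencies`, and `kfold_splits` are hypothetical, as is the toy sequence. This is not the authors' implementation.

```python
from collections import Counter
from itertools import product


def apply_substitution(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the base at 1-based position pos changed from ref to alt."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref!r} at position {pos}, found {seq[pos - 1]!r}")
    return seq[:pos - 1] + alt + seq[pos:]


def kmer_frequencies(seq: str, k: int = 2) -> list[float]:
    """Fixed-length feature vector: relative frequency of every DNA k-mer.

    Assumption: k-mer frequencies stand in for the paper's unspecified features.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


def kfold_splits(n: int, k: int):
    """Yield (train_idx, test_idx) pairs; k == n gives jackknife (leave-one-out)."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


reference = "ATGGCCTTAGCA"  # toy sequence, not a real gene
mutated = apply_substitution(reference, pos=4, ref="G", alt="T")
features = kmer_frequencies(mutated, k=2)  # 16 dinucleotide frequencies
```

Setting `k` equal to the number of samples in `kfold_splits` holds out exactly one sample per fold, which is the jackknife scheme; `k=10` gives the 10-fold scheme. A classifier would be fitted on each `train` index set and scored on the matching `test` set.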
Keyphrases
- papillary thyroid
- bioinformatics analysis
- genome wide
- genome wide identification
- machine learning
- lymph node metastasis
- squamous cell carcinoma
- type diabetes
- cardiovascular disease
- dna methylation
- gene expression
- single molecule
- cell free
- circulating tumor
- circulating tumor cells
- adverse drug
- health information
- genetic diversity