Diagnostic classification based on DNA methylation profiles using sequential machine learning approaches.
Marcin W Wojewodzic, Jan P Lavender
Published in: PLOS ONE (2024)
Aberrant methylation patterns in human DNA have great potential for the discovery of novel biomarkers for diagnosis and disease progression. In this paper we used machine learning algorithms to identify promising methylation sites for diagnosing cancerous tissue and to classify patients based on methylation values at these sites. We used genome-wide DNA methylation patterns from both cancerous and normal tissue samples, obtained from the Genomic Data Commons consortium, and trialled our methods on three types of urological cancer. A decision tree was used to identify the methylation sites most useful for diagnosis. The identified sites were then used to train a neural network to classify samples as either cancerous or non-cancerous. Using this two-step approach we found strongly indicative biomarker panels for each of the three cancer types. These methods could likely be translated to other cancers and improved by using non-invasive liquid samples such as blood instead of biopsy tissue.
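The two-step idea described above (tree-based site selection, then a neural network trained only on the selected sites) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not give the tree settings, the network architecture, or the preprocessing, so everything here is hypothetical. The data are synthetic beta values with two deliberately informative sites (indices 3 and 7), the selection step ranks sites by the best single-threshold Gini split (a depth-1 simplification of a full decision tree), and the classifier is a small one-hidden-layer network written in plain Python. In practice one would more likely use library implementations such as scikit-learn's `DecisionTreeClassifier` and `MLPClassifier`.

```python
import math
import random

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split_gain(values, labels):
    """Largest impurity decrease from a single threshold on one site."""
    n, parent, best = len(labels), gini(labels), 0.0
    for t in sorted(set(values))[:-1]:  # the largest value would leave the right side empty
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def select_sites(X, y, k):
    """Step 1: keep the k sites whose best single split separates the classes most."""
    gains = [best_split_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)[:k]

def forward(x, W1, b1, W2, b2):
    """One tanh hidden layer followed by a sigmoid output unit."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    z = sum(w * hj for w, hj in zip(W2, h)) + b2
    z = max(-60.0, min(60.0, z))  # clamp so exp() stays in range
    return h, 1.0 / (1.0 + math.exp(-z))

def train_network(X, y, hidden=4, epochs=300, lr=0.1, seed=0):
    """Step 2: fit the network by per-sample gradient descent on cross-entropy."""
    rng = random.Random(seed)
    d = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            h, p = forward(x, W1, b1, W2, b2)
            err = p - target  # gradient of the loss w.r.t. the output pre-activation
            for j in range(hidden):
                grad_h = err * W2[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                W2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(d):
                    W1[j][i] -= lr * grad_h * x[i]
            b2 -= lr * err
    return lambda x: 1 if forward(x, W1, b1, W2, b2)[1] >= 0.5 else 0

def make_samples(n, n_sites=20, seed=1):
    """Synthetic beta values in [0, 1]; sites 3 and 7 are made informative (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = tumour, 0 = normal
        row = [rng.random() for _ in range(n_sites)]
        row[3] = min(1.0, max(0.0, rng.gauss(0.8 if label else 0.2, 0.1)))
        row[7] = min(1.0, max(0.0, rng.gauss(0.2 if label else 0.8, 0.1)))
        X.append(row)
        y.append(label)
    return X, y

X_train, y_train = make_samples(80, seed=1)
X_test, y_test = make_samples(40, seed=2)

sites = select_sites(X_train, y_train, k=2)  # candidate biomarker panel
project = lambda X: [[row[j] for j in sites] for row in X]
predict = train_network(project(X_train), y_train)
accuracy = sum(predict(x) == t for x, t in zip(project(X_test), y_test)) / len(y_test)
print("selected sites:", sorted(sites), "test accuracy:", accuracy)
```

Selecting sites first keeps the network small and makes the resulting panel easy to inspect, which is presumably part of the appeal of a two-step design. Note that on real data the site selection would need to happen inside any cross-validation loop to avoid leaking test information into the panel.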
Keyphrases
- DNA methylation
- genome wide
- machine learning
- copy number
- papillary thyroid
- neural network
- big data
- gene expression
- end stage renal disease
- artificial intelligence
- deep learning
- squamous cell
- chronic kidney disease
- ejection fraction
- newly diagnosed
- childhood cancer
- small molecule
- young adults
- mass spectrometry
- electronic health record
- cell free
- prognostic factors
- squamous cell carcinoma
- ultrasound guided
- nucleic acid
- fine needle aspiration