A comprehensive comparison of residue-level methylation levels with the regression-based gene-level methylation estimations by ReGear.
Jinpu Cai, Yuyang Xu, Wen Zhang, Shiying Ding, Yuewei Sun, Jingyi Lyu, Meiyu Duan, Shuai Liu, Lan Huang, Fengfeng Zhou
Published in: Briefings in Bioinformatics (2021)
A comprehensive evaluation compared gene-level and cytosine-level methylation for their associations with the sample labels and for their classification performance. Unsupervised clustering also improved when the gene methylation levels were used. Some genes showed statistically significant associations with the class label even when no residue-level methylation feature did. In summary, the trained gene methylation levels improved various methylome-based machine learning models. Both the methodological development of regression algorithms and the experimental validation of gene-level methylation biomarkers merit further investigation. The source code, example data files and manual are available at http://www.healthinformaticslab.org/supp/.
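
The contrast between gene-level and residue-level associations can be illustrated with a minimal sketch. It is not the ReGear implementation: ordinary least squares stands in for whichever regression algorithm the tool uses, the beta values are invented toy data, and the names `fit_gene_model`, `gene_level` and `welch_t` are hypothetical. The score is also fitted and tested on the same samples, which a real analysis would avoid by using held-out data.

```python
from math import sqrt
from statistics import mean, variance

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_gene_model(betas, labels):
    """Least-squares fit of the class label on one gene's CpG beta values
    (assumed stand-in for the regression step; returns [intercept, w1, w2, ...])."""
    design = [[1.0] + row for row in betas]
    p = len(design[0])
    gram = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * y for r, y in zip(design, labels)) for i in range(p)]
    return solve(gram, rhs)

def gene_level(betas, coef):
    """Gene-level methylation score: the fitted linear combination per sample."""
    return [coef[0] + sum(w * x for w, x in zip(coef[1:], row)) for row in betas]

def welch_t(a, b):
    """Welch's t statistic for two groups of values."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Toy data (invented): two CpG sites of one hypothetical gene, eight samples.
# Each site varies widely across samples, but their difference tracks the class.
betas = [[0.20, 0.30], [0.50, 0.55], [0.80, 0.95], [0.40, 0.50],   # class 0
         [0.30, 0.20], [0.60, 0.45], [0.90, 0.85], [0.50, 0.40]]   # class 1
labels = [0, 0, 0, 0, 1, 1, 1, 1]

def split(values):
    """Split per-sample values into (class 1, class 0) groups."""
    return ([v for v, y in zip(values, labels) if y == 1],
            [v for v, y in zip(values, labels) if y == 0])

# Residue-level association: one t statistic per CpG site.
t_cpgs = [welch_t(*split([row[j] for row in betas])) for j in range(2)]

# Gene-level association: t statistic of the regression-based score.
coef = fit_gene_model(betas, labels)
t_gene = welch_t(*split(gene_level(betas, coef)))

print("per-CpG |t|:", [round(abs(t), 2) for t in t_cpgs])
print("gene-level |t|:", round(abs(t_gene), 2))
```

In this toy setting neither CpG site separates the classes on its own, while the combined gene-level score does, which is the kind of pattern the abstract reports for some genes.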