Statistical Models for High-Risk Intestinal Metaplasia with DNA Methylation Profiling.

Tianmeng Wang, Yifei Huang, Jie Yang
Published in: Epigenomes (2024)
We consider the newly developed multinomial mixed-link models for a high-risk intestinal metaplasia (IM) study with DNA methylation data. Different from the traditional multinomial logistic models commonly used for categorical responses, the mixed-link models allow us to select the most appropriate link function for each category. We show that the selected multinomial mixed-link model (Model 1) using the total number of stem cell divisions (TNSC) based on DNA methylation data outperforms the traditional logistic models in terms of cross-entropy loss from ten-fold cross-validation, with significant p-values of 8.12 × 10⁻⁴ and 6.94 × 10⁻⁵. Based on our selected model, the significance of TNSC's effect in predicting the risk of IM is justified with a p-value less than 10⁻⁶. We also select the most appropriate mixed-link models (Models 2 and 3) when an additional covariate, the status of gastric atrophy, is available. When the status is negative, mild, or moderate, we recommend Model 2; otherwise, we prefer Model 3. Both Models 2 and 3 can predict the risk of IM significantly better than Model 1, which justifies that the status of gastric atrophy is informative in predicting the risk of IM.
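The comparison criterion named in the abstract, cross-entropy loss from ten-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: the helper names (`cross_entropy`, `k_fold_indices`, `cv_cross_entropy`), the frequency-only stand-in model, and the choice of three response categories are all assumptions made for illustration, and no mixed-link model is fitted here.

```python
import math
import random

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the observed category."""
    eps = 1e-12  # guard against log(0)
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)

def k_fold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_cross_entropy(fit, predict, X, y, k=10, seed=0):
    """Held-out cross-entropy per fold.

    `fit` and `predict` are hypothetical callables standing in for any
    categorical-response model (logistic, mixed-link, ...).
    """
    losses = []
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        probs = predict(model, [X[i] for i in fold])
        losses.append(cross_entropy(probs, [y[i] for i in fold]))
    return losses

# Stand-in model (assumption): ignores covariates and predicts smoothed
# category frequencies. The number of categories is illustrative only.
def fit_frequencies(X_train, y_train, n_classes=3):
    counts = [1] * n_classes  # add-one smoothing
    for label in y_train:
        counts[label] += 1
    total = sum(counts)
    return [c / total for c in counts]

def predict_frequencies(model, X_test):
    return [model for _ in X_test]

# Toy data, not from the study.
X_toy = list(range(30))
y_toy = [i % 3 for i in range(30)]
fold_losses = cv_cross_entropy(fit_frequencies, predict_frequencies, X_toy, y_toy)
print(sum(fold_losses) / len(fold_losses))
```

Under this setup, two candidate models would be passed through `cv_cross_entropy` with the same folds, and the model with the lower held-out loss would be preferred. The abstract does not say which test produced the reported p-values, so that step is left out of the sketch.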
Keyphrases
  • dna methylation
  • stem cells
  • gene expression
  • genome wide
  • electronic health record
  • machine learning
  • bone marrow
  • single cell
  • artificial intelligence
  • genome wide analysis