Advances in cancer DNA methylation analysis with methPLIER: use of non-negative matrix factorization and knowledge-based constraints to enhance biological interpretability.

Ken Takasawa, Ken Asada, Syuzo Kaneko, Kouya Shiraishi, Hidenori Machino, Satoshi Takahashi, Norio Shinkai, Nobuji Kouno, Kazuma Kobayashi, Masaaki Komatsu, Takaaki Mizuno, Yu Okubo, Masami Mukai, Tatsuya Yoshida, Yukihiro Yoshida, Hidehito Horinouchi, Shun-Ichi Watanabe, Yuichiro Ohe, Yasushi Yatabe, Takashi Kohno, Ryuji Hamamoto
Published in: Experimental & Molecular Medicine (2024)
DNA methylation is an epigenetic modification that results in dynamic changes during ontogenesis and cell differentiation. DNA methylation patterns regulate gene expression and have been widely researched. While tools for DNA methylation analysis have been developed, most of them have focused on intergroup comparative analysis within a dataset; therefore, it is difficult to conduct cross-dataset studies, such as rare disease studies or cross-institutional studies. This study describes a novel method for DNA methylation analysis, namely, methPLIER, which enables interdataset comparative analyses. methPLIER combines Pathway Level Information Extractor (PLIER), which is a non-negative matrix factorization (NMF) method, with regularization by a knowledge matrix and transfer learning. methPLIER can be used to perform intersample and interdataset comparative analysis based on latent feature matrices, which are obtained via matrix factorization of large-scale data, and factor-loading matrices, which are obtained through matrix factorization of the data to be analyzed. We used methPLIER to analyze a lung cancer dataset and confirmed that the data decomposition reflected sample characteristics for recurrence-free survival. Moreover, methPLIER can analyze data obtained via different preprocessing methods, thereby reducing distributional bias among datasets due to preprocessing. Furthermore, methPLIER can be employed for comparative analyses of methylation data obtained from different platforms, thereby reducing bias in data distribution due to platform differences. methPLIER is expected to facilitate cross-sectional DNA methylation data analysis and enhance DNA methylation data resources.
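The transfer-learning idea in the abstract can be illustrated with a minimal sketch: factorize a large reference dataset once, then hold the learned feature-side matrix fixed and fit only the sample-side matrix for the dataset under study. This is not the methPLIER implementation. It uses plain NMF with multiplicative updates and omits the knowledge-matrix regularization that PLIER adds. The function names (`nmf`, `project`), the variable names, the random stand-in data, and the mapping of the abstract's "latent feature matrix" to the feature-side factor are all assumptions made for illustration.

```python
import numpy as np


def nmf(Y, k, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF by multiplicative updates: Y ~ latent @ loadings, all >= 0.

    Y is a non-negative (features x samples) matrix, e.g. beta values in [0, 1].
    No knowledge-based constraint is applied here (unlike PLIER).
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = Y.shape
    latent = rng.random((n_features, k))    # feature-side factor
    loadings = rng.random((k, n_samples))   # sample-side factor
    for _ in range(n_iter):
        loadings *= (latent.T @ Y) / (latent.T @ latent @ loadings + eps)
        latent *= (Y @ loadings.T) / (latent @ loadings @ loadings.T + eps)
    return latent, loadings


def project(latent, Y_new, n_iter=200, seed=0, eps=1e-9):
    """Transfer step: keep `latent` fixed and fit only the loadings for new data."""
    rng = np.random.default_rng(seed)
    loadings = rng.random((latent.shape[1], Y_new.shape[1]))
    for _ in range(n_iter):
        loadings *= (latent.T @ Y_new) / (latent.T @ latent @ loadings + eps)
    return loadings


# Random stand-ins for a large reference dataset and a small dataset to analyze.
rng = np.random.default_rng(1)
Y_reference = rng.random((50, 40))   # 50 features (e.g. CpG sites) x 40 samples
Y_target = rng.random((50, 8))       # same 50 features, 8 new samples

latent, _ = nmf(Y_reference, k=5)
target_loadings = project(latent, Y_target)
print(target_loadings.shape)
```

Because every dataset is projected onto the same fixed feature-side matrix, the resulting sample-side matrices share one coordinate system, which is what makes samples from different datasets directly comparable in this sketch.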
Keyphrases
  • DNA methylation
  • gene expression
  • genome wide
  • data analysis
  • electronic health record
  • big data
  • free survival
  • cross sectional
  • copy number
  • machine learning
  • high throughput
  • artificial intelligence
  • RNA-seq
  • squamous cell