CONFINED: distinguishing biological from technical sources of variation by leveraging multiple methylation datasets.

Mike Thompson, Zeyuan Johnson Chen, Elior Rahmani, Eran Halperin
Published in: Genome biology (2019)
Methylation datasets are affected by innumerable sources of variability, both biological (cell-type composition, genetics) and technical (batch effects). Here, we propose a reference-free method based on sparse canonical correlation analysis to separate the biological from technical sources of variability. We show through simulations and real data that our method, CONFINED, is not only more accurate than the state-of-the-art reference-free methods for capturing known, replicable biological variability, but it is also considerably more robust to dataset-specific technical variability than previous approaches. CONFINED is available as an R package as detailed at https://github.com/cozygene/CONFINED.
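
The intuition behind using canonical correlation analysis across datasets can be illustrated with a small simulation. The sketch below is not the CONFINED implementation and does not use the R package's API: it runs plain (non-sparse) CCA in NumPy on two synthetic matrices, and the function names, matrix orientation (rows as shared sites, columns as samples), and simulated signals are all assumptions made for illustration. It shows that a signal present in both datasets yields canonical correlations near 1, while a signal present in only one dataset is largely left out of the shared components.

```python
import numpy as np

def cca_shared_components(X1, X2, k):
    """Plain CCA between the column spaces of X1 and X2 (hypothetical helper).

    Rows are the features shared by both datasets (e.g. methylation sites);
    columns are samples, which may differ in number between datasets.
    """
    # Center each sample (column) across sites.
    X1c = X1 - X1.mean(axis=0, keepdims=True)
    X2c = X2 - X2.mean(axis=0, keepdims=True)
    # Orthonormal bases for the two column spaces.
    U1, _, _ = np.linalg.svd(X1c, full_matrices=False)
    U2, _, _ = np.linalg.svd(X2c, full_matrices=False)
    # Singular values of U1^T U2 are the canonical correlations.
    A, corr, Bt = np.linalg.svd(U1.T @ U2, full_matrices=False)
    Z1 = U1 @ A[:, :k]       # canonical variables from dataset 1
    Z2 = U2 @ Bt[:k, :].T    # canonical variables from dataset 2
    return Z1, Z2, corr

def fraction_captured(v, Z):
    """Share of the (centered) vector v lying in the span of Z's orthonormal columns."""
    v = v - v.mean()
    return np.linalg.norm(Z.T @ v) / np.linalg.norm(v)

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
n_sites, n1, n2 = 500, 20, 25
shared = rng.normal(size=(n_sites, 2))   # site-level signal present in both datasets
batch = rng.normal(size=(n_sites, 1))    # site-level signal present in dataset 1 only

X1 = (shared @ rng.normal(size=(2, n1))
      + batch @ rng.normal(size=(1, n1))
      + 0.1 * rng.normal(size=(n_sites, n1)))
X2 = (shared @ rng.normal(size=(2, n2))
      + 0.1 * rng.normal(size=(n_sites, n2)))

Z1, Z2, corr = cca_shared_components(X1, X2, k=2)

print("leading canonical correlations:", np.round(corr[:4], 3))
print("shared signal captured:", fraction_captured(shared[:, 0], Z1))
print("dataset-specific signal captured:", fraction_captured(batch[:, 0], Z1))
```

A sparse variant, as named in the abstract, would additionally constrain the canonical vectors to have few non-zero entries; that penalty is omitted here to keep the example short.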
Keyphrases
  • drinking water
  • dna methylation
  • molecular dynamics
  • high resolution
  • machine learning
  • data analysis
  • single cell