MethylToSNP: identifying SNPs in Illumina DNA methylation array data.

Brenna A LaBarre, Alexander Goncearenco, Hanna M Petrykowska, Weerachai Jaratlerdsiri, M S Riana Bornman, Vanessa M Hayes, Laura Elnitski
Published in: Epigenetics & chromatin (2019)
The benefits of this method are threefold. First, it prevents extensive data loss by considering only SNPs specific to the individuals in the study. Second, it makes it possible to identify new polymorphisms in samples whose genetic landscape is poorly characterized. Third, it identifies variants in functional regions of the genome, such as CTCF (transcriptional repressor) sites and enhancers, which may be common alleles or personal mutations with the potential to adversely affect genomic regulatory activities. We demonstrate that MethylToSNP is applicable to Illumina 450K and 850K EPIC array data and is backward compatible with the 27K methylation arrays. Going forward, this kind of nuanced approach can increase the amount of information derived from precious data sets by considering the samples in a project individually, enabling more informed decisions about data cleaning.
Keyphrases
  • genome wide
  • dna methylation
  • electronic health record
  • big data
  • copy number
  • gene expression
  • high resolution
  • healthcare
  • high throughput
  • machine learning
  • mouse model
  • climate change
  • high density
  • heat shock protein