Identifying mislabeled and contaminated DNA methylation microarray data: an extended quality control toolset with examples from GEO.
Jonathan A Heiss, Allan C Just
Published in: Clinical epigenetics (2018)
A more complete examination of samples that may be mislabeled, contaminated, or have poor performance due to technical problems will improve downstream analyses and replication of findings. We demonstrate that quality control problems are prevalent in a public repository of DNA methylation data. We advocate for a more thorough quality control workflow in epigenome-wide association studies and provide a software package to perform the checks described in this work. Reproducible code and supplementary material are available at DOI 10.5281/zenodo.1172730.