Accurate detection of mosaic variants in sequencing data without matched controls.
Yanmei Dou, Minseok Kwon, Rachel E Rodin, Isidro Cortés-Ciriano, Ryan Doan, Lovelace J Luquette, Alon Galor, Craig Bohrson, Christopher A Walsh, Peter J Park
Published in: Nature Biotechnology (2020)
Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80-90% of the mosaic single-nucleotide variants and 60-80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.
Keyphrases
- single cell
- copy number
- machine learning
- rna seq
- big data
- electronic health record
- induced apoptosis
- high throughput
- loop mediated isothermal amplification
- artificial intelligence
- genome wide
- deep learning
- cell cycle arrest
- dna methylation
- magnetic resonance imaging
- real time pcr
- dna repair
- computed tomography
- oxidative stress
- mass spectrometry
- cell proliferation
- dna damage
- cell death