Advancing long-read nanopore genome assembly and accurate variant calling for rare disease detection.
Shloka Negi, Sarah L Stenton, Seth I Berger, Brandy McNulty, Ivo Violich, Joshua Gardner, Todd Hillaker, Sara M O'Rourke, Melanie C O'Leary, Elizabeth Carbonell, Christina A Austin-Tse, Gabrielle T Lemire, Jillian Serrano, Brian Mangilog, Grace E VanNoy, Mikhail Kolmogorov, Eric Vilain, Anne H O'Donnell-Luria, Emmanuèle C Délot, Karen H Miga, Jean Monlong, Benedict Paten
Published in: medRxiv : the preprint server for health sciences (2024)
More than 50% of families with suspected rare monogenic diseases remain unsolved after whole-genome analysis by short-read sequencing (SRS). Long-read sequencing (LRS) could help bridge this diagnostic gap by capturing variants inaccessible to SRS, facilitating long-range mapping and phasing, and providing haplotype-resolved methylation profiling. To evaluate the additional diagnostic yield of LRS, we sequenced a rare disease cohort of 98 samples, including 41 probands and some family members, using nanopore sequencing, achieving a per-sample average coverage of ∼36x and a read N50 of 32 kilobases (kb) from a single flow cell. Our Napu pipeline generated assemblies, phased variants, and methylation calls. On average, LRS covered coding exons in ∼280 genes, including ∼5 known Mendelian disease genes, that were not covered by SRS. Compared with SRS, LRS detected additional rare, functionally annotated variants, including structural variants (SVs) and tandem repeats, and completely phased 87% of protein-coding genes. LRS also detected additional de novo variants and could be used to distinguish postzygotic mosaic variants from prezygotic de novo variants. Eleven probands were solved, with diverse underlying genetic causes including de novo and compound heterozygous variants, large-scale SVs, and epigenetic modifications. Our study demonstrates the potential of LRS to enhance diagnostic yield for rare monogenic diseases, implying utility in future clinical genomics workflows.
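For readers unfamiliar with the two sequencing metrics quoted above, the following is a minimal sketch of how read N50 and average coverage are conventionally computed from a list of read lengths. It is not taken from the Napu pipeline; the function names `read_n50` and `mean_coverage` and the toy numbers are illustrative assumptions.

```python
def read_n50(read_lengths):
    """Return the read N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    ordered = sorted(read_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_coverage(read_lengths, genome_size):
    """Return average depth: total sequenced bases divided by genome size.
    This ignores unmapped reads and uneven coverage (a simplification)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Toy example with hypothetical read lengths (in bases).
lengths = [10, 20, 30, 40]
print(read_n50(lengths))           # half of 100 bases is reached within the 30-base read
print(mean_coverage(lengths, 50))  # 100 bases over a 50-base "genome"
```

Under this definition, a 32 kb read N50 means that at least half of the sequenced bases lie in reads of 32 kb or longer, which is what makes long-range phasing feasible.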