HMZDupFinder: a robust computational approach for detecting intragenic homozygous duplications from exome sequencing data.
Haowei Du, Zain Dardas, Angad Jolly, Christopher M Grochowski, Shalini N Jhangiani, He Li, Donna Muzny, Jawid M Fatih, Gozde Yesil, Nursel H Elçioglu, Alper Gezdirici, Dana Marafi, Davut Pehlivan, Daniel G Calame, Cláudia M B Carvalho, Jennifer E Posey, Tomasz Gambin, Zeynep Coban-Akdemir, James R Lupski
Published in: Nucleic Acids Research (2023)
Homozygous duplications contribute to genetic disease by altering gene dosage or disrupting gene regulation and can be more deleterious to organismal biology than heterozygous duplications. Intragenic exonic duplications can result in loss-of-function (LoF) or gain-of-function (GoF) alleles that, when homozygosed (i.e. brought to a homozygous state at a locus by identity by descent or identity by state), could potentially result in autosomal recessive (AR) rare disease traits. However, the detection and functional interpretation of homozygous duplications from exome sequencing (ES) data remain a challenge. We developed a framework algorithm, HMZDupFinder, that is designed to detect exonic homozygous duplications from ES data. The HMZDupFinder algorithm can efficiently process large datasets and accurately identifies small intragenic duplications, including those associated with rare disease traits. HMZDupFinder called 965 homozygous duplications spanning three or fewer exons from 8,707 ES samples, with a recall rate of 70.9% and a precision of 16.1%. We experimentally confirmed 8 of 10 rare homozygous duplications. Pathogenicity assessment of these copy number variant alleles allowed clinical genomics contextualization for three homozygous duplication alleles: two affecting the known OMIM disease genes EDAR (MIM# 224900) and TNNT1 (MIM# 605355), and one in a novel candidate disease gene, PAAF1.
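The abstract does not describe how HMZDupFinder works internally, so the following is not the authors' method. It is only a minimal sketch of the general read-depth reasoning behind duplication calling, assuming a diploid autosomal locus where a heterozygous duplication gives three copies and a homozygous duplication gives four. The names `COPY_STATES`, `expected_log2_ratio`, and `nearest_copy_state` are hypothetical and chosen for illustration.

```python
import math

# Hypothetical labels for total copy number at a diploid autosomal locus.
# These states are an illustrative assumption, not HMZDupFinder's model.
COPY_STATES = {
    2: "normal",
    3: "heterozygous duplication",
    4: "homozygous duplication",
}


def expected_log2_ratio(copies: int, baseline: int = 2) -> float:
    """Expected log2 depth ratio for `copies` relative to a diploid baseline."""
    return math.log2(copies / baseline)


def nearest_copy_state(observed_log2_ratio: float) -> str:
    """Return the label whose expected log2 ratio is closest to the observation."""
    best = min(
        COPY_STATES,
        key=lambda c: abs(observed_log2_ratio - expected_log2_ratio(c)),
    )
    return COPY_STATES[best]


if __name__ == "__main__":
    # Print the expected depth signal for each assumed copy-number state.
    for copies, label in COPY_STATES.items():
        ratio = expected_log2_ratio(copies)
        print(f"{label}: expected log2 ratio = {ratio:.3f}")
```

Under these assumptions a homozygous duplication is expected to double the normalized depth (log2 ratio of 1), while a heterozygous duplication raises it by half (log2 ratio of about 0.585). Depth alone cannot separate a homozygous duplication from a heterozygous triplication, which also yields four copies, so a real caller would need additional evidence.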