A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies.

Zilin Li, Xihao Li, Hufeng Zhou, Sheila M Gaynor, Margaret Sunitha Selvaraj, Theodore Arapoglou, Corbin Quick, Yaowu Liu, Han Chen, Ryan Sun, Rounak Dey, Donna K Arnett, Paul L Auer, Lawrence F Bielak, Joshua C Bis, Thomas W Blackwell, John Blangero, Eric Boerwinkle, Donald W Bowden, Jennifer A Brody, Brian E Cade, Matthew P Conomos, Adolfo Correa, L Adrienne Cupples, Joanne E Curran, Paul S de Vries, Ravindranath Duggirala, Nora Franceschini, Barry I Freedman, Harald H H Göring, Xiuqing Guo, Rita R Kalyani, Charles Kooperberg, Brian G Kral, Leslie A Lange, Bridget M Lin, Ani Manichaikul, Alisa K Manning, Lisa W Martin, Rasika A Mathias, James B Meigs, Braxton D Mitchell, May E Montasser, Alanna C Morrison, Take Naseri, Jeffrey R O'Connell, Nicholette D D Allred, Patricia A Peyser, Bruce M Psaty, Laura M Raffield, Susan Redline, Alexander P Reiner, Muagututi'a Sefuiva Reupena, Kenneth M Rice, Stephen S Rich, Jennifer A Smith, Kent D Taylor, Margaret A Taub, Ramachandran S Vasan, Daniel E Weeks, James G Wilson, Lisa R Yanek, Wei Zhao, Jerome I Rotter, Cristen J Willer, Pradeep Natarajan, Gina M Peloso, Xihong Lin
Published in: Nature Methods (2022)
Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.
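To make the idea of fixed window-based variant-set analysis concrete, the following is a minimal sketch in Python. It is not the STAARpipeline implementation and does not use its API: the function names `fixed_windows` and `burden_scores`, the window length and step (2,000 bp and 1,000 bp), and the toy genotype data are all assumptions chosen for illustration. A simple per-sample rare-allele count (a burden score) stands in for the STAAR test, which additionally incorporates multiple functional annotations.

```python
from bisect import bisect_left


def fixed_windows(positions, window_size=2000, step=1000):
    """Yield (start, end, indices) for fixed-size sliding windows.

    positions must be sorted base-pair coordinates on one chromosome.
    Each window is half-open, [start, end), and windows that contain
    no variant are skipped. window_size and step are illustrative values.
    """
    if not positions:
        return
    start = positions[0]
    last = positions[-1]
    while start <= last:
        end = start + window_size
        lo = bisect_left(positions, start)  # first variant at or after start
        hi = bisect_left(positions, end)    # first variant at or after end
        if hi > lo:
            yield start, end, list(range(lo, hi))
        start += step


def burden_scores(genotypes, indices):
    """Per-sample count of rare alleles over the variants in one window.

    genotypes is a list of rows (one per sample) of allele counts 0/1/2.
    This simple burden count is a stand-in, not the STAAR statistic.
    """
    return [sum(row[i] for i in indices) for row in genotypes]


# Toy data (hypothetical): five rare variants, three samples.
positions = [100, 900, 1500, 2600, 4100]
genotypes = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 2],
]

windows = list(fixed_windows(positions))
for start, end, idx in windows:
    print(start, end, idx, burden_scores(genotypes, idx))
```

In a real analysis the per-window scores would be tested against the trait in a regression model that adjusts for covariates and relatedness. A dynamic window-based procedure, as the abstract describes for SCANG-STAAR, would vary the window size rather than fix it in advance.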
Keyphrases
  • genome wide
  • mycobacterium tuberculosis
  • copy number
  • gene expression
  • endothelial cells
  • high throughput
  • single cell
  • genome wide identification
  • quality improvement
  • data analysis