Exome sequence analysis identifies rare coding variants associated with a machine learning-based marker for coronary artery disease.
Ben Omega Petrazzini, Iain S Forrest, Ghislain Rocheleau, Ha My T Vy, Carla Márquez-Luna, Áine Duffy, Robert Chen, Joshua K Park, Kyle Gibson, Sascha N Goonewardena, Waqas A Malick, Robert S Rosenson, Daniel M Jordan, Ron Do
Published in: Nature Genetics (2024)
Coronary artery disease (CAD) exists on a spectrum of disease represented by a combination of risk factors and pathogenic processes. An in silico score for CAD built using machine learning and clinical data in electronic health records captures disease progression, severity and underdiagnosis on this spectrum and could enhance genetic discovery efforts for CAD. Here we tested associations of rare and ultrarare coding variants with the in silico score for CAD in the UK Biobank, All of Us Research Program and BioMe Biobank. We identified associations in 17 genes; of these, 14 show at least moderate levels of prior genetic, biological and/or clinical support for CAD. We also observed an excess of ultrarare coding variants in 321 aggregated CAD genes, suggesting more ultrarare variant associations await discovery. These results expand our understanding of the genetic etiology of CAD and illustrate how digital markers can enhance genetic association investigations for complex diseases.
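To make the idea of testing rare coding variants against a quantitative in silico score concrete, the following is a minimal sketch of a gene-level burden test. It assumes each person is coded as a carrier (1) or non-carrier (0) of any qualifying rare variant in a gene, and regresses the score on that indicator. The function name `burden_association`, the toy data, and the unadjusted single-predictor model are illustrative assumptions, not the authors' pipeline; a real analysis would typically adjust for covariates such as age, sex and ancestry.

```python
from statistics import mean


def burden_association(carrier, score):
    """Regress a quantitative score on gene-level carrier status (0/1).

    Hypothetical helper for illustration only. Returns (beta, t_stat)
    from ordinary least squares with a single predictor and an intercept.
    """
    n = len(carrier)
    if n != len(score) or n < 3:
        raise ValueError("need two equal-length lists with at least 3 samples")

    x_bar, y_bar = mean(carrier), mean(score)
    sxx = sum((x - x_bar) ** 2 for x in carrier)
    if sxx == 0:
        raise ValueError("carrier status has no variation")

    # Slope: for a 0/1 predictor this equals the difference in mean score
    # between carriers and non-carriers.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(carrier, score))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar

    # Standard error of the slope from the residual sum of squares.
    rss = sum((y - alpha - beta * x) ** 2 for x, y in zip(carrier, score))
    se = (rss / (n - 2) / sxx) ** 0.5
    t_stat = beta / se if se > 0 else float("nan")
    return beta, t_stat


# Toy data (made up): six non-carriers and two carriers of a rare variant.
carrier = [0, 0, 0, 0, 0, 0, 1, 1]
score = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.30, 0.34]
beta, t_stat = burden_association(carrier, score)
print(f"beta={beta:.3f}, t={t_stat:.2f}")
```

Collapsing variants to a per-gene carrier indicator is one common way to gain power when individual variants are too rare to test one at a time; other aggregation schemes, such as variance-component tests, make different trade-offs.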
Keyphrases
- coronary artery disease
- copy number
- genome wide
- electronic health record
- percutaneous coronary intervention
- cardiovascular events
- coronary artery bypass grafting
- machine learning
- risk factors
- small molecule
- molecular docking
- quality improvement
- type diabetes
- high intensity
- cross sectional
- cardiovascular disease
- atrial fibrillation
- deep learning
- genome wide analysis