Scalable approaches for generating, validating and incorporating data from high-throughput functional assays to improve clinical variant classification.
Samskruthi Reddy Padigepati, David A Stafford, Christopher A Tan, Melanie R Silvis, Kirsty Jamieson, Andrew Keyser, Paola Alejandra Correa Nunez, John M Nicoludis, Toby Manders, Laure Fresard, Yuya Kobayashi, Carlos L Araya, Swaroop Aradhya, Britt Johnson, Keith Nykamp, Jason A Reuter
Published in: Human genetics (2024)
As the adoption and scope of genetic testing continue to expand, interpreting the clinical significance of DNA sequence variants at scale remains a formidable challenge, with a high proportion classified as variants of uncertain significance (VUSs). Genetic testing laboratories have historically relied, in part, on functional data from academic literature to support variant classification. High-throughput functional assays or multiplex assays of variant effect (MAVEs), designed to assess the effects of DNA variants on protein stability and function, represent an important and increasingly available source of evidence for variant classification, but their potential is just beginning to be realized in clinical lab settings. Here, we describe a framework for generating, validating and incorporating data from MAVEs into a semi-quantitative variant classification method applied to clinical genetic testing. Using single-cell gene expression measurements, cellular evidence models were built to assess the effects of DNA variation in 44 genes of clinical interest. This framework was also applied to models for an additional 22 genes with previously published MAVE datasets. In total, modeling data was incorporated from 24 genes into our variant classification method. These data contributed evidence for classifying 4043 observed variants in over 57,000 individuals. Genetic testing laboratories are uniquely positioned to generate, analyze, validate, and incorporate evidence from high-throughput functional data and ultimately enable the use of these data to provide definitive clinical variant classifications for more patients.
Keyphrases
- high throughput
- electronic health record
- single cell
- machine learning
- gene expression
- big data
- deep learning
- copy number
- systematic review
- single molecule
- squamous cell carcinoma
- end stage renal disease
- cell free
- risk assessment
- mass spectrometry
- prognostic factors
- peritoneal dialysis
- amino acid
- bioinformatics analysis
- binding protein
- patient reported outcomes
- patient reported
- genome wide analysis