VUStruct: a compute pipeline for high throughput and personalized structural biology.
Christopher W Moth, Jonathan H Sheehan, Abdullah Al Mamun, R Michael Sivley, Alican Gulsevin, David C Rinker, Anthony Capra, Jens Meiler
Published in: bioRxiv : the preprint server for biology (2024)
Effective diagnosis and treatment of rare genetic disorders requires the interpretation of a patient's genetic variants of unknown significance (VUSs). Today, clinical decision-making is primarily guided by gene-phenotype association databases and DNA-based scoring methods. Our web-accessible variant analysis pipeline, VUStruct, supplements these established approaches by deeply analyzing the downstream molecular impact of variation in the context of 3D protein structure. VUStruct's growing impact is fueled by the co-proliferation of protein 3D structural models, gene sequencing, compute power, and artificial intelligence. Contextualizing VUSs in protein 3D structural models also illuminates longitudinal genomics studies and biochemical bench research focused on VUSs, and we created VUStruct for clinicians and researchers alike. We now introduce VUStruct to the broad scientific community as a mature, web-facing, extensible, high-performance computing (HPC) software pipeline. VUStruct maps missense variants onto automatically selected protein structures and launches a broad range of analyses. These include energy-based assessments of protein folding and stability, pathogenicity prediction through spatial clustering analysis, and machine learning (ML) predictors of binding surface disruptions and nearby post-translational modification sites. The pipeline also considers the entire input set of VUSs and identifies genes potentially involved in digenic disease. VUStruct's utility in clinical rare disease genome interpretation has been demonstrated through its analysis of over 175 Undiagnosed Disease Network (UDN) patient cases. VUStruct-leveraged hypotheses have often informed clinicians in their consideration of additional patient testing, and we report here details from two cases where VUStruct was key to their solution. We also note successes with academic research collaborators, for whom VUStruct has informed research directions in both computational genomics and wet lab studies.
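As a rough illustration of the step in which missense variants are mapped onto automatically selected protein structures, the following Python sketch parses a protein-level variant string and picks a structure that covers the affected residue. It is not VUStruct's code: the record types, the function names `parse_missense` and `select_structure`, the placeholder gene `GENE1`, and the selection rule (prefer experimental structures, then better resolution) are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


# Hypothetical record types for illustration; not VUStruct's data model.
@dataclass(frozen=True)
class MissenseVariant:
    gene: str
    ref: str       # reference amino acid, three-letter code
    position: int  # 1-based residue position in the protein sequence
    alt: str       # substituted amino acid, three-letter code


@dataclass(frozen=True)
class StructureModel:
    model_id: str
    gene: str
    first_residue: int            # first residue covered by the model
    last_residue: int             # last residue covered by the model
    resolution: Optional[float]   # in angstroms; None for predicted models


# Matches simple substitutions such as "p.Arg175His".
_PROTEIN_SUBSTITUTION = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_missense(gene: str, protein_change: str) -> MissenseVariant:
    """Parse a three-letter protein substitution into a MissenseVariant."""
    match = _PROTEIN_SUBSTITUTION.match(protein_change)
    if match is None:
        raise ValueError(f"not a simple substitution: {protein_change!r}")
    ref, pos, alt = match.groups()
    # Stop codons ("Ter") and synonymous changes are not missense variants.
    if "Ter" in (ref, alt) or ref == alt:
        raise ValueError(f"not a missense change: {protein_change!r}")
    return MissenseVariant(gene, ref, int(pos), alt)


def select_structure(
    variant: MissenseVariant, models: Iterable[StructureModel]
) -> Optional[StructureModel]:
    """Return a model covering the variant residue, or None if none does.

    Assumed rule: experimental structures (with a resolution) rank ahead of
    predicted models, and lower resolution values rank first.
    """
    covering = [
        m for m in models
        if m.gene == variant.gene
        and m.first_residue <= variant.position <= m.last_residue
    ]
    if not covering:
        return None
    return min(
        covering,
        key=lambda m: (m.resolution is None, m.resolution or 0.0),
    )


if __name__ == "__main__":
    models = [
        StructureModel("exp_A", "GENE1", 94, 312, 1.9),
        StructureModel("exp_B", "GENE1", 94, 312, 2.5),
        StructureModel("pred_full", "GENE1", 1, 393, None),
    ]
    variant = parse_missense("GENE1", "p.Arg175His")
    print(select_structure(variant, models))
```

In a pipeline of this kind, each selected structure would then be handed to the downstream analyses (stability, spatial clustering, and so on); a variant with no covering structure would return `None` here and need separate handling.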
Keyphrases
- artificial intelligence
- genome wide
- machine learning
- single cell
- protein protein
- binding protein
- high throughput
- copy number
- case report
- decision making
- amino acid
- big data
- single molecule
- healthcare
- staphylococcus aureus
- dna methylation
- palliative care
- high resolution
- cross sectional
- small molecule
- molecular dynamics simulations
- pseudomonas aeruginosa
- data analysis
- intellectual disability
- medical students