Applying data science methodologies with artificial intelligence variant reinterpretation to map and estimate genetic disorder prevalence utilizing clinical data.
Suellen Jackson, Rebecca Freeman, Adriana Noronha, Hafsah Jamil, Eric Chavez, Jason Carmichael, Kaylee M Ruiz, Christine Miller, Sarah Benke, Rosalie Perrot, Maryam Hockley, Kady Murphy, Aimiel Casillan, Lily Radanovich, Roger Deforest, Mark E Nunes, Carolina I Galarreta, Richard Sidlow, Yaron Einhorn, Jeremy D Woods
Published in: American journal of medical genetics. Part A (2024)
Data science methodologies can be used to ascertain and analyze clinical genetic data, which are often unstructured and rarely used outside of patient encounters. Genetic variants from all genetic testing reported to a large pediatric healthcare system over a 5-year period were obtained and reinterpreted using the previously validated Franklin© Artificial Intelligence (AI). Using PowerBI©, the data were then matched to patients in the electronic healthcare record, linked to demographic data to generate a variant data table, and mapped by ZIP code. Three thousand and sixty-five variants were identified, and 98% were matched to patients with geographic data. Franklin© changed the interpretation for 24% of variants, and 156 of these reinterpretations were clinically actionable. A total of 739 Mendelian genetic disorders were identified, and disorder prevalence was estimated. Mapping of variants revealed hot-spots of pathogenic genetic variation, such as PEX6-associated Zellweger Spectrum Disorder. Seven patients with Bardet-Biedl syndrome and seven patients with Rett syndrome were identified as amenable to newly FDA-approved therapeutics. Using readily available software, we developed a database and an Exploratory Data Analysis (EDA) methodology that enabled us to systematically reinterpret variants, estimate variant prevalence, identify conditions amenable to new treatments, and localize geographies enriched for pathogenic variants.
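The workflow in the abstract (join variants to patient demographics, flag changed interpretations, count pathogenic variants by ZIP code) can be illustrated with a minimal sketch. This is not the authors' PowerBI© implementation or the Franklin© API: it is a plain-Python stand-in, and every field name, record, classification label, and ZIP code below is a hypothetical placeholder chosen for illustration. Prevalence estimation would additionally need a population denominator, which is not modeled here.

```python
from collections import Counter

# Hypothetical toy records; field names and values are assumptions, not study data.
variants = [
    {"variant_id": "v1", "patient_id": "p1", "gene": "PEX6",
     "lab_class": "VUS", "ai_class": "Likely pathogenic"},
    {"variant_id": "v2", "patient_id": "p2", "gene": "PEX6",
     "lab_class": "Pathogenic", "ai_class": "Pathogenic"},
    {"variant_id": "v3", "patient_id": "p3", "gene": "MECP2",
     "lab_class": "Pathogenic", "ai_class": "Pathogenic"},
    {"variant_id": "v4", "patient_id": "p9", "gene": "BBS1",
     "lab_class": "VUS", "ai_class": "VUS"},
]

# Hypothetical demographic lookup keyed by patient ID (placeholder ZIP codes).
patients = {
    "p1": {"zip": "00001"},
    "p2": {"zip": "00001"},
    "p3": {"zip": "00002"},
}

# Classes counted as pathogenic when mapping; an assumption for this sketch.
PATHOGENIC = {"Pathogenic", "Likely pathogenic"}


def build_variant_table(variants, patients):
    """Left-join variants to demographics and flag changed interpretations."""
    table = []
    for v in variants:
        demo = patients.get(v["patient_id"])
        row = dict(v)
        # Unmatched patients keep their variant row but have no ZIP code.
        row["zip"] = demo["zip"] if demo else None
        row["reinterpreted"] = v["lab_class"] != v["ai_class"]
        table.append(row)
    return table


def summarize(table):
    """Compute match rate, reinterpretation rate, and pathogenic counts per ZIP."""
    n = len(table)
    if n == 0:
        return {"match_rate": 0.0, "reinterpretation_rate": 0.0,
                "pathogenic_by_zip": {}}
    matched = sum(1 for r in table if r["zip"] is not None)
    changed = sum(1 for r in table if r["reinterpreted"])
    by_zip = Counter(
        r["zip"] for r in table
        if r["zip"] is not None and r["ai_class"] in PATHOGENIC
    )
    return {
        "match_rate": matched / n,
        "reinterpretation_rate": changed / n,
        "pathogenic_by_zip": dict(by_zip),
    }


table = build_variant_table(variants, patients)
summary = summarize(table)
print(summary)
```

A left join is used so that variants without a matched patient still count toward the denominator of the match rate, mirroring how the abstract reports the share of variants matched to geographic data.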
Keyphrases
- artificial intelligence
- big data
- data analysis
- electronic health record
- copy number
- healthcare
- machine learning
- end stage renal disease
- deep learning
- risk factors
- chronic kidney disease
- public health
- case report
- genome wide
- dna methylation
- gene expression
- small molecule
- spectrum disorder
- young adults
- mass spectrometry
- health information