New Developments and Possibilities in Reanalysis and Reinterpretation of Whole Exome Sequencing Datasets for Unsolved Rare Diseases Using Machine Learning Approaches.
Samarth Thonta Setty, Marie-Pier Scott-Boyer, Tania Cuppens, Arnaud Droit
Published in: International Journal of Molecular Sciences (2022)
Rare diseases affect 300 million people worldwide. Rapid advances in bioinformatics and genomic technologies have enabled the discovery of the causes of 20-30% of rare diseases. However, most rare diseases remain unsolved to date. Newer tools and the availability of high-throughput sequencing data have enabled the reanalysis of data from previously undiagnosed patients. In this review, we systematically compile the latest developments in the discovery of the genetic causes of rare diseases using machine learning methods. In particular, we detail the methods available to reanalyze existing whole exome sequencing data from unsolved rare disease cases. We identify reanalysis methodologies that address problems associated with sequence alterations and mutations, variant re-annotation, protein stability, splice isoform malfunction, and oligogenic analysis. In addition, we give an overview of new developments in rare disease research that use whole genome sequencing data and other omics.
Keyphrases
- electronic health record
- end stage renal disease
- small molecule
- big data
- mental health
- machine learning
- chronic kidney disease
- high throughput
- gene expression
- ejection fraction
- copy number
- peritoneal dialysis
- artificial intelligence
- genome wide
- single cell
- binding protein
- amino acid
- sensitive detection
- patient reported