Deep structured learning for variant prioritization in Mendelian diseases.
Vance P LemmonMaike F DohrnSarah FazalDanique BeijerAdriana P RebeloVivian CintraStephan ZuchnerPublished in: Nature communications (2023)
Effective computer-aided or automated variant evaluations for monogenic diseases will expedite clinical diagnostic and research efforts of known and novel disease-causing genes. Here we introduce MAVERICK: a Mendelian Approach to Variant Effect pRedICtion built in Keras. MAVERICK is an ensemble of transformer-based neural networks that can classify a wide range of protein-altering single nucleotide variants (SNVs) and indels and assesses whether a variant would be pathogenic in the context of dominant or recessive inheritance. We demonstrate that MAVERICK outperforms all other major programs that assess pathogenicity in a Mendelian context. In a cohort of 644 previously solved patients with Mendelian diseases, MAVERICK ranks the causative pathogenic variant within the top five variants in over 95% of cases. Seventy-six percent of cases were solved by the top-ranked variant. MAVERICK ranks the causative pathogenic variant in hitherto novel disease genes within the first five candidate variants in 70% of cases. MAVERICK has already facilitated the identification of a novel disease gene causing a degenerative motor neuron disease. These results represent a significant step towards automated identification of causal variants in patients with Mendelian diseases.