CAGI experiments: Modeling sequence variant impact on gene splicing using predictions from computational tools.
Valer Gotea, Gennady Margolin, Laura Elnitski
Published in: Human Mutation (2019)
Improving predictions of the phenotypic consequences of genomic variants is part of ongoing efforts in the scientific community to gain meaningful insights into genomic function. Within the framework of the Critical Assessment of Genome Interpretation (CAGI) experiments, we participated in the Vex-seq challenge, which required predicting the change in the percent spliced-in measure (ΔΨ) for 58 exons caused by more than 1,000 genomic variants. Determined experimentally through the Vex-seq assay, Ψ quantifies the fraction of reads that include an exon of interest. Predicting the change in Ψ associated with a specific genomic variant requires determining which sequence changes are relevant to splicing regulators, such as splicing enhancers and silencers. Here we took advantage of two computational tools, SplicePort and SPANR, which incorporate relevant sequence features in their models of splice sites and exon-inclusion level, respectively. Specifically, we used the SplicePort and SPANR outputs to build mathematical models of the experimental data obtained for the variants in the training set, and then used these models to predict the ΔΨ associated with the mutations in the test set. We show that the sequence changes captured by these computational tools provide a reasonable foundation for modeling the impact of genomic variants on splicing.
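To make the quantities in the abstract concrete, the following is a minimal Python sketch of how Ψ and ΔΨ can be computed from read counts, and of the general train-then-predict workflow. It is illustrative only: the abstract does not specify the form of the models, so the one-feature least-squares line used here is an assumption rather than the authors' method. The function names, the single "tool score change" feature, and all numbers are hypothetical toy values, not SplicePort or SPANR output.

```python
def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in: share of reads that include the exon, in percent."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this exon")
    return 100.0 * inclusion_reads / total


def delta_psi(psi_variant, psi_reference):
    """Change in PSI caused by a variant, relative to the reference sequence."""
    return psi_variant - psi_reference


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical read counts for one exon: reference vs. variant construct.
psi_ref = psi(inclusion_reads=80, exclusion_reads=20)
psi_var = psi(inclusion_reads=60, exclusion_reads=40)
observed_change = delta_psi(psi_var, psi_ref)

# Hypothetical training set: a tool-derived score change per variant
# (assumed feature) paired with the measured delta-PSI.
train_scores = [-2.0, -1.0, 0.0, 1.0]
train_dpsi = [-20.0, -10.0, 0.0, 10.0]
slope, intercept = fit_line(train_scores, train_dpsi)

# Predict delta-PSI for an unseen (test-set) variant from its score change.
predicted_dpsi = slope * 0.5 + intercept
```

In this sketch, a negative ΔΨ means the variant reduces exon inclusion relative to the reference. A real model would likely combine several features from both tools and be evaluated on held-out variants.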