Mechanism and modeling of human disease-associated near-exon intronic variants that perturb RNA splicing.
Hung-Lun ChiangYi-Ting ChenJia-Ying SuHsin-Nan LinChen-Hsin Albert YuYu-Jen HungYun-Lin WangYen-Tsung HuangChien-Ling LinPublished in: Nature structural & molecular biology (2022)
It is estimated that 10%-30% of disease-associated genetic variants affect splicing. Splicing variants may generate deleteriously altered gene product and are potential therapeutic targets. However, systematic diagnosis or prediction of splicing variants is yet to be established, especially for the near-exon intronic splice region. The major challenge lies in the redundant and ill-defined branch sites and other splicing motifs therein. Here, we carried out unbiased massively parallel splicing assays on 5,307 disease-associated variants that overlapped with branch sites and collected 5,884 variants across the 5' splice region. We found that strong splice sites and exonic features preserve splicing from intronic sequence variation. Whereas the splice-altering mechanism of the 3' intronic variants is complex, that of the 5' is mainly splice-site destruction. Statistical learning combined with these molecular features allows precise prediction of altered splicing from an intronic variant. This statistical model provides the identity and ranking of biological features that determine splicing, which serves as transferable knowledge and out-performs the benchmarking predictive tool. Moreover, we demonstrated that intronic splicing variants may associate with disease risks in the human population. Our study elucidates the mechanism of splicing response of intronic variants, which classify disease-associated splicing variants for the promise of precision medicine.