Predicting Alu exonization in the human genome with a deep learning model.

Zitong He, Ou Chen, Noelani Phillips, Giulia Irene Maria Pasquesi, Sarven Sabunciyan, Liliana D Florea
Published in: bioRxiv : the preprint server for biology (2024)
Alu exonization, or the recruitment of intronic Alu elements into gene sequences, has contributed to functional diversification; however, its extent and the ways in which it influences gene regulation are not fully understood. We developed an unbiased approach to predict Alu exonization events from genomic sequences, implemented in a deep learning model, eXAlu, that overcomes the limitations of tissue or condition specificity and the computational burden of RNA-seq analysis. The model captures previously reported characteristics of exonized Alu sequences and can predict sequence elements important for Alu exonization. Using eXAlu, we estimate the number of Alu elements in the human genome undergoing exonization to be between 55K and 110K, 11- to 21-fold more than are represented in the GENCODE gene database. Using RT-PCR, we validated selected predicted Alu exonization events, supporting the accuracy of our method. Lastly, we highlight a potential application of our method: identifying polymorphic Alu insertion exonizations in individuals and in the population from whole genome sequencing data.
Keyphrases
  • deep learning
  • rna seq
  • endothelial cells
  • genome wide
  • copy number
  • single cell
  • induced pluripotent stem cells
  • gene expression
  • emergency department
  • risk assessment
  • electronic health record
  • climate change
  • human health