NMDtxDB: Data-driven identification and annotation of human NMD target transcripts.

Thiago Britto-Borges, Niels H Gehring, Volker Boehm, Christoph Dieterich
Published in: RNA (New York, N.Y.) (2024)
The nonsense-mediated RNA decay (NMD) pathway is a crucial mechanism of mRNA quality control. Current annotations of NMD substrate RNAs are rarely data-driven and instead rely on generally established rules. We present a dataset covering four cell lines with combinations of SMG5, SMG6 and SMG7 knockdowns or SMG7 knockout. Based on this dataset, we implemented a workflow that combines Nanopore and Illumina sequencing to assemble a transcriptome that is enriched for NMD target transcripts. Moreover, we use coding sequence information from Ensembl, Gencode consensus RiboSeq ORFs and OpenProt to enhance the CDS annotation of novel transcript isoforms. In summary, 302,889 transcripts were obtained from the transcriptome assembly process, of which 24% are absent from the Ensembl database annotations, 48,213 contain a premature stop codon and 6,433 are significantly upregulated in three or more comparisons of NMD-active versus NMD-deficient cell lines. We present an in-depth view of these results through the NMDtxDB database, which is available at https://shiny.dieterichlab.org/app/NMDtxDB and supports the study of NMD-sensitive transcripts. We open-sourced our implementations of the web application and the analysis workflow at https://github.com/dieterich-lab/NMDtxDB and https://github.com/dieterich-lab/nmd-wf, respectively.