SCISSOR: a framework for identifying structural changes in RNA transcripts.

Hyo Young Choi, Heejoon Jo, Xiaobei Zhao, Katherine A Hoadley, Scott Newman, Jeremiah Holt, Michele C Hayward, Michael I Love, J S Marron, David Neil Hayes
Published in: Nature Communications (2021)
High-throughput sequencing protocols such as RNA-seq have made it possible to interrogate the sequence, structure and abundance of RNA transcripts at higher resolution than previous microarray and other molecular techniques. While many computational tools have been proposed for identifying mRNA variation through differential splicing/alternative exon usage, challenges in its analysis remain. Here, we propose a framework for unbiased and robust discovery of aberrant RNA transcript structures using short read sequencing data based on shape changes in an RNA-seq coverage profile. SCISSOR (shape changes in selecting sample outliers in RNA-seq) is a series of procedures for transforming and normalizing base-level RNA sequencing coverage data in a transcript-independent manner, followed by a statistical framework for its analysis (https://github.com/hyochoi/SCISSOR). The resulting high-dimensional object is amenable to unsupervised screening of structural alterations across RNA-seq cohorts with nearly no assumption on the mutational mechanisms underlying abnormalities. This enables SCISSOR to independently recapture known variants, such as splice site mutations in tumor suppressor genes, as well as novel variants that were previously unrecognized or are difficult to identify by existing methods, including recurrent alternate transcription start sites and recurrent complex deletions in 3' UTRs.
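The general idea of screening a cohort for shape changes in base-level coverage can be illustrated with a toy sketch. This is not SCISSOR's implementation or API (the package itself is at the GitHub link above); the function name `shape_outliers`, the log10 pseudocount transform, the median centering, the maximum-deviation score and the MAD-based threshold are all assumptions chosen only to make the idea concrete on made-up data.

```python
import math
from statistics import median


def shape_outliers(coverage, threshold=5.0):
    """Toy shape-outlier screen (hypothetical helper, not part of SCISSOR).

    coverage: list of per-sample lists of base-level read depths, equal length.
    Returns (scores, flagged), where scores are robust z-scores per sample.
    """
    # Log-transform so abundance differences become (roughly) additive offsets.
    logged = [[math.log10(d + 1.0) for d in sample] for sample in coverage]
    # Subtract each sample's own median to remove its overall expression level.
    centered = []
    for sample in logged:
        level = median(sample)
        centered.append([v - level for v in sample])
    # Typical shape: per-position median across the cohort.
    n_pos = len(centered[0])
    typical = [median(s[j] for s in centered) for j in range(n_pos)]
    # Per-sample deviation: largest absolute departure from the typical shape.
    deviation = [max(abs(s[j] - typical[j]) for j in range(n_pos)) for s in centered]
    # Robust z-score using the median and the median absolute deviation (MAD).
    center = median(deviation)
    mad = median(abs(d - center) for d in deviation)
    scale = 1.4826 * mad if mad > 0 else 1e-9  # guard against a zero MAD
    scores = [(d - center) / scale for d in deviation]
    flagged = [i for i, z in enumerate(scores) if z > threshold]
    return scores, flagged


# Made-up cohort: five samples share one shape at different depths;
# the sixth loses coverage at two interior positions.
base = [20, 40, 40, 40, 40, 40, 40, 20]
cohort = [[k * d for d in base] for k in (1, 2, 3, 4, 5)]
cohort.append([40, 80, 80, 0, 0, 80, 80, 40])

scores, flagged = shape_outliers(cohort)
print(flagged)
```

Working in log space and removing each sample's own level means a sample is flagged for the form of its profile rather than for how highly the gene is expressed, which is the intuition behind looking at shape instead of abundance.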
Keyphrases
  • rna seq
  • single cell
  • high throughput
  • high throughput sequencing
  • copy number
  • electronic health record
  • machine learning
  • big data
  • small molecule
  • healthcare
  • transcription factor
  • gene expression
  • working memory