CirComPara: A Multi-Method Comparative Bioinformatics Pipeline to Detect and Study circRNAs from RNA-seq Data.
Enrico Gaffo, Annagiulia Bonizzato, Geertruy Te Kronnie, Stefania Bortoluzzi
Published in: Non-coding RNA (2017)
Circular RNAs (circRNAs) are generated by backsplicing of immature RNA, forming covalently closed loops of intron/exon RNA molecules. Pervasiveness, evolutionary conservation, massive and regulated expression, and posttranscriptional regulatory roles of circRNAs in eukaryotes have been appreciated and described only recently. Moreover, being easily detectable disease markers, circRNAs undoubtedly represent a molecular class with high bearing on molecular pathobiology. CircRNAs can be detected from RNA-seq data using appropriate computational methods to identify the sequence reads spanning backsplice junctions that do not colinearly map to the reference genome. To this end, several programs were developed, and critical assessment of various strategies and tools suggested the combination of at least two methods as good practice to guarantee robust circRNA detection. Here, we present CirComPara (http://github.com/egaffo/CirComPara), an automated bioinformatics pipeline to detect, quantify and annotate circRNAs from RNA-seq data, running four different backsplice identification methods in parallel. CirComPara also provides quantification of linear RNAs and gene expression, ultimately comparing and correlating circRNA and gene/transcript expression levels. We applied our method to RNA-seq data of monocyte and macrophage samples in relation to haploinsufficiency of the RNA-binding splicing factor Quaking (QKI). The biological relevance of the results, in terms of number, types and variations of circRNAs expressed, illustrates CirComPara's potential to enlarge the knowledge of the transcriptome, adding details on the circRNAome, and facilitating further computational and experimental studies.
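The "at least two methods" practice mentioned in the abstract can be illustrated with a minimal sketch. This is not CirComPara's actual code: the function name `consensus_backsplices`, the method labels `method_A` to `method_D`, the tuple layout of a junction, and the coordinates are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def consensus_backsplices(calls_by_method, min_methods=2):
    """Keep backsplice junctions reported by at least `min_methods` methods.

    `calls_by_method` maps a (hypothetical) method name to an iterable of
    junctions, each given as a (chrom, start, end, strand) tuple.
    Returns a dict: junction -> sorted list of supporting method names.
    """
    support = defaultdict(set)
    for method, junctions in calls_by_method.items():
        for junction in junctions:
            support[junction].add(method)
    return {
        junction: sorted(methods)
        for junction, methods in support.items()
        if len(methods) >= min_methods
    }


# Made-up calls from four placeholder detection methods.
calls = {
    "method_A": [("chr1", 100, 500, "+"), ("chr2", 40, 900, "-")],
    "method_B": [("chr1", 100, 500, "+")],
    "method_C": [("chr3", 10, 70, "+"), ("chr2", 40, 900, "-")],
    "method_D": [],
}

robust = consensus_backsplices(calls)
for junction, methods in sorted(robust.items()):
    print(junction, "supported by", methods)
```

In this sketch a junction counts as shared only when chromosome, start, end and strand match exactly. Real tools may differ in coordinate conventions, so their outputs would need to be harmonized before such a comparison.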
Keyphrases
- rna seq
- gene expression
- single cell
- genome wide
- poor prognosis
- electronic health record
- healthcare
- dna methylation
- big data
- primary care
- single molecule
- public health
- risk assessment
- machine learning
- binding protein
- neural network
- data analysis
- loop mediated isothermal amplification
- quantum dots
- human health
- amino acid