Automated Recognition of RNA Structure Motifs by Their SHAPE Data Signatures.
Pierce Radecki, Mirko Ledda, Sharon Aviran
Published in: Genes (2018)
High-throughput structure profiling (SP) experiments that provide information at nucleotide resolution are revolutionizing our ability to study RNA structures. Of particular interest are RNA elements whose underlying structures are necessary for their biological functions. We previously introduced patteRNA, an algorithm for rapidly mining SP data for patterns characteristic of such motifs. This work provided a proof-of-concept for the detection of motifs and the capability of distinguishing structures displaying pronounced conformational changes. Here, we describe several improvements and automation routines to patteRNA. We then consider more elaborate biological situations starting with the comparison or integration of results from searches for distinct motifs and across datasets. To facilitate such analyses, we characterize patteRNA’s outputs and describe a normalization framework that regularizes results. We then demonstrate that our algorithm successfully discerns between highly similar structural variants of the human immunodeficiency virus type 1 (HIV-1) Rev response element (RRE) and readily identifies its exact location in whole-genome structure profiles of HIV-1. This work highlights the breadth of information that can be gleaned from SP data and broadens the utility of data-driven methods as tools for the detection of novel RNA elements.
Keyphrases
- human immunodeficiency virus
- antiretroviral therapy
- hepatitis c virus
- high throughput
- hiv infected
- hiv positive
- machine learning
- electronic health record
- hiv aids
- deep learning
- high resolution
- big data
- hiv testing
- genome wide
- nucleic acid
- men who have sex with men
- loop mediated isothermal amplification
- gene expression
- data analysis
- health information
- healthcare
- south africa
- dna methylation
- neural network