TRIPBASE: a database for identifying the human genomic DNA and lncRNA triplexes.
Tzu-Chieh Lin, Yen-Ling Liu, Yu-Ting Liu, Wan-Hsin Liu, Zong-Yan Liu, Kai-Li Chang, Chin-Yao Chang, Hung Chih Ni, Jia-Hsin Huang, Huai-Kuang Tsai
Published in: NAR Genomics and Bioinformatics (2023)
Long non-coding RNAs (lncRNAs) are defined as RNA sequences longer than 200 nt with no coding capacity. They participate in various biological mechanisms and are abundant across diverse species. There is well-documented evidence that lncRNAs can interact with genomic DNA by forming triple helices (triplexes). Several computational methods based on the Hoogsteen base-pairing rules have previously been designed to find theoretical RNA-DNA:DNA triplexes. While powerful, these methods suffer from a high false-positive rate when their predicted triplexes are compared with biological experiments. To address this issue, we first collected experimental data on genomic RNA-DNA triplexes from antisense oligonucleotide (ASO)-mediated capture assays and used Triplexator, the most widely used tool for predicting lncRNA-DNA interactions, to reveal the intrinsic information on true triplex binding potential. Based on this analysis, we propose six computational attributes as filters that improve in silico triplex prediction by removing most false positives. Further, we have built a new database, TRIPBASE, as the first comprehensive collection of genome-wide triplex predictions for human lncRNAs. The TRIPBASE user interface allows scientists to apply customized filtering criteria to access the potential triplexes of human lncRNAs in the cis-regulatory regions of the human genome. TRIPBASE can be accessed at https://tripbase.iis.sinica.edu.tw/.
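To make the idea of rule-based prediction followed by filtering concrete, here is a minimal Python sketch. It is not the TRIPBASE pipeline or Triplexator. It checks only the parallel pyrimidine-motif Hoogsteen pairs (U with A, C with G on the DNA purine strand), and the function names and the two thresholds (`min_length`, `max_error_rate`) are hypothetical stand-ins, not the six attributes proposed in the paper.

```python
# Pyrimidine-motif Hoogsteen pairs (parallel orientation):
# RNA third-strand base -> base on the purine strand of the DNA duplex.
HOOGSTEEN_PYRIMIDINE_MOTIF = {"U": "A", "C": "G"}


def count_mismatches(rna, dna_purine_strand):
    """Count positions where the RNA base cannot Hoogsteen-pair with the DNA base."""
    if len(rna) != len(dna_purine_strand):
        raise ValueError("sequences must have equal length")
    return sum(
        HOOGSTEEN_PYRIMIDINE_MOTIF.get(r) != d
        for r, d in zip(rna.upper(), dna_purine_strand.upper())
    )


def passes_filters(rna, dna_purine_strand, min_length=15, max_error_rate=0.1):
    """Hypothetical filter: keep candidates that are long enough and nearly error-free.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    length = len(rna)
    if length < min_length:
        return False
    return count_mismatches(rna, dna_purine_strand) / length <= max_error_rate
```

A candidate such as `passes_filters("UCUCUCUCUCUCUCUC", "AGAGAGAGAGAGAGAG")` would be kept under these assumed thresholds, while a 3-nt candidate would be discarded for being too short. Real tools also handle purine and mixed motifs, antiparallel binding, and gapped alignments, which this sketch leaves out.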
Keyphrases
- endothelial cells
- circulating tumor
- long non coding rna
- nucleic acid
- genome wide
- cell free
- single molecule
- induced pluripotent stem cells
- dna methylation
- network analysis
- risk assessment
- pluripotent stem cells
- healthcare
- high throughput
- poor prognosis
- climate change
- transcription factor
- human health
- long noncoding rna
- genome wide identification