Multi-omics analysis reveals the functional transcription and potential translation of enhancers.
Yingcheng Wu, Yang Yang, Hongyan Gu, Baorui Tao, Erhao Zhang, Jinhuan Wei, Zhou Wang, Aifen Liu, Rong Sun, Miaomiao Chen, Yihui Fan, Renfang Mao
Published in: International Journal of Cancer (2020)
Enhancers can be transcribed into RNAs; however, most of these transcripts are neglected in traditional RNA-seq analysis workflows. Here, we developed a Pipeline for Enhancer Transcription (PET, http://fun-science.club/PET) for quantifying enhancer RNAs (eRNAs) from RNA-seq data. By applying this pipeline to lung cancer samples and cell lines, we show that transcribed enhancers are enriched with histone marks and transcription factor motifs (JUNB, Hand1-Tcf3 and GATA4). By training a machine learning model, we demonstrate that enhancers can predict prognosis better than their nearby genes. Integrating Hi-C, ChIP-seq and RNA-seq data, we observe that transcribed enhancers are associated with cancer hallmarks or oncogenes, among which LcsMYC-1 (Lung cancer-specific MYC eRNA-1) potentially supports MYC expression. Surprisingly, a significant proportion of transcribed enhancers contain small protein-coding open reading frames (sORFs) and can be translated into microproteins. Our study provides a computational method for eRNA quantification and deepens our understanding of the DNA, RNA and protein nature of enhancers.
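To illustrate what scanning a transcribed enhancer for sORFs can involve, the following is a minimal sketch, not the authors' method or part of PET. It searches the forward strand of a DNA sequence for ATG-initiated open reading frames that end at an in-frame stop codon. The function name `find_sorfs` and the length bounds (10 to 100 codons) are hypothetical choices made for illustration; the abstract does not state the thresholds used in the study.

```python
# Hypothetical length bounds in codons (start codon included, stop excluded).
# These are illustrative assumptions; the source does not give thresholds.
MIN_CODONS = 10
MAX_CODONS = 100
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, min_codons=MIN_CODONS, max_codons=MAX_CODONS):
    """Return (start, end, n_codons) tuples for forward-strand ORFs.

    start/end are 0-based, end-exclusive coordinates that include the
    stop codon. Only the first ATG before each in-frame stop is reported.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon until an in-frame stop or the sequence end.
            j = i + 3
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop: nothing further to find in this frame
            n_codons = (j - i) // 3
            if min_codons <= n_codons <= max_codons:
                hits.append((i, j + 3, n_codons))
            i = j + 3  # resume the scan after the stop codon
    return hits


if __name__ == "__main__":
    # Toy sequence: ATG followed by 12 alanine codons and a TAA stop.
    toy = "CC" + "ATG" + "GCT" * 12 + "TAA" + "GG"
    print(find_sorfs(toy))
```

A fuller analysis would also scan the reverse complement and would need independent evidence of translation; an ORF found this way is only a candidate.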
Keyphrases
- rna seq
- transcription factor
- single cell
- binding protein
- machine learning
- genome wide identification
- high throughput
- dna binding
- computed tomography
- electronic health record
- positron emission tomography
- squamous cell carcinoma
- gene expression
- big data
- amino acid
- protein protein
- circulating tumor
- circulating tumor cells
- squamous cell
- artificial intelligence
- deep learning
- human health