Epigenetic Stress and Long-Read cDNA Sequencing of Sunflower (Helianthus annuus L.) Revealed the Origin of the Plant Retrotranscriptome.

Ilya N Rozochkin, Pavel Merkulov, Ekaterina Polkhovskaya, Zakhar S Konstantinov, Mikhail Kazancev, Ksenia Saenko, Alexander Polkhovskiy, Maxim Dudnikov, Tsovinar Garibyan, Yakov Demurin, Alexander Soloviev
Published in: Plants (Basel, Switzerland) (2022)
Transposable elements (TEs) contribute not only to genome diversity but also to transcriptome diversity in plants. To unravel the sources of LTR retrotransposon (RTE) transcripts in sunflower, we exploited a recently developed transposon activation method ('TEgenesis') along with long-read cDNA Nanopore sequencing. This approach allowed the identification of 56 RTE transcripts from different genomic loci, including full-length and non-autonomous RTEs. Using mobilome analysis, we provide a new set of expressed and transpositionally active sunflower RTEs for future studies. Among them, a Ty3/Gypsy RTE called SUNTY3 exhibited ongoing transposition activity, as detected by eccDNA analysis. We showed that the sunflower genome contains a diverse set of non-autonomous RTEs encoding a single RTE protein, including the previously described TR-GAG (terminal repeat with the GAG domain) as well as new categories: TR-RT-RH, TR-RH, and TR-INT-RT. Our results demonstrate that 40% of the loci for RTE-related transcripts (nonLTR-RTEs) lack their LTR sequences and resemble conventional eukaryotic genes encoding RTE-related proteins with unknown functions. Phylogenetic analysis showed that three nonLTR-RTEs encode GAG (HadGAG1-3) fused to a host protein. These HadGAG proteins have homologs in other plant species, potentially indicating GAG domestication. Ultimately, we found that the sunflower retrotranscriptome originated from the transcription of active RTEs, non-autonomous RTEs, and gene-like RTE transcripts, including those encoding domesticated proteins.
Keyphrases
  • genome wide
  • dna methylation
  • single cell
  • copy number
  • single molecule
  • gene expression
  • amino acid
  • binding protein
  • drinking water
  • genome wide association
  • drug induced