Zebrafish transposable elements show extensive diversification in age, genomic distribution, and developmental expression.
Ni-Chen Chang, Quirze Rovira, Jonathan N Wells, Cédric Feschotte, Juan M Vaquerizas
Published in: Genome Research (2022)
There is considerable interest in understanding the effect of transposable elements (TEs) on embryonic development. Studies in humans and mice are limited by the difficulty of working with mammalian embryos and by the relative scarcity of active TEs in these organisms. The zebrafish is an outstanding model for the study of vertebrate development, and over half of its genome consists of diverse TEs. However, zebrafish TEs remain poorly characterized. Here we describe the demography and genomic distribution of zebrafish TEs and their expression throughout embryogenesis using bulk and single-cell RNA sequencing data. These results reveal a highly dynamic genomic ecosystem comprising nearly 2000 distinct TE families, which vary in copy number by four orders of magnitude and span a wide range of ages. Longer retroelements tend to be retained in intergenic regions, whereas short interspersed nuclear elements (SINEs) and DNA transposons are more frequently found near or within genes. Locus-specific mapping of TE expression reveals extensive TE transcription during development. Although two-thirds of TE transcripts are likely driven by nearby gene promoters, we still observe stage- and tissue-specific expression patterns in self-regulated TEs. Long terminal repeat (LTR) retroelements are most transcriptionally active immediately following zygotic genome activation, whereas DNA transposons are enriched among transcripts expressed in later stages of development. Single-cell analysis reveals several endogenous retroviruses expressed in specific somatic cell lineages. Overall, our study provides a valuable resource for using zebrafish as a model to study the impact of TEs on vertebrate development.
Keyphrases
- copy number
- single cell
- genome wide
- rna seq
- gene expression
- transcription factor
- high throughput