Finding and extending ancient simple sequence repeat-derived regions in the human genome.

Jonathan A Shortt, Robert P Ruggiero, Corey Cox, Aaron C Wacholder, David D Pollock
Published in: Mobile DNA (2020)
Our analysis indicates that likely SSR-derived sequence makes up 6.77% of the human genome, over twice as much as previous estimates, and includes millions of newly identified ancient SSR-derived loci. SSR-clouds identified poly-A sequences adjacent to transposable element termini in over 74% of the oldest class of Alu (roughly, AluJ), validating the sensitivity of the approach. Poly-A sequences annotated by SSR-clouds also had a length distribution that was more consistent with their poly-A origins, with a mean of about 35 bp even in older Alus. This work demonstrates that the high sensitivity provided by SSR-clouds improves the detection of SSR-derived regions and will enable deeper analysis of how decaying repeats contribute to genome structure.
Keyphrases
  • genetic diversity
  • endothelial cells
  • genome wide
  • physical activity
  • pluripotent stem cells
  • gene expression