Long-read-based genome assembly reveals numerous endogenous viral elements in the green algal bacterivore Cymbomonas tetramitiformis.
Yangtsho Gyaltshen, Andrey Rozenberg, Amber Paasch, John A Burns, Sally Warring, Raegan Larson, Xyrus X Maurer-Alcalá, Joel Dacks, Apurva Narechania, Eunsoo Kim
Published in: Genome biology and evolution (2023)
The marine tetraflagellate Cymbomonas tetramitiformis has drawn attention as an early diverging green alga that uses a phago-mixotrophic mode of nutrition (i.e., the ability to derive nourishment from both photosynthesis and bacterial prey). The Cymbomonas nuclear genome was sequenced previously, but because only short-read (Illumina) data were used, the assembly was missing a large proportion of the genome's repeat regions. For this study, we generated Oxford Nanopore long-read and additional short-read Illumina data and performed a hybrid assembly that significantly improved the total assembly size and contiguity. Numerous endogenous viral elements were identified in the repeat regions of the new assembly. These include the complete genome of a giant Algavirales virus along with many genomes of integrated Polinton-like viruses (PLVs) from two groups: Gezel-like PLVs and a novel group of prasinophyte-specific PLVs. The integrated ∼400 Kbp genome of the giant Algavirales virus is the first account of the association of the uncultured viral family AG_03 with green algae. The complete PLV genomes from C. tetramitiformis ranged between 15 and 25 Kbp in length and showed a diverse gene content. In addition, heliorhodopsin gene-containing repeat elements of mirusvirus origin were identified. These results illustrate multiple past (and possibly ongoing) alga-virus interactions that accompanied the genome evolution of C. tetramitiformis.