Tracing foreign sequences in plant transcriptomes and genomes using OCT4, a POU domain protein.

Adeleh Saffar, Maryam M Matin
Published in: Molecular genetics and genomics : MGG (2021)
Contamination in sequencing data, especially in reference genomes, inevitably leads to errors in downstream analyses. Similarly, the presence of contaminants in transcriptomes misrepresents the molecular basis of various interactions. In this study, we report a large number of plant transcriptomes contaminated with RNAs encoding POU domain proteins, a family of proteins that has not been reported in plants or fungi. In addition, our findings showed that the reference genome of Rhodamnia argentea contains four POU domain protein-coding sequences. These foreign fragments turned out to be related to arthropods that are considered plant pests. We also identified two contaminated draft genomes, Humulus lupulus and Cannabis sativa, that contained complete rDNA sequences originating from Tetranychus species. We therefore recommend careful screening of sequencing data before release in public databases, as well as checking existing genomes for possible contamination.