Identification of annotation artifacts concerning the chalcone synthase (CHS).
Martin Bartas, Adriana Volna, Jiri Cerven, Boas Pucker. Published in: BMC Research Notes (2023)
CHS genes with an apparent triplication of the CHS domain-encoding region were discovered through database searches. Such genes were found in Macadamia integrifolia, Musa balbisiana, Musa troglodytarum, and Nymphaea colorata. A manual inspection of the CHS gene models in these four species, supported by extensive RNA-seq data, suggests that these gene models result from artificial fusions during the annotation process. While there are hundreds of seemingly correct CHS records in the databases, it is not clear why these annotation artifacts appeared.
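As an illustration only, the kind of screen that could surface such records can be sketched in a few lines of Python. This is not the authors' procedure, which the abstract does not describe. The sketch assumes a domain search has already been run and that its hits are available as (protein ID, start, end) tuples. The IDs, coordinates, and function names below are hypothetical.

```python
from collections import defaultdict


def count_nonoverlapping(hits):
    """Greedily count non-overlapping (start, end) intervals.

    Coordinates are assumed to be 1-based and inclusive.
    """
    count = 0
    last_end = 0
    # Sorting by end coordinate makes the greedy choice optimal.
    for start, end in sorted(hits, key=lambda h: h[1]):
        if start > last_end:
            count += 1
            last_end = end
    return count


def flag_multi_domain(domain_hits, min_copies=3):
    """Return IDs of proteins with at least `min_copies` separate domain hits."""
    per_protein = defaultdict(list)
    for protein_id, start, end in domain_hits:
        per_protein[protein_id].append((start, end))
    return sorted(
        pid
        for pid, hits in per_protein.items()
        if count_nonoverlapping(hits) >= min_copies
    )


# Hypothetical domain hits: (protein ID, start, end). Not real data.
example_hits = [
    ("model_A", 10, 380),    # one domain copy
    ("model_B", 12, 385),    # three separate copies: candidate fusion
    ("model_B", 400, 770),
    ("model_B", 790, 1160),
    ("model_C", 5, 370),     # two overlapping hits count as one copy
    ("model_C", 200, 390),
]

flagged = flag_multi_domain(example_hits)
print(flagged)
```

A record flagged this way is only a candidate. As in the study, confirming that a multi-domain gene model is an artificial fusion would still require manual inspection against transcript evidence such as RNA-seq data.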