Updated Gene Prediction of the Cucumber (9930) Genome through Manual Annotation.
Weixuan DuLei XiaRui LiXiaokun ZhaoDanna JinXiaoning WangYun PeiRong ZhouJinfeng ChenXia-Qing YuPublished in: Plants (Basel, Switzerland) (2024)
Thorough and precise gene structure annotations are essential for maximizing the benefits of genomic data and unveiling valuable genetic insights. The cucumber genome was first released in 2009 and updated in 2019. To increase the accuracy of the predicted gene models, 64 published RNA-seq data and 9 new strand-specific RNA-seq data from multiple tissues were used for manual comparison with the gene models. The updated annotation file (V3.1) contains an increased number (24,145) of predicted genes compared to the previous version (24,317 genes), with a higher BUSCO value of 96.9%. A total of 6231 and 1490 transcripts were adjusted and newly added, respectively, accounting for 31.99% of the overall gene tally. These newly added and adjusted genes were renamed (CsaV3.1_XGXXXXX), while genes remaining unaltered preserved their original designations. A random selection of 21 modified/added genes were validated using RT-PCR analyses. Additionally, tissue-specific patterns of gene expression were examined using the newly obtained transcriptome data with the revised gene prediction model. This improved annotation of the cucumber genome will provide essential and accurate resources for studies in cucumber.