Genome annotation of Caenorhabditis briggsae by TEC-RED identifies new exons, paralogs, and conserved and novel operons.

Nikita Jhaveri, Wouter van den Berg, Byung Joon Hwang, Hans-Michael Muller, Paul W Sternberg, Bhagwati P Gupta
Published in: G3 (Bethesda, Md.) (2022)
The nematode Caenorhabditis briggsae is routinely used in comparative and evolutionary studies involving its well-known cousin Caenorhabditis elegans. The C. briggsae genome sequence has accelerated research by facilitating the generation of new resources, tools, and functional studies of genes. While substantial progress has been made in predicting genes and start sites, experimental evidence is still lacking in many cases. Here, we report an improved annotation of the C. briggsae genome using the trans-spliced exon coupled RNA end determination (TEC-RED) technique. In addition to identifying the 5' ends of expressed genes, we have discovered operons and paralogs. In summary, our analysis yielded 10,243 unique 5' end sequence tags with matches in the C. briggsae genome. Of these, 6,395 were found to represent 4,252 unique genes along with 362 paralogs and 52 previously unknown exons. These genes included 14 that are exclusively trans-spliced in C. briggsae when compared with C. elegans orthologs. A major contribution of this study is the identification of 492 high-confidence operons, of which two-thirds are fully supported by tags. In addition, 2 SL1-type operons were discovered. Interestingly, comparisons with C. elegans showed that only 40% of operons are conserved. Of the remaining operons, 73 are novel, including 12 that entirely lack orthologs in C. elegans. Further analysis revealed that 4 of the 12 novel operons are conserved in Caenorhabditis nigoni. Altogether, the work described here has significantly advanced our understanding of the C. briggsae system and serves as a rich resource to aid biological studies involving this species.
Keyphrases
  • genome wide
  • bioinformatics analysis
  • dna methylation
  • genome wide identification
  • transcription factor
  • case control
  • genome wide analysis
  • single cell
  • high resolution
  • simultaneous determination