Construction of a de novo assembly pipeline using multiple transcriptome data sets from Cypripedium macranthos (Orchidaceae).

Kota Kambara, Kaien Fujino, Hanako Shimura
Published in: PloS one (2023)
The family Orchidaceae contains more species than any other monocotyledonous family and has distinctive characteristics, such as seed germination induced by mycorrhizal fungi and floral morphology co-adapted with pollinators. Genomes have been sequenced for only a few horticultural orchid species, so little genetic information is available for the family. For species lacking a sequenced genome, gene sequences are generally predicted by de novo assembly of transcriptome data. Here, we devised a de novo assembly pipeline for transcriptome data from the wild orchid Cypripedium macranthos (lady's slipper orchid) in Japan that mixes multiple data sets and integrates assemblies to create a more complete and less redundant contig set. Among the assemblies generated by combining various assemblers, the combination of Trinity and IDBA-Tran yielded a good assembly, with higher mapping rates and higher percentages of contigs with BLAST hits and of complete BUSCOs. Using this contig set as a reference, we analyzed differential gene expression between protocorms grown aseptically and protocorms grown with mycorrhizal fungi to detect genes whose expression is required for the mycorrhizal interaction. The pipeline proposed in this study can construct a highly reliable contig set with little redundancy even when multiple transcriptome data sets are mixed, and can provide a reference suitable for differentially expressed gene (DEG) analysis and other downstream RNA-seq analyses.
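The idea of integrating assemblies into a less redundant contig set can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not name the merging tool, so the containment-based filter below, the function names merge_assemblies and n50, and the toy sequences are all assumptions made for illustration. Real pipelines typically rely on dedicated clustering tools and tolerate mismatches, which this exact-match version does not.

```python
from typing import Dict, List

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_assemblies(assemblies: Dict[str, List[str]]) -> List[str]:
    """Pool contigs from several assemblies and drop redundant ones.

    Hypothetical helper: a contig is treated as redundant when it is
    identical to, or fully contained in, a longer contig that was already
    kept, on either strand. Only exact matches are detected.
    """
    pooled = [seq.upper() for contigs in assemblies.values() for seq in contigs]
    # Longest first, so shorter contigs are compared against longer ones.
    pooled.sort(key=len, reverse=True)

    kept: List[str] = []
    for seq in pooled:
        rc = reverse_complement(seq)
        if any(seq in longer or rc in longer for longer in kept):
            continue  # exact duplicate or substring of a kept contig
        kept.append(seq)
    return kept


def n50(contigs: List[str]) -> int:
    """Length L such that contigs of length >= L cover half the total bases."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


if __name__ == "__main__":
    # Toy contigs standing in for the output of two assemblers (made up).
    toy = {
        "assembler_a": ["ATGGCCATTGTAATGGGCCGC", "ATGGCC"],
        "assembler_b": ["GCGGCCCATTACAATGGCCAT", "TTTTAAAACCCC"],
    }
    merged = merge_assemblies(toy)
    print(len(merged), "contigs kept; N50 =", n50(merged))
```

Sorting longest-first keeps the most complete version of each transcript and means every shorter contig is checked only against contigs that could contain it. The comparison is quadratic in the number of contigs, so this form is only suitable for small examples.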
Keyphrases
  • rna seq
  • gene expression
  • single cell
  • genome wide
  • electronic health record
  • big data
  • dna methylation
  • genetic diversity
  • data analysis
  • healthcare
  • machine learning
  • high resolution
  • mass spectrometry
  • social media