Login / Signup

slag: A program for seeded local assembly of genes in complex genomes.

Charles F CraneJill A NemacheckSubhashree SubramanyamChristie E WilliamsStephen B Goodwin
Published in: Molecular ecology resources (2022)
Although finished genomes have become more common, there is still a need for assemblies of individual genes or chromosomal regions when only unassembled reads are available. slag (Seeded Local Assembly of Genes) fulfils this need by performing iterative local assembly based on cycles of matching-read retrieval with blast and assembly with cap3, phrap, spades, canu or unicycler. The target sequence can be nucleotide or protein. Read fragmentation allows slag to use phrap or cap3 to assemble long reads at lower coverage (e.g., 5×) than is possible with canu or unicycler. In simple, nonrepetitive genomes, a slag assembly can cover a whole chromosome, but in complex genomes the growth of target-matching contigs is limited as additional reads are consumed by consensus contigs consisting of repetitive elements. Apart from genomic complexity, contig length and correctness depend on read length and accuracy. With pyrosequencing or Illumina reads, slag-assembled contigs are accurate enough to allow design of PCR primers, while contigs assembled from Oxford Nanopore or pre-HiFi Pacific Biosciences long reads are generally only accurate enough to design baiting sequences for further targeted sequencing. In an application with real reads, slag successfully extended sequences for four wheat genes, which were verified by cloning and Sanger sequencing of overlapping amplicons. slag is a robust alternative to atram2 for local assemblies, especially for read sets with less than 20× coverage. slag is freely available at https://github.com/cfcrane/SLAG.
Keyphrases
  • single molecule
  • genome wide
  • copy number
  • high resolution
  • genome wide identification
  • gene expression
  • magnetic resonance
  • dna methylation
  • quality improvement
  • clinical practice
  • protein protein
  • image quality