Needlestack: an ultra-sensitive variant caller for multi-sample next generation sequencing data.

Tiffany Myriam Delhomme, Patrice H Avogbe, Aurélie A G Gabriel, Nicolas Alcala, Noemie Leblay, Catherine Voegele, Maxime Vallée, Priscilia Chopard, Amélie Chabrier, Behnoush Abedi-Ardekani, Valérie Gaborieau, Ivana Holcatova, Vladimir Janout, Lenka Foretová, Sasa Milosavljevic, David Zaridze, Anush Mukeriya, Elisabeth Brambilla, Paul Brennan, Ghislaine Scelo, Lynnette Fernandez-Cuesta, Graham Byrnes, Florence L Calvez-Kelm, James D McKay, Matthieu Foll
Published in: NAR Genomics and Bioinformatics (2020)
The emergence of next-generation sequencing (NGS) has revolutionized how genome sequences are obtained, with the promise of providing a comprehensive characterization of DNA variation. Nevertheless, detecting somatic mutations remains a difficult problem, in particular when trying to identify low-abundance mutations, such as subclonal mutations, tumour-derived alterations in body fluids or somatic mutations in histologically normal tissue. The main challenge is to distinguish precisely between sequencing artefacts and true mutations, particularly when the latter are so rare that they reach abundance levels similar to those of artefacts. Here, we present needlestack, a highly sensitive variant caller that learns the level of systematic sequencing errors directly from the data in order to call mutations accurately. Needlestack is based on the idea that the sequencing error rate can be dynamically estimated by analysing multiple samples together. We show that the sequencing error rate varies across alterations, illustrating the need to estimate it precisely. We evaluate the performance of needlestack for various types of variation, and we show that needlestack is robust across positions and outperforms existing state-of-the-art methods for low-abundance mutations. Needlestack and its source code are freely available on GitHub: https://github.com/IARCbioinfo/needlestack.
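The core idea, estimating the error rate at a position from many samples and flagging samples that carry far more alternate reads than that rate explains, can be illustrated with a short sketch. This is not needlestack's statistical model, which the abstract does not describe. It is a simplified stand-in that assumes the median alternate-allele fraction across samples is a reasonable robust estimate of the error rate, and that read counts are binomial. The function names, the `alpha` threshold and the `min_rate` floor are all hypothetical choices made for illustration.

```python
import math
from statistics import median


def log_binom_pmf(i, n, p):
    """Log of P(X = i) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p))


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k <= 0:
        return 1.0
    total = sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))
    return min(1.0, total)


def call_outliers(alt_counts, depths, alpha=1e-3, min_rate=1e-6):
    """Illustrative outlier caller for one position and one alteration.

    alt_counts[i] and depths[i] are the alternate-read count and total
    coverage of sample i. The error rate is estimated as the median
    alternate fraction across samples (an assumption of this sketch), and
    a sample is flagged when its count is improbably high under that rate.
    Returns (estimated_rate, [(sample_index, p_value), ...]).
    """
    fractions = [a / d for a, d in zip(alt_counts, depths) if d > 0]
    # Clamp the rate away from 0 and 1 so the logarithms stay defined.
    rate = min(max(median(fractions), min_rate), 0.5)
    calls = []
    for idx, (a, d) in enumerate(zip(alt_counts, depths)):
        if d == 0:
            continue  # no coverage, nothing to test
        p_value = binom_upper_tail(a, d, rate)
        if p_value < alpha:
            calls.append((idx, p_value))
    return rate, calls


if __name__ == "__main__":
    # Hypothetical counts: eight samples at 1000x coverage, one with 40 alternate reads.
    rate, calls = call_outliers([1, 2, 0, 1, 3, 2, 1, 40], [1000] * 8)
    print(rate, calls)
```

Using the median makes the error estimate insensitive to a small number of true carriers, which is why analysing many samples together helps. A production caller would additionally need to handle overdispersion, varying coverage and multiple testing, none of which this sketch attempts.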