A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.

Tyler S Alioto, Ivo Buchhalter, Sophia Derdak, Barbara Hutter, Matthew D Eldridge, Eivind Hovig, Lawrence E Heisler, Timothy A Beck, Jared T Simpson, Laurie Tonon, Anne-Sophie Sertier, Ann-Marie Patch, Natalie Jäger, Philip Ginsbach, Ruben M Drews, Nagarajan Paramasivam, Rolf Kabbe, Sasithorn Chotewutmontri, Nicolle Diessl, Christopher Previti, Sabine Schmidt, Benedikt Brors, Lars Feuerbach, Michael Heinold, Susanne Gröbner, Andrey Korshunov, Patrick S Tarpey, Adam P Butler, Jonathan Hinton, David Jones, Andrew Menzies, Keiran M Raine, Rebecca Shepherd, Lucy Stebbings, Jon W Teague, Paolo Ribeca, Francesc Castro Giner, Sergi Beltran, Emanuele Raineri, Marc Dabad, Simon Charles Heath, Marta Gut, Robert E Denroche, Nicholas J Harding, Takafumi N Yamaguchi, Akihiro Fujimoto, Hidewaki Nakagawa, Víctor Quesada, Rafael Valdés-Mas, Sigve Nakken, Daniel Vodák, Lawrence Bower, Andrew G Lynch, Charlotte L Anderson, Nicola Waddell, John V Pearson, Sean M Grimmond, Myron Peto, Paul Spellman, Minghui He, Cyriac Kandoth, Semin Lee, John Zhang, Louis Létourneau, Singer Ma, Sahil B Seth, David Torrents, Liu Xi, David A Wheeler, Carlos López-Otín, Elías Campo, Peter J Campbell, Paul C Boutros, Xose S Puente, Daniela S Gerhard, Stefan M Pfister, John D McPherson, Thomas J Hudson, Matthias Schlesner, Peter Lichter, Roland Eils, David T W Jones, Ivo Glynne Gut
Published in: Nature Communications (2015)
As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.
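
To make the two quantities in the abstract concrete, the following is a minimal sketch of how accuracy against a benchmark mutation set and concordance between pipelines could be computed. It is not the authors' evaluation code. The variant records, the pipeline names, and the helper functions `precision_recall` and `jaccard` are all hypothetical, and the sketch assumes that two calls count as the same mutation only when chromosome, position, reference allele and alternate allele all match.

```python
from itertools import combinations

# A somatic SNV call is keyed as (chromosome, 1-based position, ref, alt).
# Every variant and pipeline name here is invented for illustration.
benchmark = {
    ("1", 1000, "A", "T"),
    ("2", 2000, "C", "G"),
    ("3", 3000, "G", "A"),
    ("4", 4000, "T", "C"),
}

calls = {
    "pipeline_a": {
        ("1", 1000, "A", "T"),
        ("2", 2000, "C", "G"),
        ("5", 5000, "A", "G"),  # not in the benchmark: a false positive
    },
    "pipeline_b": {
        ("1", 1000, "A", "T"),
        ("3", 3000, "G", "A"),
        ("4", 4000, "T", "C"),
    },
}


def precision_recall(called, truth):
    """Return (precision, recall) of a call set against a truth set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def jaccard(a, b):
    """Concordance of two call sets as intersection over union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Accuracy of each pipeline against the benchmark set.
accuracy = {name: precision_recall(called, benchmark) for name, called in calls.items()}

# Pairwise concordance between pipelines, independent of the benchmark.
concordance = {
    (left, right): jaccard(calls[left], calls[right])
    for left, right in combinations(sorted(calls), 2)
}

for name, (precision, recall) in accuracy.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
for pair, score in concordance.items():
    print(f"{pair[0]} vs {pair[1]}: jaccard={score:.2f}")
```

In this toy data both pipelines perform reasonably against the benchmark yet share only one call with each other, which illustrates how low concordance among pipelines can coexist with acceptable individual accuracy.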