A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis.

Megan Shand, Jose Soto, Lee Lichtenstein, David Benjamin, Yossi Farjoun, Yehuda Brody, Yosef Maruvka, Paul C Blainey, Eric Banks
Published in: Communications Biology (2020)
Existing cancer benchmark data sets for human sequencing data use germline variants, synthetic methods, or expensive validations, none of which satisfactorily provides a large collection of true somatic variation across a whole genome. Here we propose a data set, Lineage derived Somatic Truth (LinST), of short somatic mutations in the HT115 colon cancer cell line, validated using a known cell lineage. The data set includes thousands of mutations and a high-confidence region covering 2.7 gigabases per sample.
Keyphrases
  • single cell
  • copy number
  • electronic health record
  • big data
  • papillary thyroid
  • squamous cell
  • bone marrow
  • mesenchymal stem cells
  • dna methylation
  • dna damage
  • cell fate
  • pluripotent stem cells