St. Jude Cloud: A Pediatric Cancer Genomic Data-Sharing Ecosystem.
Clay McLeod, Alexander M Gout, Xin Zhou, Andrew Thrasher, Delaram Rahbarinia, Samuel W Brady, Michael Macias, Kirby Birch, David Finkelstein, Jobin Sunny, Rahul Mudunuri, Brent A Orr, Madison Treadway, Bob Davidson, Tracy K Ard, Arthur Chiao, Andrew Swistak, Stephanie Wiggins, Scott Foy, Jian Wang, Edgar Sioson, Shuoguo Wang, J Robert Michael, Yu Liu, Xiaotu Ma, Aman Patel, Michael N Edmonson, Mark R Wilkinson, Andrew M Frantz, Ti-Cheng Chang, Liqing Tian, Shaohua Lei, S M Ashiqul Islam, Christopher Meyer, Naina Thangaraj, Pamella Tater, Vijay Kandali, Tuan Nguyen, Omar Serang, Irina McGuire, Nedra Robison, Darrell Gentry, Xing Tang, Lance E Palmer, Gang Wu, Ed Suh, Leigh Tanner, James McMurry, Matthew Lear, Alberto S Pappo, Zhaoming Wang, Carmen L Wilson, Yong Cheng, Soheil Meshinchi, Ludmil B Alexandrov, Mitchell J Weiss, Gregory T Armstrong, Leslie L Robison, Yutaka Yasui, Kim E Nichols, David W Ellison, Chaitanya Bangur, Charles G Mullighan, Suzanne J Baker, Michael A Dyer, Geralyn Miller, Scott Newman, Michael C Rusch, Richard Daly, Keith Perry, James R Downing, Jinghui Zhang
Published in: Cancer Discovery (2021)
Effective data sharing is key to accelerating research to improve diagnostic precision, treatment efficacy, and long-term survival in pediatric cancer and other childhood catastrophic diseases. We present St. Jude Cloud (https://www.stjude.cloud), a cloud-based data-sharing ecosystem for accessing, analyzing, and visualizing genomic data from >10,000 pediatric patients with cancer and long-term survivors, and >800 pediatric sickle cell patients. Harmonized genomic data totaling 1.25 petabytes are freely available, including 12,104 whole genomes, 7,697 whole exomes, and 2,202 transcriptomes. The resource is expanding rapidly, with regular data uploads from St. Jude's prospective clinical genomics programs. Three interconnected apps within the ecosystem (Genomics Platform, Pediatric Cancer Knowledgebase, and Visualization Community) enable users to perform advanced data analysis in the cloud while simultaneously enhancing the Pediatric Cancer Knowledgebase. We demonstrate the value of the ecosystem through use cases that classify 135 pediatric cancer subtypes by gene expression profiling and map mutational signatures across 35 pediatric cancer subtypes.
SIGNIFICANCE: To advance research and treatment of pediatric cancer, we developed St. Jude Cloud, a data-sharing ecosystem for accessing >1.2 petabytes of raw genomic data from >10,000 pediatric patients and survivors, innovative analysis workflows, integrative multiomics visualizations, and a knowledgebase of published data contributed by the global pediatric cancer community.
This article is highlighted in the In This Issue feature, p. 995.
Keyphrases
- papillary thyroid
- childhood cancer
- data analysis
- electronic health record
- squamous cell
- big data
- climate change
- healthcare
- machine learning
- copy number
- young adults
- lymph node metastasis
- mental health
- gene expression
- public health
- single cell
- end stage renal disease
- ejection fraction
- high throughput
- risk assessment
- combination therapy
- artificial intelligence
- peritoneal dialysis