The Tox21 10K Compound Library: Collaborative Chemistry Advancing Toxicology.
Ann M Richard, Ruili Huang, Suramya Waidyanatha, Paul Shinn, Bradley J Collins, Inthirany Thillainadarajah, Christopher M Grulke, Antony J Williams, Ryan R Lougee, Richard S Judson, Keith A Houck, Mahmoud Shobair, Chihae Yang, James F Rathman, Adam Yasgar, Suzanne C Fitzpatrick, Anton Simeonov, Russell S Thomas, Kevin M Crofton, Richard S Paules, John R Bucher, Christopher P Austin, Robert J Kavlock, Raymond R Tice
Published in: Chemical Research in Toxicology (2020)
Since 2009, the Tox21 project has screened ∼8500 chemicals in more than 70 high-throughput assays, generating upward of 100 million data points, with all data publicly available through partner websites at the United States Environmental Protection Agency (EPA), National Center for Advancing Translational Sciences (NCATS), and National Toxicology Program (NTP). Underpinning this public effort is the largest compound library ever constructed specifically for improving understanding of the chemical basis of toxicity across research and regulatory domains. Each Tox21 federal partner brought specialized resources and capabilities to the partnership, including three approximately equal-sized compound libraries. All Tox21 data generated to date have resulted from a confluence of ideas, technologies, and expertise used to design, screen, and analyze the Tox21 10K library. The different programmatic objectives of the partners led to three distinct, overlapping compound libraries that, when combined, not only covered a diversity of chemical structures, use-categories, and properties but also incorporated many types of compound replicates. The history of development of the Tox21 "10K" chemical library and data workflows implemented to ensure quality chemical annotations and allow for various reproducibility assessments are described. Cheminformatics profiling demonstrates how the three partner libraries complement one another to expand the reach of each individual library, as reflected in coverage of regulatory lists, predicted toxicity end points, and physicochemical properties. ToxPrint chemotypes (CTs) and enrichment approaches further demonstrate how the combined partner libraries amplify structure-activity patterns that would otherwise not be detected. 
Finally, CT enrichments are used to probe global patterns of activity in combined ToxCast and Tox21 activity data sets relative to test-set size and chemical versus biological end point diversity, illustrating the power of CT approaches to discern patterns in chemical-activity data sets. These results support a central premise of the Tox21 program: A collaborative merging of programmatically distinct compound libraries would yield greater rewards than could be achieved separately.
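
The abstract does not state which statistic underlies the chemotype (CT) enrichment analysis. As an illustration only, one common way to ask whether compounds containing a given chemotype are over-represented among assay actives is a 2x2 contingency table scored with an odds ratio and a one-sided Fisher exact test. The sketch below assumes that approach; the function name `chemotype_enrichment` and the example counts are hypothetical and are not taken from the paper or from the Tox21 data.

```python
from math import comb


def chemotype_enrichment(ct_active, ct_inactive, other_active, other_inactive):
    """Score one chemotype against one assay (illustrative sketch only).

    Arguments are the four cells of a 2x2 table:
      ct_active      compounds with the chemotype that are active
      ct_inactive    compounds with the chemotype that are inactive
      other_active   compounds without the chemotype that are active
      other_inactive compounds without the chemotype that are inactive

    Returns (odds_ratio, p_value), where p_value is the one-sided Fisher
    exact probability of seeing at least `ct_active` actives among the
    chemotype-containing compounds by chance.
    """
    a, b, c, d = ct_active, ct_inactive, other_active, other_inactive
    n = a + b + c + d
    with_ct = a + b        # compounds containing the chemotype
    actives = a + c        # active compounds overall

    # Odds ratio; reported as infinity when the denominator is zero.
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")

    # Upper tail of the hypergeometric distribution: P(X >= a).
    total = comb(n, with_ct)
    k_max = min(with_ct, actives)
    tail = sum(
        comb(actives, k) * comb(n - actives, with_ct - k)
        for k in range(a, k_max + 1)
    )
    return odds_ratio, tail / total


# Hypothetical counts: 10 compounds carry the chemotype, 8 of them active;
# 90 compounds lack it, 12 of them active.
odds, p = chemotype_enrichment(8, 2, 12, 78)
print(f"odds ratio = {odds:.1f}, one-sided p = {p:.2e}")
```

Under this assumed scoring, a larger combined library raises the counts in each cell, which is one way to see why a merged library could reveal enrichments that a single partner library is too small to detect.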