High-throughput functional annotation of natural products by integrated activity profiling.
Suzie K Hight, Trevor N Clark, Kenji L Kurita, Elizabeth A McMillan, Walter Bray, Anam F Shaikh, Aswad S Khadilkar, F P Jake Haeckl, Fausto Carnevale-Neto, Scott La, Akshar Lohith, Rachel M Vaden, Jeon Lee, Shuguang Wei, R Scott Lokey, Michael A White, Roger G Linington, John B MacMillan
Published in: Proceedings of the National Academy of Sciences of the United States of America (2022)
Determining mechanism of action (MOA) is one of the biggest challenges in natural products discovery. Here, we report a comprehensive platform that uses Similarity Network Fusion (SNF) to improve MOA predictions by integrating data from the cytological profiling high-content imaging platform and the gene expression platform Functional Signature Ontology, and pairs these data with untargeted metabolomics analysis for de novo bioactive compound discovery. The predictive value of the integrative approach was assessed using a library of target-annotated small molecules as benchmarks. Using Kolmogorov-Smirnov (KS) tests to compare in-class to out-of-class similarity, we found that SNF retains the ability to identify significant in-class similarity across a diverse set of target classes, and could find target classes not detectable in either platform alone. This confirmed that integration of expression-based and image-based phenotypes can accurately report on MOA. Furthermore, we integrated untargeted metabolomics of complex natural product fractions with the SNF network to map biological signatures to specific metabolites. Three examples are presented where SNF coupled with metabolomics was used to directly functionally characterize natural products and accelerate identification of bioactive metabolites, including the discovery of the azoxy-containing biaryl compounds parkamycins A and B. Our results support SNF integration of multiple phenotypic screening approaches along with untargeted metabolomics as a powerful approach for advancing natural products drug discovery.
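The in-class versus out-of-class comparison described in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the similarity matrix, the class labels, and the function names `split_similarities` and `ks_statistic` are hypothetical, and only the two-sample KS statistic (the largest gap between the two empirical distribution functions) is computed, without a p-value. The assumed input is a fused compound-by-compound similarity matrix together with one target-class label per compound.

```python
from itertools import combinations


def split_similarities(sim, labels, target_class):
    """Split pairwise similarities into in-class and out-of-class sets.

    in-class:     both compounds carry target_class
    out-of-class: exactly one compound carries target_class
    """
    in_class, out_class = [], []
    for i, j in combinations(range(len(labels)), 2):
        a = labels[i] == target_class
        b = labels[j] == target_class
        if a and b:
            in_class.append(sim[i][j])
        elif a or b:
            out_class.append(sim[i][j])
    return in_class, out_class


def ks_statistic(xs, ys):
    """Two-sample KS statistic: max absolute gap between empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        x = min(xs[i], ys[j])
        # Advance both samples past every value <= x, then compare the CDFs.
        while i < len(xs) and xs[i] <= x:
            i += 1
        while j < len(ys) and ys[j] <= x:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d


# Toy data (invented for illustration): four compounds, two target classes.
labels = ["kinase", "kinase", "tubulin", "tubulin"]
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]

in_class, out_class = split_similarities(sim, labels, "kinase")
print("KS statistic for 'kinase':", ks_statistic(in_class, out_class))
```

A statistic near 1 means the in-class similarities are almost completely separated from the out-of-class ones, while a value near 0 means the two distributions are indistinguishable. A real analysis would also need a significance estimate, for example from `scipy.stats.ks_2samp`, and far more compounds per class than this toy example.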
Keyphrases
- high throughput
- mass spectrometry
- single cell
- liquid chromatography
- gene expression
- drug discovery
- high resolution
- rna seq
- small molecule
- ms ms
- gas chromatography
- gas chromatography mass spectrometry
- electronic health record
- poor prognosis
- big data
- dna methylation
- binding protein
- machine learning
- long non coding rna
- simultaneous determination