Automated Microbial Library Generation Using the Bioinformatics Platform IDBac.
Chase M Clark, Linh Nguyen, Van Cuong Pham, Laura M Sanchez, Brian T Murphy
Published in: Molecules (Basel, Switzerland) (2022)
Libraries of microorganisms have served as a cornerstone of therapeutic drug discovery, though the continued re-isolation of known natural product chemical entities has remained a significant obstacle to discovery efforts. A major contributing factor to this redundancy is the duplication of bacterial taxa in a library, which can be mitigated through DNA sequencing strategies and/or mass spectrometry-informed bioinformatics platforms, so that the library is created with minimal phylogenetic overlap and thus minimal natural product overlap. IDBac is a MALDI-TOF mass spectrometry-based bioinformatics platform used to assess overlap within collections of environmental bacterial isolates. It allows environmental isolate redundancy to be reduced while considering both phylogeny and natural product production. However, manually selecting isolates for addition to a library during this process was time-intensive and left to the researcher's discretion. Here, we developed an algorithm that automates the prioritization of hundreds to thousands of environmental microorganisms in IDBac. The algorithm iteratively reduces natural product mass feature overlap within groups of isolates that share high homology of protein mass features. Employing this automation serves to minimize human bias and greatly increase efficiency in the microbial strain prioritization process.
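The iterative reduction described above can be illustrated with a minimal sketch. This is not the IDBac implementation: it assumes that isolates have already been grouped by protein-spectrum similarity, that each isolate's natural product mass features are available as a set of binned m/z values, and that a greedy "largest gain first" rule is an acceptable stand-in for the published selection procedure. The names `select_isolates`, `groups`, `features`, and `coverage` are hypothetical.

```python
from typing import Dict, List, Set


def select_isolates(
    groups: Dict[str, List[str]],
    features: Dict[str, Set[int]],
    coverage: float = 1.0,
) -> List[str]:
    """Greedily pick isolates per group until the requested fraction of the
    group's distinct mass features is represented (illustrative only)."""
    selected: List[str] = []
    for group_id in sorted(groups):
        members = sorted(groups[group_id])
        if not members:
            continue
        # All distinct natural product mass features seen in this group.
        target: Set[int] = set().union(*(features[m] for m in members))
        needed = coverage * len(target)
        covered: Set[int] = set()
        remaining = list(members)
        picked: List[str] = []
        while remaining and len(covered) < needed:
            # Choose the isolate contributing the most not-yet-covered features.
            best = max(remaining, key=lambda m: len(features[m] - covered))
            gain = features[best] - covered
            if not gain:
                break  # every remaining isolate is fully redundant
            covered |= gain
            picked.append(best)
            remaining.remove(best)
        if not picked:
            # Assumption: keep one representative even if no features were detected.
            picked.append(members[0])
        selected.extend(picked)
    return selected


groups = {"g1": ["A", "B", "C"], "g2": ["D"]}
features = {"A": {1, 2, 3}, "B": {2, 3}, "C": {3, 4}, "D": {10}}
print(select_isolates(groups, features))        # ['A', 'C', 'D']
print(select_isolates(groups, features, 0.75))  # ['A', 'D']
```

In this toy input, isolate B is dropped because its features are a subset of A's, while C is kept only when full coverage is requested, since it contributes a single additional feature. Lowering `coverage` trades feature completeness for a smaller library.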
Keyphrases
- mass spectrometry
- machine learning
- deep learning
- drug discovery
- high throughput
- liquid chromatography
- gas chromatography
- microbial community
- human health
- high performance liquid chromatography
- genetic diversity
- capillary electrophoresis
- high resolution
- endothelial cells
- life cycle
- single cell
- small molecule
- magnetic resonance
- neural network
- risk assessment
- circulating tumor
- magnetic resonance imaging
- single molecule
- climate change
- ms ms
- amino acid
- quality improvement
- image quality
- nucleic acid