Encoding Genetic Circuits with DNA Barcodes Paves the Way for Machine Learning-Assisted Metabolite Biosensor Response Curve Profiling in Yeast.
Yikang Zhou, Yaomeng Yuan, Yinan Wu, Lu Li, Aysha Jameel, Xin-Hui Xing, Chong Zhang
Published in: ACS Synthetic Biology (2022)
Genetically encoded biosensors are valuable tools for the precise engineering of metabolism. Although a large number of biosensors have been developed, fine-tuning their dose-response curves, which broadens the scenarios in which biosensors can be applied, remains challenging. To address this issue, we leverage a DNA trackable assembly method and fluorescence-activated cell sorting coupled with next-generation sequencing (FACS-seq) to set up a novel workflow for the construction and comprehensive characterization of thousands of biosensors in a massively parallel manner. A FapR-fapO-based malonyl-CoA biosensor was used as a proof of concept to construct a trackable combinatorial library containing 5184 combinations, built from 6 levels of transcription factor dosage, 4 different operator positions, and 216 possible upstream enhancer sequence (UAS) designs. By applying the FACS-seq technique, the response curves of 2632 of the 5184 combinations were successfully characterized, providing large-scale genotype-phenotype association data for the designed biosensors. Finally, machine-learning algorithms were applied to predict the genotype-phenotype relationships of the uncharacterized combinations and generate a panoramic scanning map of the combinatorial space. With the assistance of this workflow, the malonyl-CoA biosensor with the largest dynamic response range was successfully obtained. Moreover, feature importance analysis revealed that the recognition sequence insertion scheme and the choice of UAS have a significant impact on the dynamic range. Taken together, our pipeline provides a platform for the design, tuning, and profiling of biosensor response curves and shows great potential to facilitate the rational design of genetic circuits.
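The library size quoted in the abstract follows directly from the three design dimensions (6 × 4 × 216 = 5184). The short Python sketch below is only an illustration of that arithmetic and of the reported characterization coverage; it is not the authors' code. The integer indices standing in for dosage levels, operator positions, and UAS designs are placeholders, since the abstract does not name the individual parts.

```python
from itertools import product

# Counts taken from the abstract. The integer labels generated below are
# hypothetical placeholders; the actual parts are not listed in the abstract.
TF_DOSAGE_LEVELS = 6
OPERATOR_POSITIONS = 4
UAS_DESIGNS = 216
CHARACTERIZED = 2632  # response curves reported as characterized by FACS-seq


def enumerate_library():
    """Return every (tf_dosage, operator_position, uas_design) index triple."""
    return list(
        product(
            range(TF_DOSAGE_LEVELS),
            range(OPERATOR_POSITIONS),
            range(UAS_DESIGNS),
        )
    )


library = enumerate_library()
library_size = len(library)                    # full combinatorial space
uncharacterized = library_size - CHARACTERIZED  # left for model prediction
coverage = CHARACTERIZED / library_size         # fraction measured directly

print(f"library size:    {library_size}")
print(f"uncharacterized: {uncharacterized}")
print(f"coverage:        {coverage:.1%}")
```

Under these counts, roughly half of the design space was measured directly, and the remainder is the portion the abstract describes as being filled in by machine-learning prediction.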
Keyphrases
- machine learning
- label free
- single cell
- transcription factor
- gold nanoparticles
- genome wide
- sensitive detection
- quantum dots
- big data
- electronic health record
- climate change
- single molecule
- artificial intelligence
- stem cells
- high throughput
- circulating tumor
- high resolution
- dna methylation
- gene expression
- mesenchymal stem cells