Exploring the Nonconserved Sequence Space of Synthetic Expression Modules in Bacillus subtilis.
Christopher Sauer, Emiel Ver Loren van Themaat, Leonie G M Boender, Daphne Groothuis, Rita Cruz, Leendert W Hamoen, Colin R Harwood, Tjeerd van Rij
Published in: ACS Synthetic Biology (2018)
Increasing protein expression levels is a key step in the commercial production of enzymes. Attempts to predict promoter activity and translation initiation efficiency based solely on consensus sequences have so far met with mixed results. Here, we addressed this challenge with a "brute-force" approach, designing and synthesizing a large combinatorial library of ~12,000 unique synthetic expression modules (SEMs) for Bacillus subtilis. Using GFP fluorescence as a reporter of gene expression, we obtained a dynamic expression range spanning five orders of magnitude, as well as a maximal 13-fold increase in expression compared with the already strong veg expression module. Analyses of the synthetic modules indicated that sequences at the 5'-end of the mRNA contributed most to the differences in expression levels, presumably by preventing the formation of strong mRNA secondary structures that affect translation initiation. When the gfp coding region was replaced by the coding region of the xynA gene, which encodes the industrially relevant B. subtilis xylanase enzyme, only a 3-fold improvement in xylanase production was observed. Moreover, the correlation between GFP and xylanase expression levels was weak. This suggests that the differences in expression levels between the gfp and xynA constructs were due to differences in 5'-end mRNA folding and the resulting differences in the rates of translation initiation. Our data show that the use of large SEM libraries, in combination with high-throughput technologies, is a powerful approach to improving the production of a specific protein, but that the outcome cannot necessarily be extrapolated to other proteins.