Optimizing recombinant antibody fragment production: A comparison of artificial intelligence and statistical modeling.

Majid Basafa, Atieh Hashemi, Aidin Behravan
Published in: Biotechnology and applied biochemistry (2024)
Maximizing recombinant protein yield requires optimizing the production medium. This can be done with a variety of methods, including the conventional "one-factor-at-a-time" approach and more recent statistical and mathematical methods such as artificial neural networks (ANN) and genetic algorithms. Each approach has its own advantages and disadvantages, and a technique with known flaws may still be used when it gives the best results. Here, one categorical variable and four numerical parameters (post-induction time, inducer concentration, post-induction temperature, and pre-induction cell density) were optimized using the 232 experimental assays of the central composite design. The direct and indirect effects of the factors on the yield of anti-epithelial cell adhesion molecule extracellular domain fragment antibody were examined using statistical methods. The analysis of variance results indicate that the response surface methodology (RSM) model is effective in predicting the amount of produced single-chain fragment variable (p-value = 0.0001 and R² = 0.905). For ANN modeling, evaluation using the normalized root mean square error (NRMSE) and R² shows a good fit (R² = 0.942) and accurate predictions (NRMSE = 0.145). On a dataset of 30 data points randomly selected from the complete dataset, the ANN model had a higher R² (0.968) than the RSM model (0.932) and a lower NRMSE (0.048 vs. 0.064), indicating stronger predictive ability. Induction at a cell density of 0.7 with an isopropyl β-D-1-thiogalactopyranoside concentration of 0.6 mM for 32 h at 30°C in BW25113 was the ideal culture condition, leading to a protein yield of 259.51 mg/L.
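The two fit metrics used to compare the models can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say how the RMSE was normalized, so dividing by the range of the observed values is an assumption (normalizing by the mean is another common choice), and the function names and the sample numbers are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nrmse(observed, predicted):
    """RMSE divided by the observed range (assumed normalization)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (max(observed) - min(observed))


# Made-up yields in mg/L, for illustration only (not data from the study).
observed = [120.0, 180.0, 210.0, 250.0]
predicted = [125.0, 176.0, 214.0, 245.0]

print(f"R2 = {r_squared(observed, predicted):.3f}")
print(f"NRMSE = {nrmse(observed, predicted):.3f}")
```

A higher R² together with a lower NRMSE on the same data points is the pattern the abstract reports for the ANN model relative to the RSM model.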
Under the optimum conditions, the value predicted by the ANN model (259.83 mg/L) was closer to the experimental yield (259.51 mg/L) than the value predicted by the RSM model (276.13 mg/L). This outcome demonstrates that the ANN model outperforms the RSM model in prediction accuracy.
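The gap between the two predictions at the optimum can be made concrete with a relative-error calculation. The three yields below are taken from the abstract; the helper name `relative_error_percent` is hypothetical, and relative error is used here only as one simple way to express the comparison, not necessarily the measure the authors applied.

```python
def relative_error_percent(predicted, observed):
    """Absolute deviation of a prediction from the observed value, in percent."""
    return abs(predicted - observed) / observed * 100.0


experimental_yield = 259.51  # mg/L, measured under the optimum conditions
ann_prediction = 259.83      # mg/L, ANN model
rsm_prediction = 276.13      # mg/L, RSM model

ann_error = relative_error_percent(ann_prediction, experimental_yield)
rsm_error = relative_error_percent(rsm_prediction, experimental_yield)

print(f"ANN relative error: {ann_error:.2f}%")  # 0.12%
print(f"RSM relative error: {rsm_error:.2f}%")  # 6.40%
```

On these figures the ANN prediction is off by about 0.1% and the RSM prediction by about 6.4%.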
Keyphrases
  • neural network
  • artificial intelligence
  • machine learning
  • big data
  • single cell
  • stem cells
  • high throughput
  • binding protein
  • small molecule
  • data analysis