Polymer design via SHAP and Bayesian machine learning optimizes pDNA and CRISPR ribonucleoprotein delivery.
Rishad J Dalal, Felipe Oviedo, Michael C Leyden, Theresa M Reineke
Published in: Chemical Science (2024)
We present the facile synthesis of a clickable polymer library with systematic variations in length, binary composition, pKa, and hydrophobicity (clogP) to optimize intracellular pDNA and CRISPR-Cas9 ribonucleoprotein (RNP) performance. We couple physicochemical characterization and machine learning to interpret quantitative structure-property relationships within the combinatorial design space. For the first time, we reveal unexpected disparate design parameters for nucleic acid carriers; via explainable machine learning on 432 formulations, we discover that lower polymer pKa and higher percentages of benzimidazole ethanethiol enhance pDNA delivery, yet polymer length and captamine cation identity improve RNP delivery. Closed-loop Bayesian optimization of 552 formulation ratios further enhances in vitro performance. The top three polymers yield a higher signal and stable transgene expression over 20 days in vivo, and a 1.7-fold enhancement over controls. Our facile coupling of synthesis, characterization, and machine analysis provides powerful tools to quantitate performance parameters accelerating next-generation vehicles for nucleic acid medicines.
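
As a rough illustration of the attribution idea behind SHAP, the sketch below computes exact Shapley values for a single formulation against a single baseline. It is not the authors' pipeline: the function `toy_model`, its coefficients, the feature names, and the example inputs are all hypothetical and chosen only to make the arithmetic easy to follow. Practical SHAP implementations typically approximate these values and average over a background dataset rather than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for input x relative to a single baseline input."""
    names = list(x)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):
            # Shapley weight for a coalition of size k: k! (n - k - 1)! / n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Coalition features take their value from x; the rest stay at baseline.
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (model(with_feature) - model(without))
        values[name] = total
    return values


def toy_model(f):
    """Hypothetical delivery score; coefficients are invented for illustration."""
    return (
        2.0 * f["benzimidazole_fraction"]
        - 0.5 * f["pKa"]
        + 0.01 * f["length"] * f["benzimidazole_fraction"]
    )


# Hypothetical reference formulation and formulation of interest.
baseline = {"length": 50.0, "pKa": 8.0, "benzimidazole_fraction": 0.0}
x = {"length": 100.0, "pKa": 7.0, "benzimidazole_fraction": 0.5}

phi = shapley_values(toy_model, x, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

The contributions sum to the difference between the model output at `x` and at `baseline`, which is the property that makes Shapley-based attributions convenient for ranking design parameters. Exhaustive enumeration scales exponentially with the number of features, so it is only practical for small descriptor sets like this one.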