Optimal trade-off control in machine learning-based library design, with application to adeno-associated virus (AAV) for gene therapy.
Danqing Zhu, David H Brookes, Akosua Busia, Ana Carneiro, Clara Fannjiang, Galina Popova, David Shin, Kevin C Donohue, Li F Lin, Zachary M Miller, Evan R Williams, Edward F Chang, Tomasz J Nowakowski, Jennifer Listgarten, David V Schaffer
Published in: Science Advances (2024)
Adeno-associated viruses (AAVs) hold tremendous promise as delivery vectors for gene therapies. AAVs have been successfully engineered (for instance, for more efficient and/or cell-specific delivery to numerous tissues) by creating large, diverse starting libraries and selecting for desired properties. However, these starting libraries often contain a high proportion of variants that cannot assemble or package their genomes, a prerequisite for any gene delivery goal. Here, we present and showcase a machine learning (ML) method for designing AAV peptide insertion libraries that achieve fivefold higher packaging fitness than the standard NNK library with negligible reduction in diversity. To demonstrate the utility of our ML-designed library for downstream engineering goals, we show that it yields approximately 10-fold more successful variants than the NNK library after selection for infection of human brain tissue, leading to a promising glial-specific variant. Moreover, our design approach can be applied to other types of libraries for AAV and beyond.
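For readers unfamiliar with the NNK baseline, the following is a minimal sketch of what that library looks like at a single randomized position. It is not the authors' method. It assumes only the standard genetic code and the IUPAC convention that N is any base and K is G or T; the function names and the use of Shannon entropy as a stand-in for "diversity" are illustrative choices, not taken from the paper.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (last base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nnk_codons():
    """Enumerate NNK codons: N = any base, K = G or T (IUPAC codes)."""
    return [a + b + k for a in BASES for b in BASES for k in "GT"]


def residue_distribution(codons):
    """Per-position residue frequencies when all codons are equally likely.

    '*' denotes a stop codon.
    """
    counts = Counter(CODON_TABLE[c] for c in codons)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def shannon_entropy(dist):
    """Shannon entropy in nats; used here as an illustrative diversity proxy."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


codons = nnk_codons()
nnk_dist = residue_distribution(codons)
nnk_entropy = shannon_entropy(nnk_dist)

print(f"NNK codons: {len(codons)}")
print(f"Distinct symbols (amino acids plus stop): {len(nnk_dist)}")
print(f"Stop-codon frequency per position: {nnk_dist['*']:.4f}")
print(f"Per-position entropy (nats): {nnk_entropy:.3f}")
print(f"Effective number of symbols: {math.exp(nnk_entropy):.2f}")
```

A designed library would replace the uniform codon weights with position-specific ones. Comparing its entropy against this baseline is one simple way to quantify how much diversity is given up for a gain in predicted fitness.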