Unconstrained generation of synthetic antibody-antigen structures to guide machine learning methodology for antibody specificity prediction.
Philippe A Robert, Rahmad Akbar, Robert Frank, Milena Pavlović, Michael Widrich, Igor Snapkov, Andrei Slabodkin, Maria Chernigovskaya, Lonneke Scheffer, Eva Smorodina, Puneet Rawat, Brij Bhushan Mehta, Mai Ha Vu, Ingvild Frøberg Mathisen, Aurél Prósz, Krzysztof Abram, Alex Olar, Enkelejda Miho, Dag Trygve Tryslew Haug, Fridtjof Lund-Johansen, Sepp Hochreiter, Ingrid Hobæk Haff, Günter Klambauer, Geir Kjetil F Sandve, Victor Greiff
Published in: Nature computational science (2022)
Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.
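The claim that accuracy-based rankings of ML methods carry over from experimental to synthetic data can be illustrated with a rank-agreement check. The following is a minimal sketch, not the authors' evaluation procedure: the method names and accuracy values are hypothetical placeholders (not results from the paper), and Spearman rank correlation is used here only as one plausible way to compare two rankings.

```python
def ranks(scores):
    """Map each method to its rank by descending score (1 = best).

    Assumes no tied scores; ties would need average ranks.
    """
    order = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(order)}


def spearman(scores_a, scores_b):
    """Spearman rank correlation between two score dicts with the same keys."""
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    n = len(rank_a)
    squared_diff = sum((rank_a[m] - rank_b[m]) ** 2 for m in rank_a)
    return 1 - 6 * squared_diff / (n * (n ** 2 - 1))


# Hypothetical accuracies, for illustration only (not taken from the paper).
experimental = {"method_a": 0.71, "method_b": 0.78, "method_c": 0.84, "method_d": 0.86}
synthetic = {"method_a": 0.80, "method_b": 0.85, "method_c": 0.93, "method_d": 0.95}

rho = spearman(experimental, synthetic)
print(f"Spearman rank correlation: {rho:.2f}")
```

A correlation near 1 would indicate that the methods are ordered the same way on both datasets, even when the absolute accuracies differ, which is the property the abstract describes.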