Human-likeness of antibody biologics determined by back-translation and comparison with large antibody variable gene repertoires.
Samuel Schmitz, Cinque Soto, James E Crowe, Jens Meiler
Published in: mAbs (2021)
Rearrangement of the antibody (Ab) germline variable (V), diversity (D), and joining (J) gene segments, together with somatic hypermutation, gives rise to the human Ab variable gene sequence repertoire. It is common to characterize single nucleotide frequencies of the variable region by alignment to species-specific wildtype germline genes. The increasing application of next-generation sequencing to immune repertoire studies has led to the compilation of increasingly large adaptive immunome receptor repertoire datasets. We have developed a method that maps the sequence of a target Ab onto an immunome dataset of 326 million human Ab sequences. For this purpose, we created a position- and gene-specific scoring matrix (PGSSM) and its corresponding antibody similarity score. We characterized our PGSSM score and found that it strongly correlated with the phylogenetic distance of 181,355 Ab sequences from GenBank across 20 species. Given only the PGSSMs and the amino acid sequence of an Ab, we obtained the most likely human nucleotide back-translation, achieving a nucleotide sequence recovery of 95.9% and 97.2% for human heavy and light chains, respectively. In conclusion, the score of our back-translation is a valuable estimate of the similarity of an Ab sequence to the natural human repertoire. As expected, Ab therapeutic molecules developed from a human source showed a higher similarity to the repertoire than engineered Abs. Thus, the PGSSM metric introduced here can be used to engineer human-like Ab therapeutics.
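The idea of back-translating through a position-specific matrix can be illustrated with a minimal Python sketch. It is not the published PGSSM implementation. It assumes a tiny, hypothetical set of in-frame, equal-length nucleotide sequences standing in for repertoire reads assigned to one germline gene, counts codons at each position, and picks the most frequent synonymous codon for each residue. The function names, the Laplace smoothing constant `alpha`, and the mean log-probability score are all assumptions made for illustration.

```python
import math
from collections import Counter

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def build_codon_matrix(aligned_nt_seqs):
    """Count codons at each position of in-frame, equal-length sequences."""
    length = len(aligned_nt_seqs[0])
    if length % 3 or any(len(s) != length for s in aligned_nt_seqs):
        raise ValueError("sequences must be in frame and of equal length")
    return [
        Counter(s[i:i + 3] for s in aligned_nt_seqs)
        for i in range(0, length, 3)
    ]


def back_translate(protein, matrix, alpha=1.0):
    """Return (nucleotide sequence, mean log-probability) for a protein.

    For each residue, the most frequent synonymous codon at that position
    is chosen; alpha is a Laplace pseudocount (an assumption of this sketch).
    """
    if len(protein) != len(matrix):
        raise ValueError("protein and matrix must have the same length")
    codons, log_prob = [], 0.0
    for aa, counts in zip(protein, matrix):
        synonymous = sorted(c for c, a in CODON_TO_AA.items() if a == aa)
        if not synonymous:
            raise ValueError(f"unknown amino acid: {aa!r}")
        best = max(synonymous, key=lambda c: counts[c])
        total = sum(counts.values())
        log_prob += math.log((counts[best] + alpha) / (total + 64 * alpha))
        codons.append(best)
    return "".join(codons), log_prob / len(protein)


# Hypothetical toy "repertoire" for a three-codon region (not real data).
TOY_REPERTOIRE = ["GAGGTGCAG", "GAGGTGCAG", "GAAGTGCAG", "GAGGTTCAA"]
toy_matrix = build_codon_matrix(TOY_REPERTOIRE)
nt_seq, score = back_translate("EVQ", toy_matrix)
print(nt_seq, round(score, 3))
```

In this toy setting, a residue sequence that matches codons seen often in the repertoire gets a higher mean log-probability than one whose residues were never observed at those positions, which is the intuition behind using a back-translation score as a similarity estimate.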