TCR clustering by contrastive learning on antigen specificity.
Margarita PertsevaOcéane M FollonierDaniele ScarcellaSai T ReddyPublished in: Briefings in bioinformatics (2024)
Effective clustering of T-cell receptor (TCR) sequences could be used to predict their antigen-specificities. TCRs with highly dissimilar sequences can bind to the same antigen, thus making their clustering into a common antigen group a central challenge. Here, we develop TouCAN, a method that relies on contrastive learning and pretrained protein language models to perform TCR sequence clustering and antigen-specificity predictions. Following training, TouCAN demonstrates the ability to cluster highly dissimilar TCRs into common antigen groups. Additionally, TouCAN demonstrates TCR clustering performance and antigen-specificity predictions comparable to other leading methods in the field.