CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation.
Tariq Daouda, Maude Dumont-Lagacé, Albert Feghaly, Yahya Benslimane, Rébecca Panes, Mathieu Courcelles, Mohamed Benhammadi, Lea Harrington, Pierre Thibault, François Major, Yoshua Bengio, Etienne Gagnon, Sébastien Lemieux, Claude Perreault
Published in: PLoS computational biology (2021)
MHC-I associated peptides (MAPs) play a central role in the elimination of virus-infected and neoplastic cells by CD8 T cells. However, accurately predicting the MAP repertoire remains difficult, because only a fraction of the transcriptome generates MAPs. In this study, we investigated whether codon arrangement (usage and placement) regulates MAP biogenesis. We developed an artificial neural network called Codon Arrangement MAP Predictor (CAMAP), which predicts MAP presentation solely from the mRNA sequences flanking the MAP-coding codons (MCCs), while excluding the MCCs themselves. CAMAP predictions were significantly more accurate when using original codon sequences than when using shuffled codon sequences, which reflect only amino acid usage. Furthermore, predictions were independent of mRNA expression and of MAP binding affinity to MHC-I molecules, and they applied to several cell types and species. Combining MAP ligand scores, transcript expression level and CAMAP scores was particularly useful for increasing MAP prediction accuracy. Using an in vitro assay, we showed that varying the synonymous codons in the regions flanking the MCCs (without changing the amino acid sequence) resulted in significant modulation of MAP presentation at the cell surface. Taken together, our results demonstrate the role of codon arrangement in the regulation of MAP presentation and support the integration of both translational and post-translational events in predictive algorithms to improve modeling of the immunopeptidome.
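The two sequence manipulations described in the abstract (taking the codons that flank the MCCs while leaving out the MCCs, and varying synonymous codons without changing the amino acid sequence) can be illustrated with a minimal Python sketch. This is not the CAMAP code: the function names, the example sequence, the flank size, and the uniform sampling of synonymous codons are all assumptions made for illustration, and the abstract does not state the window size or the shuffling scheme the authors used.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal) they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def split_codons(seq):
    """Split an in-frame DNA coding sequence into a list of codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(seq))


def synonymous_variant(seq, rng):
    """Replace every codon with a randomly chosen synonymous codon.

    Assumption: synonyms are drawn uniformly; the original study may have
    used a different sampling scheme.
    """
    return "".join(
        rng.choice(SYNONYMS[CODON_TABLE[codon]]) for codon in split_codons(seq)
    )


def flanking_context(cds, mcc_start, mcc_len, flank):
    """Return the codons before and after the MCCs, excluding the MCCs.

    mcc_start and mcc_len are counted in codons; flank is a hypothetical
    window size, also in codons.
    """
    codons = split_codons(cds)
    left = codons[max(0, mcc_start - flank):mcc_start]
    right = codons[mcc_start + mcc_len:mcc_start + mcc_len + flank]
    return left, right


# Hypothetical toy sequence: 8 codons, with the "MCCs" at codons 3 and 4.
example_cds = "ATGGCTAAAGGGTTTCCCGATTGA"
variant = synonymous_variant(example_cds, random.Random(0))
left_flank, right_flank = flanking_context(example_cds, mcc_start=3, mcc_len=2, flank=2)
```

Here `variant` encodes the same protein as `example_cds` but may use different codons, which mirrors the idea behind both the shuffled-sequence comparison and the in vitro assay: any change in a prediction or in presentation between the two must come from codon choice rather than from the amino acid sequence.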
Keyphrases
- neural network
- high density
- amino acid
- stem cells
- high resolution
- case report
- machine learning
- cell surface
- poor prognosis
- mesenchymal stem cells
- oxidative stress
- binding protein
- high throughput
- cell proliferation
- endoplasmic reticulum stress
- cell cycle arrest
- bone marrow
- cell therapy
- genetic diversity
- long non coding rna