Identifying the Last Universal Common Ancestor's protein domains resolves the order in which the amino acids were recruited into the genetic code.
Sawsan Wehbi, Andrew L Wheeler, Benoit Morel, Bui Quang Minh, Dante S Lauretta, Joanna Masel. Published in: bioRxiv: the preprint server for biology (2024)
We identified protein domains that emerged early in the history of life. Protein domains whose ancestors date back to a single homolog in the Last Universal Common Ancestor (LUCA) remain depleted for amino acids believed to have been added late to the genetic code. Notable exceptions call for revisions to our understanding of the order of amino acid recruitment into the genetic code. Enrichment in ancient proteins shows that metal-binding amino acids (cysteine and histidine) and sulfur-containing amino acids (cysteine and methionine) were added much earlier than previously thought. Sequences that had already diversified into multiple distinct copies in LUCA will tend to be even more ancient, and we therefore expected them to be more enriched for early amino acids and more depleted for late ones. Surprisingly, these more ancient sequences showed a different pattern: they were significantly less depleted for tryptophan and tyrosine, and enriched rather than depleted for phenylalanine. This is compatible with at least some of these sequences predating the current genetic code. Their distinct enrichment patterns thus provide hints about earlier, alternative genetic codes.
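To make "enriched" and "depleted" concrete, here is a minimal Python sketch of one simple way to quantify them: the ratio of each amino acid's frequency in a set of ancient sequences to its frequency in a reference set. This is an illustration only and is not claimed to be the authors' method. The function names, the pseudocount, and the toy sequences are all hypothetical.

```python
from collections import Counter

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(sequences):
    """Return the relative frequency of each standard amino acid across sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore gaps, ambiguity codes, and other non-standard characters.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def enrichment(ancient, reference, pseudocount=1e-6):
    """Frequency ratio, ancient over reference, per amino acid.

    Values above 1 indicate enrichment in the ancient set; values below 1
    indicate depletion. The pseudocount (an assumption of this sketch)
    avoids division by zero when an amino acid is absent from the reference.
    """
    f_old = aa_frequencies(ancient)
    f_ref = aa_frequencies(reference)
    return {
        aa: (f_old[aa] + pseudocount) / (f_ref[aa] + pseudocount)
        for aa in AMINO_ACIDS
    }


# Toy, made-up sequences purely for illustration (not real data).
toy_ancient = ["GAVDGAVC"]
toy_reference = ["GAVDWYFK"]
ratios = enrichment(toy_ancient, toy_reference)
```

A real analysis would also need a statistical test and a correction for confounders such as sequence length and shared ancestry, which this sketch omits.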