Analysis of Global Collection of Group A Streptococcus Genomes Reveals that the Majority Encode a Trio of M and M-Like Proteins.
Hannah R Frost, Mark R Davies, Valérie Delforge, Dalila Lakhloufi, Martina Sanderson-Smith, Velusamy Srinivasan, Andrew C Steer, Mark J Walker, Bernard Beall, Anne Botteaux, Pierre R Smeesters
Published in: mSphere (2020)
The core Mga (multiple gene activator) regulon of group A Streptococcus (GAS) contains genes encoding proteins involved in adhesion and immune evasion. While all GAS genomes contain genes for Mga and C5a peptidase, the intervening genes encoding M and M-like proteins vary between strains. The genetic make-up of the Mga regulon of GAS was characterized by utilizing a collection of 1,688 GAS genomes that are representative of the global GAS population. Sequence variations were examined with multiple alignments, and the expression of all core Mga regulon genes was examined by quantitative reverse transcription-PCR in a representative strain collection. In 85.2% of the sampled genomes, the Mga locus contained genes encoding Mga, Mrp, M, Enn, and C5a peptidase proteins. These isolates account for 53% of global infections. Only 9.1% of genomes did not contain either an mrp or an enn gene. The pairwise identity within Enn (68.6%) and Mrp (83.2%) protein sequences was higher than within M proteins (44.7%). Gene expression varied between strains tested, but high expression was recorded for all genes in at least one strain. Previous nomenclature issues were clarified with molecular gene definitions. Our findings support a shift in focus in the GAS research field to further consider the role of Mrp and Enn in virulence and vaccine development.

IMPORTANCE While the GAS M protein has been the leading vaccine target for decades, the bacteria encode many other virulence factors of interest for vaccine development. In this work, we show that emm-like genes are encoded in a remarkable majority of GAS genomes and expressed at a level similar to that for the emm gene. In collaboration with the U.S. Centers for Disease Control, we developed molecular definitions of the different emm and emm-like gene families. This clarification should abrogate mistyping of strains, especially in the area of whole-genome typing. We have also updated the emm-typing collection by removing emm-like gene sequences and provided in-depth analysis of Mrp and Enn protein sequence structure and diversity.
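The pairwise identity figures quoted above (Enn, Mrp, M) summarize how similar the sequences within each protein family are to one another. As a rough illustration only, the sketch below averages percent identity over all pairs of sequences taken from a multiple alignment. The function names, the gap handling (columns where both sequences have a gap are skipped, a gap against a residue counts as a mismatch), and the toy sequences are assumptions made for this example; the abstract does not state how the authors computed their values.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: columns where both rows hold a gap ('-') are ignored,
    and a gap facing a residue is counted as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(aligned: list[str]) -> float:
    """Average percent identity over every unordered pair of rows."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy aligned fragments, invented for illustration (not real GAS sequences).
toy_alignment = ["MKT-A", "MKTLA", "MRT-A"]
mean_identity = mean_pairwise_identity(toy_alignment)
print(f"mean pairwise identity: {mean_identity:.1f}%")
```

A family with a lower mean value under a measure like this is more diverse, which is the sense in which M proteins (44.7%) are reported as more variable than Mrp (83.2%) or Enn (68.6%).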
Keyphrases
- genome wide identification
- genome wide
- genome wide analysis
- dna methylation
- transcription factor
- room temperature
- copy number
- escherichia coli
- gene expression
- biofilm formation
- bioinformatics analysis
- poor prognosis
- pseudomonas aeruginosa
- staphylococcus aureus
- cystic fibrosis
- high resolution
- protein protein
- immune response
- nuclear factor