CW-PRED: Prediction of C-terminal surface anchoring sorting signals in bacteria and Archaea.
Aikaterini G Chatziargyri, Evangelia A Stasi, Konstantinos D Tsirigos, Zoi I Litou, Vassiliki A Iconomidou, Pantelis G Bagos. Published in: Journal of bioinformatics and computational biology (2024)
Sorting signals are crucial for anchoring proteins to the cell surface in Archaea and bacteria. These proteins often feature distinct motifs at their C-terminus, which are cleaved by sortases or sortase-like enzymes. Gram-positive bacteria exhibit the LPXTGX consensus motif, cleaved by sortases, while Gram-negative bacteria employ exosortases that recognize motifs such as PEP. Archaea utilize exosortase homologs known as archaeosortases for signal anchoring. Traditionally, identification of such C-terminal sorting signals was performed with profile Hidden Markov Models (pHMMs). The Cell-Wall PREDiction (CW-PRED) method introduced, for the first time, a custom-made class HMM for proteins in Gram-positive bacteria that contain a cell wall sorting signal, which begins with an LPXTG motif, followed by a hydrophobic domain and a tail of positively charged residues. Here we present a new and updated version of CW-PRED for predicting C-terminal sorting signals in Archaea, Gram-positive, and Gram-negative bacteria. We used a large training set and several model enhancements that improve motif identification in order to achieve better discrimination between C-terminal signals and other proteins. Cross-validation demonstrates CW-PRED's superiority in sensitivity and specificity compared to other methods. Application of the method to reference proteomes reveals a large number of potential surface proteins not previously identified. The method is available for academic use at http://195.251.108.230/apps.compgen.org/CW-PRED/ and as standalone software.
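To make the tripartite architecture of the Gram-positive signal concrete (LPXTG motif, then a hydrophobic domain, then a positively charged tail), the following is a minimal heuristic sketch in Python. It is not the CW-PRED class HMM and makes no claim to reproduce its predictions. The function name `find_cw_signal`, the hydrophobic residue set, and all thresholds (`c_term_window`, `hydro_len`, `hydro_frac`) are assumptions chosen purely for illustration.

```python
import re

# Illustrative residue classes (assumed for this sketch, not taken from CW-PRED).
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")

# LPXTG, where X is any residue.
LPXTG = re.compile(r"LP.TG")


def find_cw_signal(seq, c_term_window=60, hydro_len=12, hydro_frac=0.75):
    """Return the 0-based start of an LPXTG motif that is followed by a
    hydrophobic stretch and at least one K/R further downstream, or None.

    Only the last `c_term_window` residues are scanned.
    """
    seq = seq.upper()
    offset = max(0, len(seq) - c_term_window)
    region = seq[offset:]

    for match in LPXTG.finditer(region):
        rest = region[match.end():]
        # Slide a fixed-size window over everything after the motif.
        for i in range(len(rest) - hydro_len + 1):
            window = rest[i:i + hydro_len]
            frac = sum(res in HYDROPHOBIC for res in window) / hydro_len
            tail = rest[i + hydro_len:]
            if frac >= hydro_frac and any(res in POSITIVE for res in tail):
                return offset + match.start()
    return None


# Synthetic toy sequences (not real proteins).
with_signal = "M" + "S" * 40 + "LPETG" + "EE" + "AVLIVLAALALL" + "KRRK"
no_motif = "M" + "S" * 60
no_tail = "M" + "S" * 40 + "LPETG" + "AVLIVLAALALL" + "SS"

for name, s in [("with_signal", with_signal), ("no_motif", no_motif), ("no_tail", no_tail)]:
    print(name, find_cw_signal(s))
```

A fixed-threshold scan like this cannot weigh position-specific residue preferences or variable segment lengths, which is the kind of discrimination a trained probabilistic model such as an HMM is designed to provide.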