Signal Peptide Efficiency: From High-Throughput Data to Prediction and Explanation.

Stefano Grasso, Valentina Dabene, Margriet M W B Hendriks, Priscilla Zwartjens, René Pellaux, Martin Held, Sven Panke, Jan Maarten van Dijl, Andreas Meyer, Tjeerd van Rij
Published in: ACS Synthetic Biology (2023)
The passage of proteins across biological membranes via the general secretory (Sec) pathway is a universally conserved process with critical functions in cell physiology and important industrial applications. Proteins are directed into the Sec pathway by a signal peptide at their N-terminus. Estimating the impact of physicochemical signal peptide features on protein secretion levels has not been achieved so far, partially due to the extreme sequence variability of signal peptides. To elucidate relevant features of the signal peptide sequence that influence secretion efficiency, an evaluation of ∼12,000 different designed signal peptides was performed using a novel miniaturized high-throughput assay. The results were used to train a machine learning model, and a post-hoc explanation of the model is provided. By describing each signal peptide with a selection of 156 physicochemical features, it is now possible to both quantify feature importance and predict the protein secretion levels directed by each signal peptide. Our analyses allow the detection and explanation of the relevant signal peptide features influencing the efficiency of protein secretion, generating a versatile tool for the de novo design and in silico evaluation of signal peptides.
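To illustrate what describing a signal peptide by physicochemical features can look like, here is a minimal sketch. It is not the authors' feature set or code: the abstract does not list the 156 features, so the three descriptors below (length, net charge, mean hydropathy), the function name `describe_signal_peptide`, and the example sequence are all assumptions chosen for illustration. The hydropathy values are the standard Kyte-Doolittle scale.

```python
# Illustrative only: three simple descriptors, NOT the 156 features of the paper.
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def describe_signal_peptide(seq: str) -> dict:
    """Return a small, hypothetical feature dictionary for one peptide."""
    seq = seq.upper()
    unknown = set(seq) - KYTE_DOOLITTLE.keys()
    if not seq or unknown:
        raise ValueError(f"empty sequence or unknown residues: {sorted(unknown)}")

    # Crude net charge: count Lys/Arg as +1 and Asp/Glu as -1 (His ignored).
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")

    return {
        "length": len(seq),
        "net_charge": positive - negative,
        "mean_hydropathy": sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq),
    }


# Hypothetical toy sequence, not taken from the study.
features = describe_signal_peptide("MKKLLA")
```

A table of such feature vectors, one row per peptide, paired with measured secretion levels, is the kind of input a regression model could be trained on before applying a post-hoc explanation method to rank the features.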
Keyphrases
  • high throughput
  • machine learning
  • amino acid
  • stem cells
  • small molecule
  • artificial intelligence
  • protein protein
  • big data