CysPresso: a classification model utilizing deep learning protein representations to predict recombinant expression of cysteine-dense peptides.
Sébastien Ouellet, Larissa Ferguson, Angus Z Lau, Tony K Y Lim. Published in: BMC Bioinformatics (2023)
Our novel model, CysPresso, is the first to successfully predict recombinant expression of cysteine-dense peptides (CDPs) in mammalian cells and is particularly well suited to knottin peptides. When preprocessing the deep learning protein representation for supervised machine learning, we found that random convolutional kernel transformation preserves more of the information relevant to predicting expressibility than embedding averaging does. Our study showcases the applicability of deep learning-based protein representations, such as those provided by AlphaFold2, to tasks beyond structure prediction.
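The two preprocessing strategies compared above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the kernel sizes, the dilation choices, and the random array standing in for a per-residue representation are all assumptions made for illustration, and the kernel transform is a simplified variant in the style of ROCKET. The sketch shows only one property that distinguishes the two approaches: averaging ignores residue order, whereas convolutional features respond to local patterns along the sequence.

```python
import numpy as np


def average_embedding(rep):
    """Collapse a (length, dim) per-residue representation into a (dim,) vector."""
    return rep.mean(axis=0)


def random_kernel_features(rep, n_kernels=100, seed=0):
    """Simplified random convolutional kernel transform (illustrative only).

    Each random 1-D kernel is slid along the sequence axis of one randomly
    chosen channel, and two summary features are kept per kernel: the maximum
    response and the proportion of positive values (PPV).
    """
    rng = np.random.default_rng(seed)  # fixed seed: same kernels for every peptide
    length, dim = rep.shape
    feats = np.empty(2 * n_kernels)
    for k in range(n_kernels):
        size = int(rng.choice([7, 9, 11]))     # assumed kernel lengths
        weights = rng.normal(size=size)
        weights -= weights.mean()              # mean-centre the weights
        bias = rng.uniform(-1.0, 1.0)
        dilation = int(rng.choice([1, 2, 4]))  # assumed dilation choices
        channel = int(rng.integers(dim))

        # Zero-pad so the output has one value per residue, whatever the length.
        pad = (size - 1) * dilation // 2
        signal = np.pad(rep[:, channel], pad)

        out = np.full(length, bias)
        for j in range(size):
            start = j * dilation
            out += weights[j] * signal[start:start + length]

        feats[2 * k] = out.max()
        feats[2 * k + 1] = (out > 0).mean()
    return feats


# Stand-in data: a random (30 residues x 16 channels) array, not a real embedding.
demo_rng = np.random.default_rng(1)
rep = demo_rng.normal(size=(30, 16))
shuffled = rep[demo_rng.permutation(rep.shape[0])]  # same residues, new order

avg_features = average_embedding(rep)
kernel_features = random_kernel_features(rep)
```

Either feature vector has a fixed size regardless of peptide length, so both can be passed to a standard supervised classifier. Shuffling the residues leaves `average_embedding` unchanged but alters the output of `random_kernel_features`, which is the order sensitivity the sketch is meant to show.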