Extracting Mutant-Affected Protein-Protein Interactions via Gaussian-Enhanced Representation and Contrastive Learning.
Da Liu, Yijia Zhang, Ming Yang, Jianyuan Yuan, Wen Qu
Published in: Journal of computational biology : a journal of computational molecular cell biology (2023)
Genetic mutations can alter protein-protein interactions (PPIs), and these effects are reported throughout the biomedical literature. Automated extraction of mutation-affected PPIs from that literature can aid in evaluating the clinical importance of gene variants, which is crucial for the advancement of precision medicine. In this study, a new model called the Gaussian-enhanced representation model (GRM) is introduced for PPI extraction. The model uses a Gaussian probability distribution to produce a target entity representation on top of the BioBERT pretrained model. The GRM assigns more weight to target protein entities and their adjacent entities, which addresses the problem of lengthy input text and scattered target entities in the PPI extraction task. Additionally, the model introduces a supervised contrastive learning approach to improve its effectiveness and robustness. Experiments on the BioCreative VI data set show that the proposed GRM achieves state-of-the-art performance.
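The idea of weighting tokens by their proximity to the target entities can be illustrated with a minimal sketch. The abstract does not give the GRM's exact formulation, so everything here is an assumption made for illustration: the function names `gaussian_position_weights` and `weighted_pool`, the `sigma` parameter, the choice of taking the maximum over entity positions, and the use of plain lists in place of BioBERT token embeddings.

```python
import math


def gaussian_position_weights(seq_len, entity_positions, sigma=2.0):
    """Return one weight per token, peaking at the target entity tokens.

    Hypothetical helper: each token gets the value of an unnormalized
    Gaussian centred on its nearest target entity position, so entity
    tokens receive weight 1.0 and distant tokens decay towards 0.
    """
    if not entity_positions:
        raise ValueError("need at least one target entity position")
    weights = []
    for i in range(seq_len):
        # Evaluate the Gaussian for every entity position and keep the largest.
        w = max(
            math.exp(-((i - p) ** 2) / (2.0 * sigma ** 2))
            for p in entity_positions
        )
        weights.append(w)
    return weights


def weighted_pool(token_vectors, weights):
    """Weighted average of token vectors (stand-ins for encoder outputs)."""
    total = sum(weights)
    dim = len(token_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, token_vectors)) / total
        for d in range(dim)
    ]


if __name__ == "__main__":
    # Toy sequence of 6 tokens with 2-dimensional vectors; tokens 1 and 4
    # are assumed to be the two target protein mentions.
    vectors = [[0.1, 0.0], [1.0, 0.2], [0.0, 0.3], [0.2, 0.1], [0.9, 0.8], [0.0, 0.0]]
    weights = gaussian_position_weights(len(vectors), [1, 4], sigma=1.0)
    print(weights)
    print(weighted_pool(vectors, weights))
```

In this sketch a smaller `sigma` concentrates the representation on the entity tokens themselves, while a larger one lets more of the surrounding context contribute; how the GRM sets or learns this spread is not stated in the abstract.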