Unbiased Protein Association Study on the Public Human Proteome Reveals Biological Connections between Co-Occurring Protein Pairs.
Surya Gupta, Kenneth Verheggen, Jan Tavernier, Lennart Martens
Published in: Journal of Proteome Research (2017)
Mass-spectrometry-based, high-throughput proteomics experiments produce large amounts of data. While typically acquired to answer specific biological questions, these data can also be reused in orthogonal ways to reveal new biological knowledge. Here we present a novel method for such orthogonal reuse of public proteomics data. Our method elucidates biological relationships between proteins based on the co-occurrence of these proteins across human experiments in the PRIDE database. The majority of the significantly co-occurring protein pairs detected by our method were successfully mapped to existing biological knowledge. The validity of the method is supported by a control: when the same set of proteins is associated at random, extremely few pairs can be mapped to existing knowledge. Moreover, using literature searches and the STRING database, we were able to derive meaningful biological associations for unannotated protein pairs detected by our method, further illustrating that as-yet unknown associations are highly interesting targets for follow-up analysis.
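To make the idea of co-occurrence across experiments concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which statistic or significance test was used, so the Jaccard index here is an assumption chosen for illustration. The function name `cooccurrence_scores`, the experiment IDs, and the protein labels are all hypothetical.

```python
from itertools import combinations


def cooccurrence_scores(experiments):
    """Return {(protein_a, protein_b): jaccard_index} for pairs seen together.

    `experiments` maps an experiment ID to the set of proteins identified in it.
    The Jaccard index is an assumed stand-in for the paper's actual measure.
    """
    # Map each protein to the set of experiments in which it was identified.
    seen_in = {}
    for exp_id, proteins in experiments.items():
        for protein in proteins:
            seen_in.setdefault(protein, set()).add(exp_id)

    scores = {}
    for a, b in combinations(sorted(seen_in), 2):
        shared = seen_in[a] & seen_in[b]
        if shared:  # keep only pairs that co-occur at least once
            union = seen_in[a] | seen_in[b]
            scores[(a, b)] = len(shared) / len(union)
    return scores


# Hypothetical toy data: four experiments, four proteins.
experiments = {
    "exp1": {"P1", "P2", "P3"},
    "exp2": {"P1", "P2"},
    "exp3": {"P2", "P3"},
    "exp4": {"P4"},
}
scores = cooccurrence_scores(experiments)
```

In this toy data, P4 never shares an experiment with another protein, so it appears in no scored pair. A random-association control like the one the abstract describes could be approximated by pairing the same proteins at random and checking how many of those pairs map to existing knowledge, but the details of that procedure are not given here.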
Keyphrases
- mass spectrometry
- healthcare
- electronic health record
- high throughput
- big data
- endothelial cells
- protein protein
- mental health
- systematic review
- amino acid
- single cell
- data analysis
- high resolution
- wastewater treatment
- gene expression
- machine learning
- induced pluripotent stem cells
- deep learning
- artificial intelligence
- simultaneous determination
- capillary electrophoresis