Benchmarking tools for detecting longitudinal differential expression in proteomics data allows establishing a robust reproducibility optimization regression approach.

Tommi Välikangas, Tomi Suomi, Courtney E Chandler, Alison J Scott, Bao Q Tran, Robert K Ernst, David R Goodlett, Laura L Elo
Published in: Nature Communications (2022)
Quantitative proteomics has matured into an established tool, and longitudinal proteomics experiments have begun to emerge. However, no effective, simple-to-use differential expression method for longitudinal proteomics data has been released. Typically, such data are noisy, contain missing values, and have only a few time points and biological replicates. To address this need, we provide a comprehensive evaluation of several existing differential expression methods for high-throughput longitudinal omics data and introduce a Robust longitudinal Differential Expression (RolDE) approach. The methods are evaluated using over 3000 semi-simulated spike-in proteomics datasets and three large experimental datasets. In the comparisons, RolDE performs best overall: it is the most tolerant to missing values, displays good reproducibility, and is the top method in ranking the results in a biologically meaningful way. Furthermore, RolDE is suitable for different types of data with typically unknown patterns in longitudinal expression and can be applied by non-experienced users.
Keyphrases
  • electronic health record
  • mass spectrometry
  • cross sectional
  • big data
  • high throughput
  • label free
  • poor prognosis
  • data analysis
  • machine learning
  • high resolution
  • long non coding rna