Login / Signup

A Bayesian Null Interval Hypothesis Test Controls False Discovery Rates and Improves Sensitivity in Label-Free Quantitative Proteomics.

Robert J MillikinMichael R ShortreedMark ScalfLloyd M Smith
Published in: Journal of proteome research (2020)
Statistical significance tests are a common feature in quantitative proteomics workflows. The Student's t-test is widely used to compute the statistical significance of a protein's change between two groups of samples. However, the t-test's null hypothesis asserts that the difference in means between two groups is exactly zero, often marking small but uninteresting fold-changes as statistically significant. Compensations to address this issue are widely used in quantitative proteomics, but we suggest that a replacement of the t-test with a Bayesian approach offers a better path forward. In this article, we describe a Bayesian hypothesis test in which the null hypothesis is an interval rather than a single point at zero; the width of the interval is estimated from population statistics. The improved sensitivity of the method substantially increases the number of truly changing proteins detected in two benchmark data sets (ProteomeXchange identifiers PXD005590 and PXD016470). The method has been implemented within FlashLFQ, an open-source software program that quantifies bottom-up proteomics search results obtained from any search tool. FlashLFQ is rapid, sensitive, and accurate and is available both as an easy-to-use graphical user interface (Windows) and as a command-line tool (Windows/Linux/OSX).
Keyphrases
  • label free
  • mass spectrometry
  • high resolution
  • machine learning
  • small molecule
  • electronic health record
  • big data
  • quality improvement
  • sensitive detection