
Costs and Benefits of Popular P-Value Correction Methods in Three Models of Quantitative Omic Experiments.

Steven R Shuken, M Windy McNerney
Published in: Analytical Chemistry (2023)
The multiple hypothesis testing problem is inherent in large-scale quantitative "omic" experiments such as mass spectrometry-based proteomics. Yet, tools for comparing the costs and benefits of different p-value correction methods under different experimental conditions are lacking. We performed thousands of simulations of omic experiments under a range of experimental conditions and applied correction using the Benjamini-Hochberg (BH), Bonferroni, and permutation-based false discovery proportion (FDP) estimation methods. The tremendous false discovery rate (FDR) benefit of correction was confirmed in a range of different contexts. No correction method can guarantee a low FDP in a single experiment, but the probability of a high FDP is small when a high number and proportion of corrected p-values are significant. On average, correction decreased sensitivity, but the sensitivity costs of BH and permutation were generally modest compared to the FDR benefits. In a given experiment, observed sensitivity was always maintained or decreased by BH and Bonferroni, whereas it was often increased by permutation. Overall, permutation had better FDR and sensitivity than BH. We show how increasing sample size, decreasing variability, or increasing effect size can enable the detection of all true changes while still correcting p-values, and we present basic guidelines for omic experimental design. Analysis of an experimental proteomic data set with defined changes corroborated these trends. We developed an R Shiny web application for further exploration and visualization of these models, which we call the Simulator of P-value Multiple Hypothesis Correction (SIMPLYCORRECT), and a high-performance R package, permFDP, for easy use of the permutation-based FDP estimation method.
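The trade-off described in the abstract, a lower FDP at some cost in sensitivity, can be illustrated with a small simulation. The following is a minimal sketch in Python and is not the authors' code, SIMPLYCORRECT, or permFDP. It assumes a simplified model that does not come from the paper: normally distributed measurements with a known standard deviation, a two-sided z-test per analyte, and hypothetical parameter names and defaults (`n_analytes`, `n_changed`, `n_per_group`, `effect`, `sd`). Only the Bonferroni and BH adjustments are shown; the permutation-based method is omitted.

```python
import math
import random


def bonferroni(pvals):
    """Bonferroni-adjusted p-values: min(1, m * p)."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """BH-adjusted p-values (step-up), returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def two_sided_z_pvalue(group_a, group_b, sd):
    """Two-sided z-test p-value for equal-sized groups with known sd (assumption)."""
    n = len(group_a)
    diff = sum(group_b) / n - sum(group_a) / n
    z = diff / (sd * math.sqrt(2.0 / n))
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_experiment(n_analytes=1000, n_changed=100, n_per_group=5,
                        effect=2.0, sd=1.0, seed=0):
    """Simulate one two-group experiment; the first n_changed analytes truly change."""
    rng = random.Random(seed)
    pvals, is_true = [], []
    for j in range(n_analytes):
        shift = effect if j < n_changed else 0.0
        a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(shift, sd) for _ in range(n_per_group)]
        pvals.append(two_sided_z_pvalue(a, b, sd))
        is_true.append(j < n_changed)
    return pvals, is_true


def fdp_and_sensitivity(adj_pvals, is_true, alpha=0.05):
    """Observed false discovery proportion and sensitivity at threshold alpha."""
    hits = [p <= alpha for p in adj_pvals]
    n_hits = sum(hits)
    true_pos = sum(1 for h, t in zip(hits, is_true) if h and t)
    fdp = (n_hits - true_pos) / n_hits if n_hits else 0.0
    sensitivity = true_pos / sum(is_true)
    return fdp, sensitivity


if __name__ == "__main__":
    pvals, is_true = simulate_experiment()
    for name, adj in [("uncorrected", pvals),
                      ("Bonferroni", bonferroni(pvals)),
                      ("BH", benjamini_hochberg(pvals))]:
        fdp, sens = fdp_and_sensitivity(adj, is_true)
        print(f"{name:12s} FDP={fdp:.3f} sensitivity={sens:.3f}")
```

Varying `n_per_group`, `sd`, or `effect` in this sketch gives a rough feel for how sample size, variability, and effect size shift the balance between FDP and sensitivity in a single simulated experiment.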
Keyphrases
  • mass spectrometry
  • small molecule
  • high resolution
  • label free
  • high throughput
  • liquid chromatography
  • big data
  • Monte Carlo