
A note on tests for relevant differences with extremely large sample sizes.

Andrea Callegaro, Cheikh Ndour, Emmanuel Aris, Catherine Legrand
Published in: Biometrical journal. Biometrische Zeitschrift (2018)
A well-known problem in classical two-tailed hypothesis testing is that P-values go to zero when the sample size goes to infinity, irrespective of the effect size. This pitfall can make the testing of data with very large sample sizes potentially unreliable. In this note, we propose to test for relevant differences to overcome this issue. We illustrate the proposed test on a real data set of about 40 million privately insured patients.
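To make the contrast concrete, here is a minimal sketch of a classical two-sided z-test next to a test for a relevant difference. It is not necessarily the authors' exact procedure: it assumes a normal approximation for the difference in means, and the relevance threshold `delta`, the function names, and the numbers in the usage section are all hypothetical choices made for illustration. The relevant-difference null hypothesis is taken to be |true difference| <= `delta`, and the P-value is evaluated at the boundary of that interval.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, computed via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def classical_p_value(diff, se):
    """Two-sided P-value for H0: true difference = 0."""
    return 2.0 * normal_cdf(-abs(diff) / se)


def relevant_difference_p_value(diff, se, delta):
    """P-value for H0: |true difference| <= delta (delta >= 0).

    The probability of observing |D| >= |diff| is largest on the boundary
    of the null interval (true difference = +/- delta), so it is evaluated there.
    """
    a = abs(diff)
    return normal_cdf(-(a - delta) / se) + normal_cdf(-(a + delta) / se)


# Hypothetical illustration: a tiny observed difference of 0.01 (unit variance
# in each group), 20 million observations per group, and an assumed relevance
# threshold of 0.05.
n_per_group = 20_000_000
se = math.sqrt(2.0 / n_per_group)  # standard error of the difference in means
observed_diff = 0.01
delta = 0.05

p_classical = classical_p_value(observed_diff, se)
p_relevant = relevant_difference_p_value(observed_diff, se, delta)
print(f"classical P-value:           {p_classical:.3g}")
print(f"relevant-difference P-value: {p_relevant:.3g}")
```

With these assumed numbers the classical test rejects overwhelmingly even though the observed difference is far below the threshold, whereas the relevant-difference test does not reject. Setting `delta = 0` recovers the classical two-sided P-value.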
Keyphrases
  • end stage renal disease
  • electronic health record
  • newly diagnosed
  • chronic kidney disease
  • ejection fraction
  • big data
  • peritoneal dialysis
  • prognostic factors
  • machine learning
  • patient reported outcomes
  • data analysis