Outliers may not be automatically removed.

Julian D Karch
Published in: Journal of Experimental Psychology: General (2023)
Researchers often remove outliers when comparing groups. It is well documented that the common practice of removing outliers within groups inflates Type I error rates. However, André (2022) recently argued that if outliers are instead removed across groups, Type I error rates are not inflated. The same study presents removing outliers across groups as a specific case of the more general concept of hypothesis-blind outlier removal, which it consequently recommends. In this paper, I demonstrate that, contrary to this advice, hypothesis-blind outlier removal is problematic. Specifically, it almost always invalidates confidence intervals and biases estimates if there are group differences. Moreover, it inflates Type I error rates in certain situations, for example, when variances are unequal and the data are nonnormal. Consequently, a data point may not be removed solely because it is deemed an outlier, whether the procedure used is hypothesis-blind or hypothesis-aware. I conclude by recommending valid alternatives. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
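The claim that across-group removal biases estimates when the groups differ can be illustrated with a small simulation. The sketch below is not the paper's simulation design. It assumes two normal groups with equal variances, a hypothetical removal rule (drop points more than k pooled standard deviations from the pooled mean), and arbitrary values for the sample size, the true difference, and the number of repetitions. The function names `remove_outliers_across_groups` and `simulate` are invented for illustration.

```python
import random
import statistics


def remove_outliers_across_groups(a, b, k=2.0):
    """Hypothesis-blind rule (assumed for illustration): pool both groups
    and drop points more than k pooled SDs from the pooled mean."""
    pooled = a + b
    center = statistics.fmean(pooled)
    spread = statistics.stdev(pooled)
    lo, hi = center - k * spread, center + k * spread
    kept_a = [x for x in a if lo <= x <= hi]
    kept_b = [x for x in b if lo <= x <= hi]
    return kept_a, kept_b


def simulate(true_diff=1.0, n=100, reps=2000, k=2.0, seed=0):
    """Return the average estimated mean difference without removal and
    with across-group removal, over `reps` simulated data sets."""
    rng = random.Random(seed)
    raw, trimmed = [], []
    for _ in range(reps):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        raw.append(statistics.fmean(b) - statistics.fmean(a))
        a2, b2 = remove_outliers_across_groups(a, b, k)
        trimmed.append(statistics.fmean(b2) - statistics.fmean(a2))
    return statistics.fmean(raw), statistics.fmean(trimmed)


if __name__ == "__main__":
    raw_mean, trimmed_mean = simulate()
    print(f"average estimate without removal: {raw_mean:.3f}")
    print(f"average estimate with across-group removal: {trimmed_mean:.3f}")
```

Under these assumptions, the pooled cutoffs remove more of the lower tail of the lower group and more of the upper tail of the higher group, so the estimated difference shrinks toward zero. Calling `simulate(true_diff=0.0)` trims both groups symmetrically, which matches the abstract's qualification that the bias arises if there are group differences.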