Asymptotic versus exact methods in the analysis of contingency tables: Evidence-based practical recommendations.

Miguel Ángel García-Pérez, Vicente Núñez-Antón
Published in: Statistical methods in medical research (2020)
Controversy over the validity of significance tests in the analysis of contingency tables is motivated by the disagreement between asymptotic and exact p values and its dependence on the magnitude of expected frequencies. Variants of Pearson's X² statistic and their asymptotic distributions were proposed to overcome the difficulties, but several approaches also exist to conduct exact tests. This paper shows that discrepant asymptotic and exact results may or may not occur whether expected frequencies are large or small: any inaccuracy of asymptotic p values is instead caused by idiosyncrasies of the discrete distribution of X². More importantly, discrepancies are also artificially created by the hypergeometric sampling model used to perform exact tests. Exact computations under the alternative full-multinomial or product-multinomial models require eliminating nuisance parameters, and we propose a novel method that integrates them out. The resultant exact distributions are very accurately approximated by the asymptotic distribution, which eliminates concerns about the accuracy of the latter. We also argue that the two-stage approach, which tests for significance of residuals conditional on a significant X² test, is inadvisable, and that an alternative single-stage test preserves Type-I error rates and further eliminates concerns about asymptotic accuracy.
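The kind of disagreement the abstract refers to can be illustrated with a minimal Python sketch for a 2×2 table. It compares the asymptotic p value of Pearson's X² (chi-square with 1 degree of freedom) against an exact p value computed under the conventional hypergeometric model, in which both margins are treated as fixed. This is not the authors' proposed method of integrating out nuisance parameters; the function names, the choice of X² as the ordering statistic for the exact test, and the example table are all assumptions made for illustration.

```python
from math import comb, erfc, sqrt


def pearson_x2(a, b, c, d):
    """Pearson's X² for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    if 0 in (r1, r2, c1, c2):
        return 0.0  # degenerate table: statistic undefined, treated as 0 here
    return n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)


def asymptotic_p(x2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(x2 / 2.0))


def hypergeometric_p(a, b, c, d):
    """Exact p value with both margins fixed, ordering tables by X² (an assumption)."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    x2_obs = pearson_x2(a, b, c, d)
    total = 0.0
    # k is the top-left cell; the other three cells follow from the fixed margins.
    for k in range(max(0, c1 - r2), min(r1, c1) + 1):
        x2_k = pearson_x2(k, r1 - k, c1 - k, r2 - (c1 - k))
        if x2_k >= x2_obs - 1e-12:  # tables at least as extreme as the observed one
            total += comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)
    return total


if __name__ == "__main__":
    table = (3, 1, 1, 3)  # hypothetical small table
    x2 = pearson_x2(*table)
    print(x2, asymptotic_p(x2), hypergeometric_p(*table))
```

For the hypothetical table [[3, 1], [1, 3]], X² equals 2, the asymptotic p value is about 0.157, and the hypergeometric exact p value is 34/70, about 0.486. Only five tables are possible once both margins are fixed, so the exact distribution of X² under this model is very coarse, which is consistent with the abstract's point that the sampling model itself can produce the discrepancy.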