Login / Signup

Computing the noncentral-F distribution and the power of the F-test with guaranteed accuracy.

Ali BaharevHermann SchichlEndre Rév
Published in: Computational statistics (2016)
The computations involving the noncentral-F distribution are notoriously difficult to implement properly in floating-point arithmetic: Catastrophic loss of precision, floating-point underflow and overflow, drastically increasing computation time and program hang-ups, and instability due to numerical cancellation have all been reported. It is therefore recommended that existing statistical packages are cross-checked, and the present paper proposes a numerical algorithm precisely for this purpose. To the best of our knowledge, the proposed method is the first method that can compute the noncentrality parameter of the noncentral-F distribution with guaranteed accuracy over a wide parameter range that spans the range relevant for practical applications. Although the proposed method is limited to cases where the the degree of freedom of the denominator of the F test statistic is even, it does not affect its usefulness significantly: All of those algorithmic failures and inaccuracies that we can still reproduce today could have been prevented by simply cross-checking against the proposed method. Two numerical examples are presented where the intermediate computations went wrong silently, but the final result of the computations seemed nevertheless plausible, and eventually erroneous results were published. Cross-checking against the proposed method would have caught the numerical errors in both cases. The source code of the algorithm is available on GitHub, together with self-contained command-line executables. These executables can read the data to be cross-checked from plain text files, making it easy to cross-check any statistical software in an automated fashion.
Keyphrases
  • machine learning
  • healthcare
  • deep learning
  • randomized controlled trial
  • systematic review
  • single molecule
  • artificial intelligence
  • smoking cessation
  • electronic health record
  • data analysis