Beyond the traditional simulation design for evaluating type 1 error control: From the "theoretical" null to "empirical" null.

Published in: Genetic epidemiology (2018)

When evaluating a newly developed statistical test, an important step is to check its type 1 error (T1E) control using simulations. This is often achieved by the standard simulation design S0 under the so-called "theoretical" null of no association. In practice, the whole-genome association analyses scan through a large number of genetic markers ( <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>G</mml:mi></mml:math> s) for the ones associated with an outcome of interest ( <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>Y</mml:mi></mml:math> ), where <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>Y</mml:mi></mml:math> comes from an alternative while the majority of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>G</mml:mi></mml:math> s are not associated with <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>Y</mml:mi></mml:math> ; the <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>Y</mml:mi> <mml:mo>-</mml:mo> <mml:mi>G</mml:mi></mml:math> relationships are under the "empirical" null. This reality can be better represented by two other simulation designs, where design S1.1 simulates <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>Y</mml:mi></mml:math> from analternative model based on <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>G</mml:mi></mml:math> , then evaluates its association with independently generated <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mrow><mml:mrow/> <mml:msub><mml:mi>G</mml:mi> <mml:mrow><mml:mi>n</mml:mi> <mml:mi>e</mml:mi> <mml:mi>w</mml:mi></mml:mrow> </mml:msub> </mml:mrow> </mml:math> ; while design S1.2 evaluates the association between permutated <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>Y</mml:mi></mml:math> and <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>G</mml:mi></mml:math> . More than a decade ago, Efron (2004) has noted the important distinction between the "theoretical" and "empirical" null in false discovery rate control. Using scale tests for variance heterogeneity, direct univariate, and multivariate interaction tests as examples, here we show that not all null simulation designs are equal. In examining the accuracy of a likelihood ratio test, while simulation design S0 suggested the method being accurate, designs S1.1 and S1.2 revealed its increased empirical T1E rate if applied in real data setting. The inflation becomes more severe at the tail and does not diminish as sample size increases. This is an important observation that calls for new practices for methods evaluation and T1E control interpretation.

Keyphrases