Many nonnormalities, one simulation: Do different data generation algorithms affect study results?

Amanda J. Fairchild, Yunhang Yin, Amanda N. Baraldi, Oscar Lorenzo Olvera Astivia, Dexin Shi
Published in: Behavior Research Methods (2024)
Monte Carlo simulation studies are among the primary scientific outputs contributed by methodologists, guiding the application of statistical tools in practice. Although methodological researchers routinely extend simulation study findings through follow-up work, few studies are ever replicated, and simulation studies are susceptible to factors that can contribute to replicability failures. This paper conducted a meta-scientific study by replicating one highly cited simulation study (Curran et al., Psychological Methods, 1, 16-29, 1996) that investigated the robustness of normal theory maximum likelihood (ML)-based chi-square fit statistics under multivariate nonnormality. We further examined the generalizability of the original study's findings across different nonnormal data generation algorithms. Our replication results were generally consistent with the original findings, but we discerned several differences. Our generalizability results were more mixed: only two results observed under the original data generation algorithm held completely across the other algorithms examined. One of the most striking findings was that results under the independent generator (IG) algorithm differed vastly from those of the other procedures examined and suggested that ML was robust to nonnormality for the particular factor model used in the simulation. These findings point to the reality that extant methodological recommendations may not be universally valid in contexts where multiple data generation algorithms exist for a given data characteristic. We recommend that researchers consider multiple approaches to generating a specific data or model characteristic (when more than one is available) to optimize the generalizability of simulation results.
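To make the contrast concrete: the original Curran et al. study generated nonnormal data with the Vale and Maurelli (1983) extension of Fleishman's (1978) power method, which transforms multivariate normal draws through marginal cubic polynomials. Below is a minimal Python sketch of that family of generators, not the authors' actual code; the function names, target correlation matrix, and moment targets are illustrative assumptions rather than values taken from the paper.

```python
"""Hedged sketch: Vale-Maurelli-style multivariate nonnormal data
generation (one of several algorithms the abstract contrasts)."""
import numpy as np
from scipy.optimize import fsolve

def fleishman_coefficients(skew, ex_kurt):
    """Solve Fleishman's (1978) system for b, c, d in
    Y = a + b*Z + c*Z**2 + d*Z**3 (with a = -c) so that Y has mean 0,
    variance 1, and the target skewness / excess kurtosis."""
    def system(p):
        b, c, d = p
        return [
            b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1,            # Var(Y) = 1
            2*c*(b**2 + 24*b*d + 105*d**2 + 2) - skew,      # skewness
            24*(b*d + c**2*(1 + b**2 + 28*b*d)
                + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2)) - ex_kurt,
        ]
    b, c, d = fsolve(system, x0=[1.0, 0.0, 0.0])
    return -c, b, c, d  # a = -c keeps the mean at 0

def intermediate_correlation(r_target, coef_i, coef_j):
    """Solve Vale & Maurelli's (1983) cubic for the correlation the
    underlying normals need so the transformed pair hits r_target."""
    _, bi, ci, di = coef_i
    _, bj, cj, dj = coef_j
    roots = np.roots([6*di*dj, 2*ci*cj,
                      bi*bj + 3*bi*dj + 3*di*bj + 9*di*dj, -r_target])
    real = roots[np.isreal(roots)].real
    return real[np.argmin(np.abs(real - r_target))]  # plausible real root

def generate_nonnormal(n, R, skew, ex_kurt, rng):
    """Draw n observations whose marginals have the target skewness and
    excess kurtosis and whose correlations approximate R."""
    p = R.shape[0]
    coefs = [fleishman_coefficients(skew, ex_kurt) for _ in range(p)]
    R_z = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            R_z[i, j] = R_z[j, i] = intermediate_correlation(
                R[i, j], coefs[i], coefs[j])
    Z = rng.standard_normal((n, p)) @ np.linalg.cholesky(R_z).T
    return np.column_stack([a + b*z + c*z**2 + d*z**3
                            for (a, b, c, d), z in zip(coefs, Z.T)])

# Illustrative targets (not the paper's design values).
rng = np.random.default_rng(1)
R = np.array([[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]])
X = generate_nonnormal(10_000, R, skew=2.0, ex_kurt=7.0, rng=rng)
```

The IG algorithm highlighted in the abstract instead builds each variable as a linear combination of independent nonnormal components, so even when marginal skewness, kurtosis, and correlations match, the higher-order multivariate moments can differ; that is one plausible route to the divergent results the authors report.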