A descriptive review of variable selection methods in four epidemiologic journals: there is still room for improvement.
Denis Talbot, Victoria Kubuta Massamba. Published in: European Journal of Epidemiology (2019)
A review of epidemiological papers conducted in 2009 concluded that several studies employed variable selection methods prone to introducing bias and yielding inadequate inferences. Many new confounder selection methods have been developed since then. The goal of this study was to provide an updated descriptive portrait of which variable selection methods epidemiologists use when analyzing observational data. Studies published in four major epidemiological journals in 2015 were reviewed. Only articles pursuing a predictive or explicative objective and reporting on the analysis of individual data were included. The method(s) employed for selecting variables were extracted from the retained articles. A total of 975 articles were retrieved and 299 met the eligibility criteria, 292 of which pursued an explicative objective. Among those, 146 studies (50%) reported using prior knowledge or causal graphs for selecting variables, 34 (12%) used change-in-effect-estimate methods, 26 (9%) used stepwise approaches, 16 (5%) employed univariable analyses, 5 (2%) used various other methods, and 107 (37%) did not provide sufficient details to allow classification (more than one method could be employed in a single article, so the categories overlap). Although less frequent than in the previous review, stepwise and univariable analyses, which are prone to introducing bias and producing inadequate inferences, were still prevalent. Moreover, 37% of studies did not provide sufficient details to assess how variables were selected. We thus believe there is still room for improvement in the variable selection methods used by epidemiologists and in their reporting.
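The reported percentages can be rechecked from the counts given in the abstract. The following is a minimal sketch, not part of the original study: the dictionary keys and the helper name `percentages` are illustrative labels chosen here, and the only data used are the counts and the denominator of 292 stated above.

```python
# Counts reported in the abstract for the 292 articles with an explicative objective.
# The category labels are shorthand chosen for this sketch.
TOTAL_EXPLICATIVE = 292

counts = {
    "prior knowledge or causal graphs": 146,
    "change in effect estimate": 34,
    "stepwise": 26,
    "univariable analyses": 16,
    "other methods": 5,
    "insufficient detail": 107,
}


def percentages(method_counts, total):
    """Return each count as a percentage of `total`, rounded to the nearest integer."""
    return {name: round(100 * n / total) for name, n in method_counts.items()}


pct = percentages(counts, TOTAL_EXPLICATIVE)

# Because one article could be classified under several methods,
# the category counts add up to more than the number of articles.
total_classifications = sum(counts.values())
excess = total_classifications - TOTAL_EXPLICATIVE

for name, value in pct.items():
    print(f"{name}: {counts[name]} ({value}%)")
print(f"classifications: {total_classifications}, articles: {TOTAL_EXPLICATIVE}, excess: {excess}")
```

The rounded values match the percentages in the abstract, and the excess of classifications over articles is consistent with the note that a single article could employ more than one method.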