Hold-out validation for the assessment of stability and reliability of multivariable regression demonstrated with magnetic resonance imaging of patients with schizophrenia.
Jacob Levman, Maxwell Jennings, Priya Kabaria, Ethan Rouse, Masahito Nangaku, Derek Berger, Iker Gondra, Emi Takahashi, Pascal Tyrrell
Published in: International journal of developmental neuroscience : the official journal of the International Society for Developmental Neuroscience (2021)
Neuroscience studies are often tasked with identifying measurable differences between two groups of subjects, typically one group with a pathological condition and one group of control subjects. The measurements acquired for comparing groups are often expected to be affected by additional patient characteristics such as sex, age, and comorbidities. Multivariable regression (MVR) is a statistical analysis technique commonly employed in neuroscience studies to "control for" or "adjust for" these secondary effects, so that the main study findings reflect actual differences between the groups of interest that are associated with the condition under investigation. Using MVR to control for secondary effects is common practice in the neuroscience literature; however, at present, it is not typically possible to assess whether the MVR adjustments correct for more error than they introduce. In common neuroscience practice, MVR models are not validated, and no attempt is made to characterize deficiencies in the MVR model. In this article, we demonstrate how standard hold-out validation techniques (commonly used in machine learning analyses), which repeatedly and randomly divide a dataset into training and testing samples, can be adapted to assess the stability and reliability of MVR models, using a publicly available neurological magnetic resonance imaging (MRI) dataset of patients with schizophrenia. Results demonstrate that MVR can introduce measurement error of up to 30.06% and, on average across all considered measurements, introduces 9.84% error on this dataset. When hold-out validated MVR does not agree with the results of the standard use of MVR, the use of MVR in the given application is unstable. Thus, this paper helps evaluate the extent to which the simplistic use of MVR introduces study error in neuroscientific analyses, demonstrated with an analysis of patients with schizophrenia.
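
The general idea of hold-out validating a covariate adjustment can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify their procedure or error metric, so everything here is an assumption made for illustration, including the synthetic data, the function names `fit_ols` and `holdout_group_effects`, the split fraction, the number of repetitions, and the relative-disagreement figure. The sketch fits the adjustment on a random training split, applies it to the held-out subjects, and compares the resulting group difference with the one from a standard full-sample MVR.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def holdout_group_effects(group, covariates, y, n_splits=200, test_frac=0.3, seed=0):
    """Hypothetical helper: repeatedly learn the covariate adjustment on a
    training split and estimate the group difference on the adjusted test split."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    effects = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        # Fit intercept + group + covariates on the training subjects only.
        X_train = np.column_stack([np.ones(len(train)), group[train], covariates[train]])
        beta = fit_ols(X_train, y[train])
        # Subtract the covariate contribution learned in training from the test measurements.
        adjusted = y[test] - covariates[test] @ beta[2:]
        g = group[test]
        if g.min() == g.max():
            continue  # skip a split whose test set contains only one group
        effects.append(adjusted[g == 1].mean() - adjusted[g == 0].mean())
    return np.array(effects)

# Synthetic data (assumed for illustration): one measurement, two covariates.
rng = np.random.default_rng(42)
n = 200
group = rng.integers(0, 2, n)            # 0 = control, 1 = patient
age = rng.normal(35, 10, n)
sex = rng.integers(0, 2, n)
covariates = np.column_stack([age, sex])
y = 2.0 * group - 0.05 * age + 0.5 * sex + rng.normal(0, 1, n)

# Standard use of MVR: one fit on all subjects; the group coefficient is the adjusted effect.
X_full = np.column_stack([np.ones(n), group, covariates])
full_effect = fit_ols(X_full, y)[1]

# Hold-out validated version of the same adjustment.
held_out = holdout_group_effects(group, covariates, y)

# Illustrative disagreement measure (an assumption, not the paper's error metric).
rel_diff = abs(held_out.mean() - full_effect) / abs(full_effect)
print(f"full-sample effect: {full_effect:.3f}")
print(f"hold-out effect: {held_out.mean():.3f} (sd {held_out.std():.3f})")
print(f"relative disagreement: {100 * rel_diff:.2f}%")
```

In this sketch, a large spread of the hold-out estimates, or a large gap between their mean and the full-sample estimate, would suggest that the adjustment is unstable for that measurement. A real analysis would repeat this for each MRI-derived measurement and use the study's own definition of error.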