Practical Considerations and Applied Examples of Cross-Validation for Model Development and Evaluation in Health Care: Tutorial.

Drew Wilimitis, Colin G Walsh
Published in: JMIR AI (2023)
Cross-validation remains a popular means of developing and validating artificial intelligence for health care. Numerous subtypes of cross-validation exist. Although tutorials on this validation strategy have been published, some with applied examples, we present here a practical tutorial comparing multiple forms of cross-validation using a widely accessible, real-world electronic health care data set: Medical Information Mart for Intensive Care-III (MIMIC-III). This tutorial explores methods such as K-fold cross-validation and nested cross-validation, highlighting their advantages and disadvantages across 2 common predictive modeling use cases: classification (mortality) and regression (length of stay). We aim to provide readers with reproducible notebooks and best practices for modeling with electronic health care data. We also describe sets of useful recommendations and demonstrate that nested cross-validation reduces optimistic bias but comes with additional computational challenges. This tutorial might improve the community's understanding of these important methods while encouraging modelers to apply these guides directly in their work using the published code.
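The contrast between K-fold and nested cross-validation described in the abstract can be illustrated with a minimal sketch. This is not the authors' published notebook code: it assumes scikit-learn, uses a synthetic data set from `make_classification` as a stand-in for a binary mortality label (no MIMIC-III data is involved), and the logistic regression model, the `C` grid, and the fold counts are all illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in for a binary outcome such as mortality (assumption: not MIMIC-III).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Hypothetical hyperparameter grid for illustration only.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

# Inner folds select hyperparameters; outer folds estimate performance.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=inner_cv,
    scoring="roc_auc",
)

# Non-nested K-fold: the same folds are used both to pick C and to report
# the score, so the reported AUC can be optimistically biased.
search.fit(X, y)
non_nested_auc = search.best_score_

# Nested: the whole tuning procedure is refit inside each outer training
# split and scored on an outer test fold that played no part in tuning.
nested_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")
nested_auc = nested_scores.mean()

print(f"Non-nested AUC: {non_nested_auc:.3f}")
print(f"Nested AUC:     {nested_auc:.3f}")
```

The nested run fits the grid search once per outer fold (here 5 outer folds times 3 inner folds times 4 candidate values, plus a refit per outer fold), which is the additional computational cost the abstract mentions. For the regression use case, the same structure would apply with a regressor, `KFold` in place of `StratifiedKFold`, and a regression metric.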
Keyphrases
  • healthcare
  • artificial intelligence
  • machine learning
  • big data
  • mental health
  • primary care
  • randomized controlled trial
  • electronic health record
  • type diabetes