
Machine Learning for The Prediction of Ranked Applicants and Matriculants to an Internal Medicine Residency Program.

Christiaan A Rees, Hilary F Ryder
Published in: Teaching and Learning in Medicine (2022)
Phenomenon: Residency programs throughout the country each receive hundreds to thousands of applications every year. Holistic review of this many applications is challenging, and to date, few tools exist to streamline or assist in the process of selecting candidates to interview and rank. Machine learning could assist programs in predicting which applicants are likely to be ranked and, among ranked applicants, which are likely to matriculate.
Approach: In the present study, we used the machine learning algorithm Random Forest (RF) to differentiate between ranked and unranked applicants, as well as between matriculants and ranked non-matriculants, to an internal medicine residency program in northern New England over a three-year period. In total, 5,067 ERAS applications were received during the 2016-17, 2017-18, and 2018-19 application cycles. Of these, 4,256 (84.0%) were unranked applicants, 754 (14.9%) were ranked non-matriculants, and 57 (1.12%) were ranked matriculants.
Findings: For differentiating between ranked and unranked applicants, the RF algorithm achieved an area under the receiver operating characteristic curve (AUROC) of 0.925 (95% CI: 0.918-0.932) and an area under the precision-recall curve (AUPRC) of 0.652 (0.611-0.685), while for differentiating between matriculants and ranked non-matriculants, the AUROC was 0.597 (95% CI: 0.516-0.680) and the AUPRC was 0.114 (0.075-0.167). The ranks of matriculated applicants were significantly higher on the algorithmic rank list than on the actual rank list for the 2017-18 (median rank: 98 versus 204, p < .001) and 2018-19 cycles (74 versus 192, p = .006), but not for the 2016-17 cycle (97 versus 144, p = .37).
Insights: The RF algorithm predicted which applicants in the overall pool were ranked with high discrimination (AUROC 0.925) and identified matriculants among ranked candidates with modest but better-than-random discrimination (AUROC 0.597). This approach could assist residency programs with triaging applicants based on the likelihood of a candidate being ranked and/or matriculating.
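To put the reported AUPRC values in context, it helps to recall that a classifier scoring at random has an expected AUPRC roughly equal to the prevalence of the positive class, while its expected AUROC is 0.5. The sketch below is illustrative only and is not the authors' code: it derives those prevalence baselines from the counts given in the abstract, and it shows one common way to compute AUROC and a step-wise AUPRC (average precision) from labels and scores. The function names and the toy labels and scores are hypothetical, and no Random Forest is trained here.

```python
# Counts taken from the abstract; everything else below is illustrative.
N_UNRANKED = 4256
N_RANKED_NON_MATRICULANTS = 754
N_MATRICULANTS = 57

n_ranked = N_RANKED_NON_MATRICULANTS + N_MATRICULANTS
n_total = N_UNRANKED + n_ranked

# Chance-level AUPRC is approximately the positive-class prevalence.
ranked_baseline = n_ranked / n_total            # ranked vs. unranked task
matriculant_baseline = N_MATRICULANTS / n_ranked  # matriculant vs. ranked non-matriculant task


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at each positive example when items
    are sorted by descending score. Assumes no tied scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


# Hypothetical toy data: 2 positives among 8 candidates.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)

print(f"Ranked-vs-unranked AUPRC baseline: {ranked_baseline:.3f}")
print(f"Matriculant AUPRC baseline:        {matriculant_baseline:.3f}")
print(f"Toy AUROC: {toy_auroc:.3f}, toy average precision: {toy_ap:.3f}")
```

Under this reading, the reported AUPRC of 0.652 for the ranked-versus-unranked task sits well above a prevalence of about 0.16, whereas the reported 0.114 for the matriculant task is only modestly above a prevalence of about 0.07, which is consistent with the abstract's "modest but better-than-random" description.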
Keyphrases
  • machine learning
  • deep learning
  • public health
  • artificial intelligence
  • magnetic resonance imaging
  • computed tomography
  • magnetic resonance
  • neural network
  • contrast enhanced