Reinforcement learning for intensive care medicine: actionable clinical insights from novel approaches to reward shaping and off-policy model evaluation.
Luca F Roggeveen, Ali El Hassouni, Harm-Jan de Grooth, Armand R J Girbes, Mark Hoogendoorn, Paul W G Elbers
Published in: Intensive care medicine experimental (2024)
Cross-OPE can serve as a robust evaluation framework for safe RL model implementation by identifying policies with good generalisability. Policy restriction helps prevent potentially unsafe model recommendations. Finally, the novel delta-Q metric can be used to operationalise RL models in clinical practice. Our findings offer a promising pathway towards application of RL in intensive care medicine and beyond.