Reinforcement Learning Assisted Oxygen Therapy for COVID-19 Patients Under Intensive Care.
Overview
Abstract
BACKGROUND: Patients with severe coronavirus disease 2019 (COVID-19) typically require supplemental oxygen as an essential treatment. We developed a machine learning algorithm, based on deep reinforcement learning (RL), for continuous management of the oxygen flow rate for critically ill patients under intensive care. The algorithm identifies a personalized optimal oxygen flow rate and has strong potential to reduce the mortality rate relative to current clinical practice.
METHODS: We modeled the oxygen flow trajectory of COVID-19 patients and their health outcomes as a Markov decision process. Based on individual patient characteristics and health status, an optimal oxygen control policy is learned using deep deterministic policy gradient (DDPG); the policy recommends the oxygen flow rate in real time to reduce the mortality rate. We assessed the performance of the proposed method through cross-validation on a retrospective cohort of 1,372 critically ill patients with COVID-19 from New York University Langone Health ambulatory care, with electronic health records from April 2020 to January 2021.
RESULTS: The mean mortality rate under the RL algorithm was 2.57 percentage points lower (95% CI: 2.08-3.06; P<0.001) than under the standard of care, falling from 7.94% to 5.37%. The average recommended oxygen flow rate was 1.28 L/min (95% CI: 1.14-1.42) lower than the rate actually delivered to patients. Thus, the RL algorithm could potentially lead to better intensive care treatment that reduces the mortality rate while conserving scarce oxygen resources. It could help alleviate oxygen shortages and improve public health during the COVID-19 pandemic.
CONCLUSIONS: A personalized reinforcement learning oxygen flow control algorithm for COVID-19 patients under intensive care showed a substantial reduction in the 7-day mortality rate compared with the standard of care.
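The reported mortality figures can be checked with a short calculation. The relative reduction is not stated in the abstract; it is derived here from the two reported rates as an illustration, and the confidence interval is not propagated.

```python
# Mortality rates quoted in the RESULTS paragraph of the abstract (percent).
soc_mortality = 7.94   # standard of care
rl_mortality = 5.37    # estimated under the RL policy

# Absolute difference, in percentage points.
absolute_reduction = soc_mortality - rl_mortality

# Relative difference, as a fraction of the standard-of-care rate.
# This derived figure is an assumption-free ratio, but it is not reported
# in the abstract itself.
relative_reduction = absolute_reduction / soc_mortality

print(f"absolute: {absolute_reduction:.2f} percentage points")  # absolute: 2.57 percentage points
print(f"relative: {relative_reduction:.1%}")                    # relative: 32.4%
```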
In the overall cross-validation cohort, which was independent of the training data, mortality was lowest among patients whose actual flow rate, as set by intensivists, matched the RL recommendation.
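The METHODS paragraph names a Markov decision process and DDPG but gives no implementation detail. The following is a minimal sketch of the core DDPG pieces (a deterministic actor, a critic, the bootstrapped target, and soft target updates) using linear function approximators in NumPy. The state dimension, the flow-rate bound, the reward convention, and all hyperparameters are hypothetical choices made for illustration and are not taken from the paper. The replay buffer and exploration noise used in a full DDPG agent are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# All constants below are hypothetical, chosen only for illustration.
STATE_DIM = 4     # e.g. scaled vitals and demographics (assumed features)
MAX_FLOW = 60.0   # assumed upper bound on oxygen flow rate, L/min
GAMMA = 0.99      # discount factor
TAU = 0.01        # soft-update rate for the target parameters


def actor(theta, state):
    """Deterministic policy: maps a state to a flow rate in [0, MAX_FLOW]."""
    return MAX_FLOW * 0.5 * (np.tanh(state @ theta) + 1.0)


def critic(w, state, action):
    """Linear Q(s, a) on the state concatenated with the scaled action."""
    features = np.append(state, action / MAX_FLOW)
    return float(features @ w)


def td_target(reward, next_state, done, theta_target, w_target):
    """DDPG target r + gamma * Q'(s', mu'(s')); no bootstrap at episode end."""
    if done:
        return reward
    next_action = actor(theta_target, next_state)
    return reward + GAMMA * critic(w_target, next_state, next_action)


def soft_update(target, online, tau=TAU):
    """Polyak averaging of target parameters toward the online parameters."""
    return (1.0 - tau) * target + tau * online


# Online and target parameters start out identical.
theta = rng.normal(scale=0.1, size=STATE_DIM)
w = rng.normal(scale=0.1, size=STATE_DIM + 1)
theta_target, w_target = theta.copy(), w.copy()

# One synthetic transition (s, a, r, s'); a real agent would sample from EHR data.
state = rng.normal(size=STATE_DIM)
next_state = rng.normal(size=STATE_DIM)
reward = 0.0  # assumed convention: 0 while alive, -1 on death
flow = actor(theta, state)

# Critic step: semi-gradient move of Q(s, a) toward the target y.
y = td_target(reward, next_state, False, theta_target, w_target)
features = np.append(state, flow / MAX_FLOW)
w = w + 0.05 * (y - critic(w, state, flow)) * features

# Actor step: deterministic policy gradient, dQ/da * da/dtheta.
dq_da = w[-1] / MAX_FLOW
da_dtheta = MAX_FLOW * 0.5 * (1.0 - np.tanh(state @ theta) ** 2) * state
theta = theta + 0.01 * dq_da * da_dtheta

# Target parameters slowly track the online parameters.
theta_target = soft_update(theta_target, theta)
w_target = soft_update(w_target, w)

print(f"recommended flow for this state: {flow:.2f} L/min")
```

Squashing the actor output with `tanh` keeps every recommendation inside the assumed flow range, which is one simple way to respect a physical bound on the action.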