Reliable Machine Learning for Dynamic Healthcare under Distribution Shift, Missingness, and Decision Timing

Authors

  • Syed Asif Ali, Department of Artificial Intelligence & Mathematical Sciences, Sindh Madressatul Islam University, Karachi
  • Chaesar Dewan Winata, Program of Medical Laboratory Science, Universitas Muhammadiyah Semarang

DOI:

https://doi.org/10.70062/jmih.v1i2.167

Keywords:

Clinical Decision Support, Domain Adaptation, Electronic Health Records, Missing Data, Positive–Unlabeled Learning, Reinforcement Learning, Robustness, Temporal Distribution Shift

Abstract

Machine learning (ML) models are increasingly used in healthcare for risk prediction and decision support, but their performance often declines after deployment because patient populations, clinical practices, and data completeness change over time. This study addresses three key challenges in reliable clinical ML: (1) temporal distribution shift, which reduces generalizability; (2) underreporting and missing data, which bias outcomes; and (3) sequential decision-making under cost and uncertainty. We propose an integrated framework comprising a temporal evaluation protocol that measures degradation over time, a domain adaptation method under missingness shift (DAMS) that improves robustness as feature missingness changes, and a timing-aware reinforcement learning (RL) approach that considers when to intervene. Evaluated on seven large datasets, including SEER, MIMIC-IV, and CDC COVID-19, our methods improve calibration, robustness, and efficiency. For example, positive–unlabeled (PU) learning increased COVID-19 outcome prediction accuracy by 6–9%, DAMS reduced the drop in AUROC by almost 40%, and timing-aware RL achieved higher rewards with lower observation costs. These results show that static evaluations underestimate deployment risk and that temporally aware, missingness-adaptive, and timing-sensitive methods enhance clinical decision-making. This is the first study to unify PU learning, DAMS, and timing-aware RL across real-world datasets, establishing a foundation for robust ML in healthcare.
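The idea behind a temporal evaluation protocol can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes only that each prediction carries a time period, a true label, and a model score, and it reports AUROC per period together with the drop relative to a reference (development) period. The function names and the toy records are hypothetical.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U); tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_period(records):
    """records: iterable of (period, label, score). Returns {period: AUROC}."""
    groups = defaultdict(lambda: ([], []))
    for period, label, score in records:
        groups[period][0].append(label)
        groups[period][1].append(score)
    return {p: auroc(y, s) for p, (y, s) in sorted(groups.items())}


def degradation(per_period, reference_period):
    """AUROC drop of every period relative to the reference period."""
    base = per_period[reference_period]
    return {p: base - v for p, v in per_period.items()}


# Hypothetical toy data (assumption): (year, true label, model score).
records = [
    (2019, 1, 0.9), (2019, 0, 0.2), (2019, 1, 0.8), (2019, 0, 0.1),
    (2021, 1, 0.6), (2021, 0, 0.7), (2021, 1, 0.8), (2021, 0, 0.3),
]
per_year = auroc_by_period(records)
drop = degradation(per_year, reference_period=2019)
print(per_year)  # AUROC for each year
print(drop)      # drop relative to 2019
```

A single pooled AUROC over all years would hide the difference between the two periods, which is the sense in which a static evaluation can understate deployment risk.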

Published

2025-09-19