Machine learning in health informatics

Mavroudeas, Georgios
Other Contributors
Bennett, Kristin P.
Kuruzovich, Jason N.
Gittens, Alex
Magdon-Ismail, Malik
Issue Date
Computer science
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute (RPI), Troy, NY. Copyright of original work retained by author.
Full Citation
This dissertation concerns applications of machine learning in health informatics. With personalized care as the main driver, we contribute to the following topics: (i) interpretable AI with applications in health, (ii) scalable healthcare by mimicking the experts, and (iii) evaluation of health interventions. For interpretable analysis of observational data, we contribute two models, the Supervised Gaussian Mixture of Experts (SGMM) and the Supervised Bernoulli Mixture of Experts (SBMM). By applying them to the Statewide Planning and Research Cooperative System (SPARCS) data, we demonstrate their ability to outperform most black-box machine learning models while providing interpretable solutions as well as highly predictable subpopulations, which could be the basis for actionable policy. For scaling up the reach of healthcare experts, we consider complex care management (CCM). Complex care management aims to help patients manage medical conditions effectively and to reduce hospitalizations. Doctors and health plan providers are responsible for assigning eligible patients to a CCM program. Some subjects eligible for the program are not enrolled because of capacity constraints and physician overload. We provide a decision support framework to assist doctors and health plan providers in this referral process. We view the medical condition of a patient as a sequence of "healthy" (H) and "sick" (S) events, where the sick condition indicates that the patient needs to be in CCM. Label bias, incorrect labels, and unbalanced classes are some of the challenges in using supervised learning to predict the H/S states. We solve this problem using a modification of hidden Markov models, the HMM-BOOST model. The framework is general and can be applied to a multitude of problems of the same nature. We test the model with proprietary data from a local Health Maintenance Organization (HMO), electroencephalogram (EEG) data from Boston Children's Hospital, and in simulation using synthetic data.
For the evaluation of health interventions, our primary goal is to develop a theory for estimating effects in non-targeted trials, a common setting for health programs that aim to improve participants' lifestyles by changing their everyday habits. We apply our methodology to evaluate a health program using proprietary data from a local HMO. In a non-targeted trial the inclusion criteria are loose. As a result, the intervention is applied widely, producing a heterogeneous treated population. When most treated patients are healthy, this can obscure the true effects of an intervention. We develop an asymptotically consistent nonparametric method (PCM) that provably recovers heterogeneous population effects in such a non-targeted trial setting.
August 2022
School of Science
Dept. of Computer Science
Rensselaer Polytechnic Institute, Troy, NY
Rensselaer Theses and Dissertations Online Collection
Restricted to current Rensselaer faculty, staff and students in accordance with the Rensselaer Standard license. Access inquiries may be directed to the Rensselaer Libraries.