Mixed effect machine learning: A framework for predicting longitudinal change in hemoglobin A1c
[Display omitted] •Integrate random effects into standard machine learning algorithms.•Framework for longitudinal supervised learning with common machine learning models.•Developed interpretable tree based mixed-effect machine learning models.•Method prospectively identifies patients at risk for fut...
Saved in:
Published in: | Journal of biomedical informatics Vol. 89; pp. 56 - 67 |
---|---|
Main Authors: | , , , , |
Format: | Journal Article |
Language: | English |
Published: |
United States
Elsevier Inc
01-01-2019
|
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | [Display omitted]
•Integrate random effects into standard machine learning algorithms.•Framework for longitudinal supervised learning with common machine learning models.•Developed interpretable tree based mixed-effect machine learning models.•Method prospectively identifies patients at risk for future glycemic deterioration.
Accurate and reliable prediction of clinical progression over time has the potential to improve the outcomes of chronic disease. The classical approach to analyzing longitudinal data is to use (generalized) linear mixed-effect models (GLMM). However, linear parametric models are predicated on assumptions, which are often difficult to verify. In contrast, data-driven machine learning methods can be applied to derive insight from the raw data without a priori assumptions. However, the underlying theory of most machine learning algorithms assume that the data is independent and identically distributed, making them inefficient for longitudinal supervised learning. In this study, we formulate an analytic framework, which integrates the random-effects structure of GLMM into non-linear machine learning models capable of exploiting temporal heterogeneous effects, sparse and varying-length patient characteristics inherent in longitudinal data. We applied the derived mixed-effect machine learning (MEml) framework to predict longitudinal change in glycemic control measured by hemoglobin A1c (HbA1c) among well controlled adults with type 2 diabetes. Results show that MEml is competitive with traditional GLMM, but substantially outperformed standard machine learning models that do not account for random-effects. Specifically, the accuracy of MEml in predicting glycemic change at the 1st, 2nd, 3rd, and 4th clinical visits in advanced was 1.04, 1.08, 1.11, and 1.14 times that of the gradient boosted model respectively, with similar results for the other methods. To further demonstrate the general applicability of MEml, a series of experiments were performed using real publicly available and synthetic data sets for accuracy and robustness. These experiments reinforced the superiority of MEml over the other methods. Overall, results from this study highlight the importance of modeling random-effects in machine learning approaches based on longitudinal data. Our MEml method is highly resistant to correlated data, readily accounts for random-effects, and predicts change of a longitudinal clinical outcome in real-world clinical settings with high accuracy. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 1532-0464 1532-0480 |
DOI: | 10.1016/j.jbi.2018.09.001 |