Federated Learning of Electronic Health Records to Improve Mortality Prediction in Hospitalized Patients With COVID-19: Machine Learning Approach

Bibliographic Details
Published in: JMIR Medical Informatics, Vol. 9, no. 1, p. e24207
Main Authors: Vaid, Akhil; Jaladanki, Suraj K; Xu, Jie; Teng, Shelly; Kumar, Arvind; Lee, Samuel; Somani, Sulaiman; Paranjpe, Ishan; De Freitas, Jessica K; Wanyan, Tingyi; Johnson, Kipp W; Bicak, Mesude; Klang, Eyal; Kwon, Young Joon; Costa, Anthony; Zhao, Shan; Miotto, Riccardo; Charney, Alexander W; Böttinger, Erwin; Fayad, Zahi A; Nadkarni, Girish N; Wang, Fei; Glicksberg, Benjamin S
Format: Journal Article
Language: English
Published: Canada, JMIR Publications, 27-01-2021
Description
Summary: Machine learning models require large datasets that may be siloed across different health care institutions. Machine learning studies that focus on COVID-19 have been limited to single-hospital data, which limits model generalizability. We aimed to use federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days. Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System. Logistic regression with L1 regularization/least absolute shrinkage and selection operator (LASSO) and multilayer perceptron (MLP) models were trained by using local data at each site. We developed a pooled model with combined data from all 5 sites, and a federated model that only shared parameters with a central aggregator. The federated LASSO model outperformed the local LASSO model at 3 hospitals, and the federated MLP model performed better than the local MLP model at all 5 hospitals, as determined by the area under the receiver operating characteristic curve. The pooled LASSO model outperformed the federated LASSO model at all hospitals, and the federated MLP model outperformed the pooled MLP model at 2 hospitals. The federated learning of COVID-19 electronic health record data shows promise in developing robust predictive models without compromising patient privacy.
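The parameter-sharing idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a sample-size-weighted averaging rule at the aggregator (the record does not state the aggregation rule), uses plain logistic regression with an optional L1 subgradient penalty as a stand-in for LASSO, and runs on synthetic data. All function names and hyperparameters here are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def local_update(weights, X, y, lr=0.1, epochs=5, l1=0.0):
    """Run a few gradient steps on one site's data and return new weights.

    Only the weight vector leaves the site; X and y stay local.
    The l1 term is a simple subgradient penalty, not an exact LASSO solver.
    """
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * (grad + l1 * np.sign(w))
    return w


def federated_average(site_weights, site_sizes):
    """Central aggregator: average site parameters, weighted by sample count.

    Assumption: weighting by site size; other aggregation rules are possible.
    """
    sizes = np.asarray(site_sizes, dtype=float)
    return np.average(np.stack(site_weights), axis=0, weights=sizes)


def train_federated(sites, n_features, rounds=20):
    """Alternate local training and central averaging for a fixed number of rounds."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in sites]
        global_w = federated_average(updates, [len(y) for _, y in sites])
    return global_w


def make_synthetic_sites(n_sites=5, n_features=4, seed=0):
    """Generate toy (X, y) pairs standing in for per-hospital datasets."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_features)
    sites = []
    for _ in range(n_sites):
        n = int(rng.integers(50, 200))
        X = rng.normal(size=(n, n_features))
        y = (rng.random(n) < sigmoid(X @ true_w)).astype(float)
        sites.append((X, y))
    return sites


if __name__ == "__main__":
    sites = make_synthetic_sites()
    print(train_federated(sites, n_features=4))
```

In this sketch a pooled model would correspond to concatenating every site's `X` and `y` and calling `local_update` once on the combined arrays, whereas a local model would call it on a single site's data only.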
ISSN: 2291-9694
DOI: 10.2196/24207