Comparing machine learning algorithms for multimorbidity prediction: An example from the Elsa-Brasil study

Background Multimorbidity is a worldwide concern related to greater disability, worse quality of life, and mortality. The early prediction is crucial for preventive strategies design and integrative medical practice. However, knowledge about how to predict multimorbidity is limited, possibly due to...

Full description

Saved in:
Bibliographic Details
Published in:PloS one Vol. 17; no. 10; p. e0275619
Main Authors: Polessa Paula, Daniela, Barbosa Aguiar, Odaleia, Pruner Marques, Larissa, Bensenor, Isabela, Suemoto, Claudia Kimie, Mendes da Fonseca, Maria de Jesus, Griep, Rosane Härter
Format: Journal Article
Language:English
Published: San Francisco Public Library of Science 07-10-2022
Public Library of Science (PLoS)
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Background Multimorbidity is a worldwide concern related to greater disability, worse quality of life, and mortality. The early prediction is crucial for preventive strategies design and integrative medical practice. However, knowledge about how to predict multimorbidity is limited, possibly due to the complexity involved in predicting multiple chronic diseases. Methods In this study, we present the use of a machine learning approach to build cost-effective multimorbidity prediction models. Based on predictors easily obtainable in clinical practice (sociodemographic, clinical, family disease history and lifestyle), we build and compared the performance of seven multilabel classifiers (multivariate random forest, and classifier chain, binary relevance and binary dependence, with random forest and support vector machine as base classifiers), using a sample of 15105 participants from the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil). We developed a web application for the building and use of prediction models. Results Classifier chain with random forest as base classifier performed better (accuracy = 0.34, subset accuracy = 0.15, and Hamming Loss = 0.16). For different feature sets, random forest based classifiers outperformed those based on support vector machine. BMI, blood pressure, sex, and age were the features most relevant to multimorbidity prediction. Conclusions Our results support the choice of random forest based classifiers for multimorbidity prediction.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
Competing Interests: The authors have declared that no competing interests exist.
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0275619