Comparative performances of machine learning algorithms in radiomics and impacting factors

Bibliographic Details
Published in: Scientific Reports, Vol. 13, no. 1, p. 14069
Main Authors: Decoux, Antoine, Duron, Loïc, Habert, Paul, Roblot, Victoire, Arsovic, Emina, Chassagnon, Guillaume, Arnoux, Armelle, Fournier, Laure
Format: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 28-08-2023
Description
Summary: There are currently no recommendations on which machine learning (ML) algorithms should be used in radiomics. The objective was to compare the performances of ML algorithms in radiomics when applied to different clinical questions, to determine whether some strategies give the best and most stable performances regardless of the dataset. This study compared the performances of nine feature selection algorithms combined with fourteen binary classification algorithms on ten datasets. These datasets included radiomics features and clinical diagnoses for binary clinical classifications, including COVID-19 pneumonia or sarcopenia on CT, and head and neck, orbital, or uterine lesions on MRI. For each dataset, a train-test split was created. Each of the 126 (9 × 14) combinations of feature selection and classification algorithms was trained and tuned using ten-fold cross-validation, and the area under the ROC curve (AUC) was then computed on the test set. This procedure was repeated three times per dataset. The best overall performances were obtained with JMI and JMIM as feature selection algorithms, and random forest and linear regression models as classification algorithms. The choice of classification algorithm was the factor explaining most of the performance variation (10% of total variance). The choice of feature selection algorithm explained only 2% of the variation, while the train-test split explained 9%.
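
The evaluation protocol described above maps naturally onto a scikit-learn pipeline. The sketch below shows one cell of the 9 × 14 grid under stated assumptions: synthetic data replaces the radiomics features, the univariate mutual_info_classif filter stands in for JMI/JMIM (scikit-learn does not implement them; a dedicated package would be needed), and the hyperparameter grid is illustrative rather than the one used in the study.

```python
# Minimal sketch of one (feature selection, classifier) combination,
# tuned with ten-fold cross-validation and scored by AUC on a held-out
# test set, as in the study's protocol. Data and grid are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for one radiomics dataset with a binary diagnosis label.
X, y = make_classification(n_samples=200, n_features=100, n_informative=10,
                           random_state=0)

# One train-test split (the study repeated this three times per dataset).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# One combination out of the 9 x 14 grid: a mutual-information filter
# (proxy for JMI/JMIM) followed by a random forest classifier.
pipe = Pipeline([
    ("select", SelectKBest(mutual_info_classif)),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Tune the combination with ten-fold cross-validation on the training set.
grid = GridSearchCV(
    pipe,
    param_grid={"select__k": [10, 20], "clf__n_estimators": [100, 300]},
    cv=10, scoring="roc_auc",
)
grid.fit(X_train, y_train)

# Score the tuned model on the held-out test set with the AUC.
auc = roc_auc_score(y_test, grid.predict_proba(X_test)[:, 1])
print(f"test AUC = {auc:.3f}")
```

Running all 126 combinations amounts to looping this pipeline over nine selectors and fourteen classifiers, repeating the split three times per dataset as the summary describes.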
ISSN: 2045-2322
DOI: 10.1038/s41598-023-39738-7