Assessing Versatile Machine Learning Models for Glioma Radiogenomic Studies across Hospitals

Bibliographic Details
Published in:	Cancers Vol. 13; no. 14; p. 3611
Main Authors:	Kawaguchi, Risa K.; Takahashi, Masamichi; Miyake, Mototaka; Kinoshita, Manabu; Takahashi, Satoshi; Ichimura, Koichi; Hamamoto, Ryuji; Narita, Yoshitaka; Sese, Jun
Format:	Journal Article
Language:	English
Published:	Basel: MDPI AG, 19-07-2021
Subjects:	Accuracy; Annotations; Brain cancer; Classification; Datasets; Deep learning; Edema; Gene expression; Glioblastoma; Glioma; Hospitals; Learning algorithms; Machine learning; Magnetic resonance imaging; Medical prognosis; Mutation; Patients; Predictions; Standardization; Tumors

Description
Summary:	Radiogenomics uses non-invasively obtained imaging data, such as magnetic resonance imaging (MRI), to predict critical biomarkers of patients. Developing an accurate machine learning (ML) technique for MRI requires data from hundreds of patients, which cannot be gathered from any single local hospital. Hence, a model universally applicable to multiple cohorts/hospitals is required. We applied various ML and image pre-processing procedures to a glioma dataset from The Cancer Imaging Archive (TCIA, n = 159). The models that predicted glioblastoma or WHO Grade II and III glioma with high accuracy on the TCIA dataset were then tested on data from the National Cancer Center Hospital, Japan (NCC, n = 166) to determine whether they maintained a similar level of accuracy. Results: Our ML procedure achieved a level of accuracy (AUROC = 0.904) comparable to that previously reported for deep-learning methods on TCIA. However, when we applied the model directly to the NCC dataset, its AUROC dropped to 0.383. Introducing standardization and dimension reduction procedures before classification, without re-training, improved the prediction accuracy on NCC (0.804) without a loss in prediction accuracy on the TCIA dataset. Furthermore, we confirmed the same tendency in a model for IDH1/2 mutation prediction: with standardization and dimension reduction, it was also applicable to multiple hospitals. Our results demonstrate that overfitting may occur when the ML method that gives the highest accuracy on a small training dataset is applied to different, heterogeneous datasets, and they suggest a promising process for developing an ML method applicable to multiple cohorts.
Bibliography:	Co-first authors owing to their equal participation in this study.
ISSN:	2072-6694
DOI:	10.3390/cancers13143611
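
The summary's central observation, that a model trained at one site can lose accuracy at another and that standardization before classification can restore it, can be illustrated with a minimal sketch on synthetic data. Everything below is an assumption made for illustration and is not the authors' pipeline: the two-feature Gaussian cohorts, the 100x unit change in the second feature at the external site, the centroid-difference linear scorer, the per-cohort z-scoring, and all function names (`make_cohort`, `standardize`, `fit_centroid_weights`, `scores`, `auroc`) are hypothetical. Dimension reduction is omitted, and the model is fitted on the training cohort only.

```python
import random
import statistics


def make_cohort(rng, n_per_class, scale=(1.0, 1.0), offset=(0.0, 0.0)):
    """Synthetic two-feature cohort; scale/offset mimic site-specific intensity units."""
    rows, labels = [], []
    for label, sign in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            f1 = rng.gauss(sign * 1.0, 1.0)  # strongly informative feature
            f2 = rng.gauss(sign * 0.3, 1.0)  # weakly informative feature
            rows.append([f1 * scale[0] + offset[0], f2 * scale[1] + offset[1]])
            labels.append(label)
    return rows, labels


def standardize(rows):
    """Z-score each feature using the cohort's own mean and standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def fit_centroid_weights(rows, labels):
    """Linear scorer: weight vector = positive-class centroid minus negative-class centroid."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    return [statistics.fmean(p) - statistics.fmean(q)
            for p, q in zip(zip(*pos), zip(*neg))]


def scores(rows, weights):
    """Weighted sum of features for every sample."""
    return [sum(w * x for w, x in zip(weights, r)) for r in rows]


def auroc(sample_scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(sample_scores, labels) if y == 1]
    neg = [s for s, y in zip(sample_scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
train_x, train_y = make_cohort(rng, 200)
# External site: the second feature is recorded in different units (assumed 100x, +50).
ext_x, ext_y = make_cohort(rng, 200, scale=(1.0, 100.0), offset=(0.0, 50.0))

# Model fitted on raw training data, applied directly to the raw external cohort.
w_raw = fit_centroid_weights(train_x, train_y)
auc_raw = auroc(scores(ext_x, w_raw), ext_y)

# Each cohort standardized separately; the model is still fitted on the training cohort only.
w_std = fit_centroid_weights(standardize(train_x), train_y)
auc_std = auroc(scores(standardize(ext_x), w_std), ext_y)

print(f"external AUROC, raw features:          {auc_raw:.3f}")
print(f"external AUROC, standardized features: {auc_std:.3f}")
```

In this toy setting the unit change lets the weak feature dominate the raw score, so the external AUROC falls; standardizing each cohort puts both features back on a common scale. The numbers it prints come from synthetic data and say nothing about the AUROC values reported in the summary.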