“KAIZEN” method realizing implementation of deep-learning models for COVID-19 CT diagnosis in real world hospitals

Bibliographic Details
Published in: Scientific Reports, Vol. 14, no. 1, p. 1672
Main Authors: Okada, Naoki; Umemura, Yutaka; Shi, Shoi; Inoue, Shusuke; Honda, Shun; Matsuzawa, Yohsuke; Hirano, Yuichiro; Kikuyama, Ayano; Yamakawa, Miho; Gyobu, Tomoko; Hosomi, Naohiro; Minami, Kensuke; Morita, Natsushiro; Watanabe, Atsushi; Yamasaki, Hiroyuki; Fukaguchi, Kiyomitsu; Maeyama, Hiroki; Ito, Kaori; Okamoto, Ken; Harano, Kouhei; Meguro, Naohito; Unita, Ryo; Koshiba, Shinichi; Endo, Takuro; Yamamoto, Tomonori; Yamashita, Tomoya; Shinba, Toshikazu; Fujimi, Satoshi
Format: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 19-01-2024
Nature Publishing Group
Nature Portfolio
Description
Summary: Numerous COVID-19 diagnostic imaging Artificial Intelligence (AI) studies exist. However, none of their models were of potential clinical use, primarily owing to methodological defects and a lack of implementation considerations for inference. In this study, all development processes of the deep-learning models are performed according to the strict criteria of the “KAIZEN checklist”, which is proposed on the basis of previous AI development guidelines to overcome the deficiencies mentioned above. We develop and evaluate two binary-classification deep-learning models to triage COVID-19: a slice model, which examines a single Computed Tomography (CT) slice to find COVID-19 lesions, and a series model, which examines a series of CT images to find an infected patient. We collected 2,400,200 CT slices from twelve emergency centers in Japan. Area Under Curve (AUC) and accuracy were calculated to assess classification performance. The inference time of the system that includes these two models was measured. For validation data, the slice and series models recognized COVID-19 with AUCs of 0.989 and 0.982 and accuracies of 95.9% and 93.0%, respectively. For test data, the models’ AUCs were 0.958 and 0.953 and their accuracies were 90.0% and 91.4%, respectively. The average inference time per case was 2.83 s. Our deep-learning system achieves accuracy and inference speed high enough for practical use. The system has already been implemented in four hospitals, and implementation is in progress at eight more. We released the application software and implementation code for free, in a highly usable state, to allow their use in Japan and globally.
ISSN: 2045-2322
DOI: 10.1038/s41598-024-52135-y