Effect of a comprehensive deep-learning model on the accuracy of chest x-ray interpretation by radiologists: a retrospective, multireader multicase study
Published in: The Lancet Digital Health, Vol. 3, No. 8, pp. e496–e506
Main Authors:
Format: Journal Article
Language: English
Published: England: Elsevier Ltd; Elsevier B.V., 01-08-2021
Summary: Chest x-rays are widely used in clinical practice; however, interpretation can be hindered by human error and a lack of experienced thoracic radiologists. Deep learning has the potential to improve the accuracy of chest x-ray interpretation. We therefore aimed to assess the accuracy of radiologists with and without the assistance of a deep-learning model.
In this retrospective study, a deep-learning model was trained on 821 681 images (284 649 patients) from five datasets from Australia, Europe, and the USA. 2568 enriched chest x-ray cases from adult patients (≥16 years) who had at least one frontal chest x-ray were included in the test dataset; cases were representative of inpatient, outpatient, and emergency settings. 20 radiologists reviewed the cases with and without assistance from the deep-learning model, separated by a 3-month washout period. We assessed the change in accuracy of chest x-ray interpretation across 127 clinical findings when the deep-learning model was used as a decision-support tool, by calculating the area under the receiver operating characteristic curve (AUC) for each radiologist with and without the deep-learning model. We also compared AUCs for the model alone with those of unassisted radiologists. If the lower bound of the adjusted 95% CI of the difference in AUC between the model and the unassisted radiologists was more than −0·05, the model was considered non-inferior for that finding. If the lower bound exceeded 0, the model was considered superior.
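To make the evaluation criteria concrete, the following is a minimal, hypothetical Python sketch (not the study's code) of how a per-finding AUC difference and its confidence interval could be computed and then classified against the −0·05 non-inferiority margin described above. Bootstrap CIs stand in for the adjusted 95% CIs used in the study, and all names and data are illustrative.

```python
# Minimal sketch: per-finding AUC comparison between the model and unassisted
# radiologists, with bootstrap CIs as a stand-in for the study's adjusted CIs.
# All data below are synthetic and for illustration only.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def bootstrap_auc_difference(y_true, model_scores, reader_scores, n_boot=2000):
    """Bootstrap the difference in AUC (model minus readers) over cases."""
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        # Resampled labels must contain both classes for AUC to be defined.
        if len(np.unique(y_true[idx])) < 2:
            continue
        diffs.append(roc_auc_score(y_true[idx], model_scores[idx])
                     - roc_auc_score(y_true[idx], reader_scores[idx]))
    return np.percentile(diffs, [2.5, 97.5])

def classify_finding(ci_lower, margin=-0.05):
    """Apply the abstract's decision rule to the CI lower bound."""
    if ci_lower > 0:
        return "superior"
    if ci_lower > margin:
        return "non-inferior"
    return "inconclusive"

# Hypothetical data for one clinical finding (e.g. pneumothorax).
y_true = rng.integers(0, 2, 500)                                  # ground truth
model_scores = np.clip(y_true * 0.6 + rng.normal(0.3, 0.25, 500), 0, 1)
reader_scores = np.clip(y_true * 0.4 + rng.normal(0.35, 0.3, 500), 0, 1)

lo, hi = bootstrap_auc_difference(y_true, model_scores, reader_scores)
print(f"95% CI for AUC difference: ({lo:.3f}, {hi:.3f}) -> {classify_finding(lo)}")
```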
Unassisted radiologists had a macroaveraged AUC of 0·713 (95% CI 0·645–0·785) across the 127 clinical findings, compared with 0·808 (0·763–0·839) when assisted by the model. The deep-learning model statistically significantly improved the classification accuracy of radiologists for 102 (80%) of the 127 clinical findings, was statistically non-inferior for 19 (15%) findings, and did not decrease radiologists' accuracy for any finding. Compared with the unassisted radiologists' macroaveraged AUC of 0·713 (0·645–0·785), the model alone achieved 0·957 (0·954–0·959) across all findings. Model classification alone was significantly more accurate than unassisted radiologists for 117 (94%) of the 124 clinical findings predicted by the model and was non-inferior to unassisted radiologists for all other findings.
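The macroaveraged AUC reported above is the unweighted mean of the per-finding AUCs, so rare findings carry the same weight as common ones. A small illustrative sketch with made-up values:

```python
# Macro-averaging illustration: the unweighted mean of per-finding AUCs.
# The finding names and AUC values below are hypothetical.
import numpy as np

per_finding_auc = {"pneumothorax": 0.93, "cardiomegaly": 0.88, "rib fracture": 0.71}
macro_auc = np.mean(list(per_finding_auc.values()))
print(f"Macroaveraged AUC over {len(per_finding_auc)} findings: {macro_auc:.3f}")
```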
This study shows the potential of a comprehensive deep-learning model to improve chest x-ray interpretation across a large breadth of clinical practice.
Funding: Annalise.ai.
ISSN: 2589-7500; 0140-6736
DOI: 10.1016/S2589-7500(21)00106-0