Deep learning detection of diabetic retinopathy in Scotland's diabetic eye screening programme

Bibliographic Details
Published in:	British journal of ophthalmology Vol. 108; no. 7; p. 984
Main Authors:	Fleming, Alan D, Mellor, Joseph, McGurnaghan, Stuart J, Blackbourn, Luke A K, Goatman, Keith A, Styles, Caroline, Storkey, Amos J, McKeigue, Paul M, Colhoun, Helen M
Format:	Journal Article
Language:	English
Published:	England 01-07-2024
Subjects:	Aged; Algorithms; Deep Learning; Diabetic Retinopathy - diagnosis; Diabetic Retinopathy - epidemiology; Female; Humans; Male; Mass Screening - methods; Middle Aged; Scotland - epidemiology; Sensitivity and Specificity; Scotland; Retina; Telemedicine; Imaging; Public health

Description
Summary:	Support vector machine-based automated grading (known as iGradingM) has been shown to be safe, cost-effective and robust in the diabetic retinopathy (DR) screening (DES) programme in Scotland. It triages screening episodes as gradable with no DR versus manual grading required. The study aim was to develop a deep learning-based autograder using images and gradings from DES and to compare its performance with that of iGradingM. Retinal images, quality assurance (QA) data and routine DR grades were obtained from national datasets in 179 944 patients for years 2006-2016. QA grades were available for 744 images. We developed a deep learning-based algorithm to detect whether either eye contained ungradable images or any DR. The sensitivity and specificity were evaluated against consensus QA grades and routine grades. Images used in QA which were ungradable or with DR were detected by deep learning with better specificity compared with manual graders (p<0.001) and with iGradingM (p<0.001) at the same sensitivities. Any DR according to the DES final grade was detected with 89.19% (270 392/303 154) sensitivity and 77.41% (500 945/647 158) specificity. Observable disease and referable disease were detected with sensitivities of 96.58% (16 613/17 201) and 98.48% (22 600/22 948), respectively. Overall, 43.84% of screening episodes would require manual grading. A deep learning-based system for DR grading was evaluated in QA data and images from 11 years in 50% of people attending a national DR screening programme. The system could reduce the manual grading workload at the same sensitivity compared with the current automated grading system.
ISSN:	1468-2079
DOI:	10.1136/bjo-2023-323395
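
The percentages quoted in the summary can be re-derived from the counts given alongside them. The following is a minimal sketch, not the authors' code: it assumes each fraction reads as correctly classified episodes over all episodes in that class (true positives over all positives for sensitivity, true negatives over all negatives for specificity). The names `percentage` and `REPORTED_COUNTS` are illustrative and do not come from the article.

```python
def percentage(correct: int, total: int) -> float:
    """Return correct/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * correct / total, 2)


# Counts copied from the summary; the dictionary keys are illustrative labels.
REPORTED_COUNTS = {
    "any DR sensitivity": (270_392, 303_154),
    "any DR specificity": (500_945, 647_158),
    "observable disease sensitivity": (16_613, 17_201),
    "referable disease sensitivity": (22_600, 22_948),
}

if __name__ == "__main__":
    for label, (correct, total) in REPORTED_COUNTS.items():
        print(f"{label}: {percentage(correct, total):.2f}%")
```

Under that reading, the recomputed values match the 89.19%, 77.41%, 96.58% and 98.48% stated in the summary.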