Abstract
Background
In recent years, artificial intelligence and machine learning algorithms have been
used more extensively to diagnose diabetic retinopathy and other diseases. Still,
the effectiveness of these methods has not been thoroughly investigated. This study
aimed to evaluate the performance and limitations of machine learning and deep learning
algorithms in detecting diabetic retinopathy.
Methods
This study was conducted in accordance with the PRISMA checklist. We searched online
databases, including PubMed, Scopus, and Google Scholar, for relevant articles published
up to September 30, 2023. After title, abstract, and full-text screening, data extraction
and quality assessment were carried out for the included studies. Finally, a meta-analysis
was performed.
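The abstract does not state which pooling model was used. As a purely illustrative sketch, the snippet below shows one common way to pool a proportion such as sensitivity across studies: fixed-effect inverse-variance weighting on the logit scale. The function name and the per-study counts are hypothetical and are not taken from the article.

```python
import math

def pool_logit_proportions(studies):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale.

    studies: list of (events, total) pairs, e.g. (true positives, diseased cases).
    Returns (pooled proportion, lower 95% bound, upper 95% bound).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for events, total in studies:
        # A 0.5 continuity correction keeps the logit finite when
        # events == 0 or events == total.
        e = events + 0.5
        n = total + 1.0
        logit = math.log(e / (n - e))
        variance = 1.0 / e + 1.0 / (n - e)
        weight = 1.0 / variance
        weighted_sum += weight * logit
        weight_total += weight

    pooled_logit = weighted_sum / weight_total
    standard_error = math.sqrt(1.0 / weight_total)

    def inverse_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return (
        inverse_logit(pooled_logit),
        inverse_logit(pooled_logit - 1.96 * standard_error),
        inverse_logit(pooled_logit + 1.96 * standard_error),
    )

# Hypothetical per-study counts (true positives, diseased cases); not from the article.
example_studies = [(90, 100), (180, 200), (45, 50)]
pooled, low, high = pool_logit_proportions(example_studies)
print(f"pooled sensitivity = {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```

A random-effects or bivariate model is often preferred for diagnostic accuracy data when studies are heterogeneous; the fixed-effect form is shown here only because it is the shortest to write down.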
Results
We included 76 studies with a total of 1,371,517 retinal images; 51 of these studies
were used for the meta-analysis. The meta-analysis showed a statistically significant
sensitivity of 90.54% (95% CI [90.42, 90.66],
P < 0.001) and specificity of 78.33% (95% CI [78.21, 78.45],
P < 0.001). The AUC (area under the curve) did not differ statistically across
studies, although its value was high at 0.94 (95% CI [−46.71, 48.60],
P = 1).
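For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity are derived from a single study's confusion matrix, with a Wilson score interval for each. The counts are hypothetical, chosen only to give values near those reported, and the function names are illustrative rather than taken from the article.

```python
import math

def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of diseased cases correctly flagged
    specificity = tn / (tn + fp)   # share of healthy cases correctly cleared
    return sensitivity, specificity

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / total
    denominator = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denominator
    half_width = (
        z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total))
        / denominator
    )
    return centre - half_width, centre + half_width

# Hypothetical counts for one study; not from the article.
tp, fn, tn, fp = 90, 10, 780, 220
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
print(f"sensitivity = {sens:.2%}, 95% CI {sens_ci[0]:.2%} to {sens_ci[1]:.2%}")
print(f"specificity = {spec:.2%}, 95% CI {spec_ci[0]:.2%} to {spec_ci[1]:.2%}")
```

The interval narrows as the number of cases grows, which is why estimates drawn from more than a million images can carry confidence intervals only a fraction of a percentage point wide.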
Conclusions
Although machine learning and deep learning algorithms can properly diagnose diabetic
retinopathy, their discriminating capacity is limited. However, they could simplify
the diagnostic process. Further studies are required to improve these algorithms.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12938-025-01336-1.
Globally, the number of people with diabetes mellitus has quadrupled in the past three decades, and diabetes mellitus is the ninth major cause of death. About 1 in 11 adults worldwide now have diabetes mellitus, 90% of whom have type 2 diabetes mellitus (T2DM). Asia is a major area of the rapidly emerging T2DM global epidemic, with China and India the top two epicentres. Although genetic predisposition partly determines individual susceptibility to T2DM, an unhealthy diet and a sedentary lifestyle are important drivers of the current global epidemic; early developmental factors (such as intrauterine exposures) also have a role in susceptibility to T2DM later in life. Many cases of T2DM could be prevented with lifestyle changes, including maintaining a healthy body weight, consuming a healthy diet, staying physically active, not smoking and drinking alcohol in moderation. Most patients with T2DM have at least one complication, and cardiovascular complications are the leading cause of morbidity and mortality in these patients. This Review provides an updated view of the global epidemiology of T2DM, as well as dietary, lifestyle and other risk factors for T2DM and its complications.
Deep learning is a family of computational methods that allow an algorithm to program itself by learning from a large set of examples that demonstrate the desired behavior, removing the need to specify rules explicitly. Application of these methods to medical imaging requires further assessment and validation.
Question
How does a deep learning system (DLS) using artificial intelligence compare with professional human graders in identifying diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes?
Findings
In the primary validation dataset (71 896 images; 14 880 patients), the DLS had a sensitivity of 90.5% and specificity of 91.6% for detecting referable diabetic retinopathy; 100% sensitivity and 91.1% specificity for vision-threatening diabetic retinopathy; 96.4% sensitivity and 87.2% specificity for possible glaucoma; and 93.2% sensitivity and 88.7% specificity for age-related macular degeneration, compared with professional graders.
Meaning
The DLS had high sensitivity and specificity for identifying diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes.
Importance
A deep learning system (DLS) is a machine learning technology with potential for screening diabetic retinopathy and related eye diseases.
Objective
To evaluate the performance of a DLS in detecting referable diabetic retinopathy, vision-threatening diabetic retinopathy, possible glaucoma, and age-related macular degeneration (AMD) in community and clinic-based multiethnic populations with diabetes.
Design, Setting, and Participants
Diagnostic performance of a DLS for diabetic retinopathy and related eye diseases was evaluated using 494 661 retinal images. A DLS was trained for detecting diabetic retinopathy (using 76 370 images), possible glaucoma (125 189 images), and AMD (72 610 images), and performance of DLS was evaluated for detecting diabetic retinopathy (using 112 648 images), possible glaucoma (71 896 images), and AMD (35 948 images). Training of the DLS was completed in May 2016, and validation of the DLS was completed in May 2017 for detection of referable diabetic retinopathy (moderate nonproliferative diabetic retinopathy or worse) and vision-threatening diabetic retinopathy (severe nonproliferative diabetic retinopathy or worse) using a primary validation data set in the Singapore National Diabetic Retinopathy Screening Program and 10 multiethnic cohorts with diabetes.
Exposures
Use of a deep learning system.
Main Outcomes and Measures
Area under the receiver operating characteristic curve (AUC) and sensitivity and specificity of the DLS with professional graders (retinal specialists, general ophthalmologists, trained graders, or optometrists) as the reference standard.
Results
In the primary validation dataset (n = 14 880 patients; 71 896 images; mean [SD] age, 60.2 [2.2] years; 54.6% men), the prevalence of referable diabetic retinopathy was 3.0%; vision-threatening diabetic retinopathy, 0.6%; possible glaucoma, 0.1%; and AMD, 2.5%. The AUC of the DLS for referable diabetic retinopathy was 0.936 (95% CI, 0.925-0.943), sensitivity was 90.5% (95% CI, 87.3%-93.0%), and specificity was 91.6% (95% CI, 91.0%-92.2%). For vision-threatening diabetic retinopathy, AUC was 0.958 (95% CI, 0.956-0.961), sensitivity was 100% (95% CI, 94.1%-100.0%), and specificity was 91.1% (95% CI, 90.7%-91.4%). For possible glaucoma, AUC was 0.942 (95% CI, 0.929-0.954), sensitivity was 96.4% (95% CI, 81.7%-99.9%), and specificity was 87.2% (95% CI, 86.8%-87.5%). For AMD, AUC was 0.931 (95% CI, 0.928-0.935), sensitivity was 93.2% (95% CI, 91.1%-99.8%), and specificity was 88.7% (95% CI, 88.3%-89.0%). For referable diabetic retinopathy in the 10 additional datasets, AUC range was 0.889 to 0.983 (n = 40 752 images).
Conclusions and Relevance
In this evaluation of retinal images from multiethnic cohorts of patients with diabetes, the DLS had high sensitivity and specificity for identifying diabetic retinopathy and related eye diseases. Further research is necessary to evaluate the applicability of the DLS in health care settings and the utility of the DLS to improve vision outcomes.
This diagnostic accuracy study compares the performance of deep learning systems vs eye professionals for detecting referable and vision-threatening diabetic retinopathy, glaucoma, and other eye diseases in retinal images from Chinese, Indian, and Malaysian patients.
[1] School of Medicine, Hormozgan University of Medical Sciences (https://ror.org/037wqsr57), Bandar Abbas, Iran
[2] Department and Faculty of Health Education and Health Promotion, Student Research Committee, Shahid Beheshti University of Medical Sciences (https://ror.org/034m2b326), Tehran, Iran
[3] Student Research Committee, School of Medicine, Shahid Beheshti University of Medical Science (https://ror.org/034m2b326), Tehran, Iran
[4] Student Research Committee, Zanjan University of Medical Sciences (https://ror.org/01xf7jb19), Zanjan, Iran
[5] Health Education and Promotion, Department of Community Medicine, School of Medicine, Dezful University of Medical Sciences (https://ror.org/033hgcp80), Dezful, Iran
[6] Student Research Committee, Shahid Beheshti University of Medical Science (https://ror.org/034m2b326), Arabi Ave, Daneshjoo Blvd, Velenjak, Tehran, 19839-63113 Iran
[7] Faculty of Medicine, Islamic Azad University Tabriz Branch (https://ror.org/0032wgp28), Tabriz, Iran
[8] Faculty of Medicine, Guilan University of Medical Sciences (https://ror.org/04ptbrd12), Rasht, Iran
[9] Student Research Committee, Isfahan University of Medical Sciences (https://ror.org/04waqzz56), Isfahan, Iran
[10] School of Medicine, Shahid Beheshti University of Medical Sciences (https://ror.org/034m2b326), Tehran, Iran
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License, which permits any non-commercial use, sharing, distribution
and reproduction in any medium or format, as long as you give appropriate credit to
the original author(s) and the source, provide a link to the Creative Commons licence,
and indicate if you modified the licensed material. You do not have permission under
this licence to share adapted material derived from this article or parts of it. The
images or other third party material in this article are included in the article’s
Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article’s Creative Commons licence and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will
need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit
http://creativecommons.org/licenses/by-nc-nd/4.0/.