Computer Vision Identification of Trachomatous Inflammation-Follicular Using Deep Learning

Ashlin S Joye; Marissa G Firlie; Dionna M Wittberg; Solomon Aragie; Scott D Nash; Zerihun Tadesse; Adane Dagnew; Dagnachew Hailu; Fisseha Admassu; Bilen Wondimteka; Habib Getachew; Endale Kabtu; Social Beyecha; Meskerem Shibiru; Banchalem Getnet; Tibebe Birhanu; Seid Abdu; Solomon Tekew; Thomas M Lietman; Jeremy D Keenan; Travis K Redd

doi:10.1097/ICO.0000000000003701

Cornea. 2024 Sep 20. doi: 10.1097/ICO.0000000000003701. Online ahead of print.

Authors

Ashlin S Joye¹,², Marissa G Firlie³, Dionna M Wittberg², Solomon Aragie⁴, Scott D Nash⁵, Zerihun Tadesse⁴, Adane Dagnew⁴, Dagnachew Hailu⁴, Fisseha Admassu⁶, Bilen Wondimteka⁶, Habib Getachew⁶, Endale Kabtu⁶, Social Beyecha⁶, Meskerem Shibiru⁶, Banchalem Getnet⁶, Tibebe Birhanu⁶, Seid Abdu⁶, Solomon Tekew⁶, Thomas M Lietman², Jeremy D Keenan², Travis K Redd¹,²

Affiliations

¹ Casey Eye Institute, Oregon Health and Science University, Portland, OR.
² Francis I Proctor Foundation, University of California San Francisco, San Francisco, CA.
³ George Washington University, School of Medicine and Health Sciences, Washington, DC.
⁴ The Carter Center Ethiopia, Addis Ababa, Ethiopia.
⁵ The Carter Center, Atlanta, GA.
⁶ Department of Ophthalmology, University of Gondar, Gondar, Ethiopia.

PMID: 39312712
DOI: 10.1097/ICO.0000000000003701

Abstract

Purpose: Trachoma surveys are used to estimate the prevalence of trachomatous inflammation-follicular (TF) to guide mass antibiotic distribution. These surveys currently rely on human graders, introducing a significant resource burden and potential for human error. This study describes the development and evaluation of machine learning models intended to reduce cost and improve reliability of these surveys.

Methods: A total of 56,725 everted eyelid photographs were obtained from 11,358 children aged 0 to 9 years in a single trachoma-endemic region of Ethiopia over a 3-year period. Expert graders reviewed all images from each examination to estimate the number of tarsal conjunctival follicles and the degree of trachomatous inflammation-intense. The median estimate of the 3 grader groups was used as the ground truth to train a MobileNetV3 large deep convolutional neural network to detect cases with TF.
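
The consensus step described in the Methods can be illustrated with a minimal Python sketch that takes the median of three follicle-count estimates and converts it to a binary TF label. The function name `consensus_label` is hypothetical, and the threshold of 5 follicles is an assumption taken from the WHO simplified grading definition of TF (5 or more follicles in the central upper tarsal conjunctiva); the abstract does not state the exact rule the authors applied.

```python
from statistics import median

# Assumption: WHO simplified grading threshold (5 or more follicles).
# The abstract does not confirm that this cutoff was used in the study.
TF_FOLLICLE_THRESHOLD = 5


def consensus_label(follicle_estimates):
    """Return (median_count, has_tf) for one examination.

    follicle_estimates: one estimated follicle count from each of
    the 3 grader groups (hypothetical input format).
    """
    if len(follicle_estimates) != 3:
        raise ValueError("expected one estimate from each of 3 grader groups")
    med = median(follicle_estimates)
    return med, med >= TF_FOLLICLE_THRESHOLD
```

Using the median rather than the mean keeps a single outlying estimate from shifting the label: estimates of 0, 4, and 9 follicles give a median of 4, which falls below the assumed threshold.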

Results: The classification model predicted a TF prevalence of 32%, which was not significantly different from the human consensus estimate (30%; 95% confidence interval of difference, -2 to +4%). The model had an area under the receiver operating characteristic curve of 0.943, F1 score of 0.923, 88% accuracy, 83% sensitivity, and 91% specificity. The area under the receiver operating characteristic curve increased to 0.995 when interpreting nonborderline cases of TF.
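
For readers less familiar with the metrics reported in the Results, the sketch below shows how sensitivity, specificity, accuracy, F1 score, and the predicted and reference prevalence follow from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage note are hypothetical and are not data from the study; the abstract does not report a confusion matrix or state how its F1 score was averaged.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives, with TF treated as the positive class.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0 or (tp + fp) == 0:
        raise ValueError("counts must include positives and negatives")
    sensitivity = tp / (tp + fn)          # recall on TF cases
    specificity = tn / (tn + fp)          # recall on non-TF cases
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / total
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        # Share of cases the model labels TF versus the reference share.
        "predicted_prevalence": (tp + fp) / total,
        "reference_prevalence": (tp + fn) / total,
    }
```

With illustrative counts such as `binary_metrics(tp=40, fp=5, tn=45, fn=10)`, the predicted prevalence (0.45) differs from the reference prevalence (0.50) because false positives and false negatives do not cancel exactly; the same comparison underlies the prevalence difference reported above.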

Conclusions: Deep convolutional neural network models performed well at classifying TF and detecting the number of follicles evident in conjunctival photographs. Implementation of similar models may enable accurate, efficient, large-scale trachoma screening. Further validation in diverse populations with varying TF prevalence is needed before implementation at scale.
