An automated approach for real-time informative frames classification in laryngeal endoscopy using deep learning

Chiara Baldini; Muhammad Adeel Azam; Claudio Sampieri; Alessandro Ioppi; Laura Ruiz-Sevilla; Isabel Vilaseca; Berta Alegre; Alessandro Tirrito; Alessia Pennacchi; Giorgio Peretti; Sara Moccia; Leonardo S Mattos

doi:10.1007/s00405-024-08676-z

Eur Arch Otorhinolaryngol. 2024 Aug;281(8):4255-4264. doi: 10.1007/s00405-024-08676-z. Epub 2024 May 2.

Authors

Chiara Baldini¹,², Muhammad Adeel Azam¹,², Claudio Sampieri³,⁴,⁵, Alessandro Ioppi⁶, Laura Ruiz-Sevilla⁷, Isabel Vilaseca⁸,⁹,¹⁰,¹¹, Berta Alegre⁸,⁹, Alessandro Tirrito¹²,¹³, Alessia Pennacchi¹²,¹³, Giorgio Peretti¹²,¹³, Sara Moccia¹⁴, Leonardo S Mattos¹

Affiliations

¹ Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy.
² Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, Genoa, Italy.
³ Department of Experimental Medicine (DIMES), University of Genoa, Genoa, Italy. claudio.sampieri@outlook.com.
⁴ Department of Otolaryngology, Hospital Clínic, C. de Villarroel, 170, 08029, Barcelona, Spain. claudio.sampieri@outlook.com.
⁵ Unit of Head and Neck Tumors, Hospital Clínic, Barcelona, Spain. claudio.sampieri@outlook.com.
⁶ Unit of Otolaryngology, Trento, Italy.
⁷ Otorhinolaryngology Head-Neck Surgery Department, Hospital Universitari Joan XXIII de Tarragona, Tarragona, Spain.
⁸ Department of Otolaryngology, Hospital Clínic, C. de Villarroel, 170, 08029, Barcelona, Spain.
⁹ Unit of Head and Neck Tumors, Hospital Clínic, Barcelona, Spain.
¹⁰ Translational Genomics and Target Therapies in Solid Tumors Group, Institut d'Investigacions Biomèdiques August Pi i Sunyer, IDIBAPS, Barcelona, Spain.
¹¹ Faculty of Medicine, University of Barcelona, Barcelona, Spain.
¹² Unit of Otorhinolaryngology-Head and Neck Surgery, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.
¹³ Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy.
¹⁴ The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy.

Abstract

Purpose: Informative image selection in laryngoscopy can improve automatic data extraction either on its own, by enabling selective data storage and a faster review process, or in combination with other artificial intelligence (AI) detection or diagnosis models. This paper aims to demonstrate the feasibility of using AI for automatic selection of informative laryngoscopy frames, including real-time operation that provides visual feedback to guide the otolaryngologist during the examination.

Methods: Several deep learning models were trained and tested on an internal dataset (n = 5147 images) and then tested on an external test set (n = 646 images) composed of both white light and narrow band images. Four videos were used to assess the real-time performance of the best-performing model.

Results: ResNet-50, pre-trained with the pretext strategy, reached a precision of 95% vs. 97%, a recall of 97% vs. 89%, and an F1-score of 96% vs. 93% on the internal and external test sets, respectively (p = 0.062). The four testing videos are provided in the supplemental materials.
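As a reading aid, the reported F1-scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is not the authors' evaluation code; the function name `f1_score` and the `reported` dictionary are illustrative, and the inputs are the rounded percentages from the abstract, so small rounding differences are expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the abstract (assumed exact for this check).
reported = {
    "internal": {"precision": 0.95, "recall": 0.97, "f1": 0.96},
    "external": {"precision": 0.97, "recall": 0.89, "f1": 0.93},
}

for name, m in reported.items():
    f1 = f1_score(m["precision"], m["recall"])
    print(f"{name}: F1 from P/R = {f1:.3f}, reported = {m['f1']:.2f}")
```

Recomputed from the rounded inputs, the values are about 0.960 (internal) and 0.928 (external), which agree with the reported 96% and 93% after rounding.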

Conclusion: The deep learning model demonstrated excellent performance in identifying diagnostically relevant frames within laryngoscopic videos. With its solid accuracy and real-time capabilities, the system shows promise for further development in a clinical setting, either autonomously for objective quality control or in conjunction with other algorithms within a comprehensive AI toolset aimed at enhancing tumor detection and diagnosis.

Keywords: Artificial intelligence; Deep learning; Laryngeal cancer; Laryngoscopy; Larynx.

MeSH terms

Deep Learning*
Feasibility Studies
Humans
Laryngeal Diseases / diagnosis
Laryngeal Diseases / diagnostic imaging
Laryngoscopy* / methods
Video Recording