Exploring vision transformers and XGBoost as deep learning ensembles for transforming carcinoma recognition

Akella Subrahmanya Narasimha Raju; K Venkatesh; B Padmaja; C H N Santhosh Kumar; Pattabhi Rama Mohan Patnala; Ayodele Lasisi; Saiful Islam; Abdul Razak; Wahaj Ahmad Khan

doi:10.1038/s41598-024-81456-1

Sci Rep. 2024 Dec 3;14(1):30052. doi: 10.1038/s41598-024-81456-1.

Authors

Akella Subrahmanya Narasimha Raju¹, K Venkatesh², B Padmaja³, C H N Santhosh Kumar⁴, Pattabhi Rama Mohan Patnala⁵, Ayodele Lasisi⁶, Saiful Islam⁷, Abdul Razak⁸, Wahaj Ahmad Khan⁹

Affiliations

¹ Department of Computer Science and Engineering (Data Science), Institute of Aeronautical Engineering, Dundigal, Hyderabad, Telangana, 500043, India. a.raju@iare.ac.in.
² Department of Networking and Communications, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamilnadu, 603203, India.
³ Department of Computer Science and Engineering-AI&ML, Institute of Aeronautical Engineering, Dundigal, Hyderabad, 500043, India.
⁴ Department of Computer Science and Engineering, Anurag Engineering College, Kodada, Telangana, 508206, India.
⁵ Department of Computer Applications, Aditya University, Surampalem, 533437, Andhra Pradesh, India.
⁶ Department of Computer Science, College of Computer Science, King Khalid University, Abha, Saudi Arabia.
⁷ Civil Engineering Department, College of Engineering, King Khalid University, 61421, Abha, Saudi Arabia.
⁸ Department of Mechanical Engineering, P. A. College of Engineering (Affiliated to Visvesvaraya Technological University, Belagavi), Mangaluru, India.
⁹ School of Civil Engineering & Architecture, Institute of Technology, Dire-Dawa University, 1362, Dire Dawa, Ethiopia. wkhan9450@gmail.com.

PMID: 39627293

Abstract

Early detection of colorectal carcinoma (CRC), one of the most prevalent cancers worldwide, significantly improves patient prognosis. This research presents a new method for improving CRC detection using deep learning ensembles within a computer-aided diagnosis (CADx) system. The method combines pre-trained convolutional neural network (CNN) models, such as ADaRDEV2I-22, DaRD-22, and ADaDR-22, with Vision Transformers (ViT) and XGBoost. The study addresses the challenges of imbalanced datasets and the need for sophisticated feature extraction in medical image analysis. The CKHK-22 dataset initially comprised 24 classes; we refined it to 14 classes, which improved data balance and quality and in turn enabled more precise feature extraction and better classification results. We created two ensemble models: the first used Vision Transformers to capture long-range spatial relationships in the images, while the second combined CNNs with XGBoost for structured data classification. We applied DCGAN-based augmentation to increase the dataset's diversity. Experiments showed substantial performance improvements: the ADaDR-22 + Vision Transformer ensemble achieved the best results, with a testing accuracy of 93.4% and an AUC of 98.8%, while the ADaDR-22 + XGBoost model reached an accuracy of 92.2% and an AUC of 97.8%. These findings demonstrate the efficacy of the proposed ensemble models in detecting CRC and underline the importance of well-balanced, high-quality datasets. The proposed method significantly enhances clinical diagnostic accuracy and the capabilities of medical image analysis for early CRC detection.
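
The abstract gives no implementation details, so the following is only a minimal sketch of what "capturing long-range spatial relationships" means in a Vision Transformer: an image is cut into patches, and a single self-attention step lets every patch weigh every other patch, however far apart they are. The image size, patch size, embedding width, and random weights are illustrative assumptions and do not reflect the authors' models or the CKHK-22 data.

```python
import numpy as np

def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    rows, cols = h // patch_size, w // patch_size
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (rows, cols, ps, ps, c)
    return patches.reshape(rows * cols, patch_size * patch_size * c)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)

# Hypothetical stand-in for an endoscopy frame: 32x32 RGB, 8x8 patches.
image = rng.random((32, 32, 3))
patches = image_to_patches(image, patch_size=8)  # 16 patches of 192 values

# Random, untrained projections (assumed sizes); a real ViT learns these.
embed_dim = 24
w_embed = rng.normal(scale=0.1, size=(patches.shape[1], embed_dim))
tokens = patches @ w_embed
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(embed_dim, embed_dim)) for _ in range(3))

attended, weights = self_attention(tokens, w_q, w_k, w_v)

# weights[i, j] is how strongly patch i attends to patch j; every entry is
# positive, so even opposite corners of the image exchange information.
print(patches.shape, attended.shape, weights.shape)
```

Each row of `weights` is a probability distribution over all patches, which is the mechanism behind the global context that a convolution with a small receptive field does not provide in a single layer.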

Keywords: CKHK-22 dataset; Colorectal Carcinoma (CRC); Ensemble models; Integrated CNNs; Vision Transformers; XGBoost.

MeSH terms

Colorectal Neoplasms* / diagnosis
Databases, Factual
Deep Learning*
Diagnosis, Computer-Assisted / methods
Early Detection of Cancer / methods
Humans
Image Processing, Computer-Assisted / methods
Neural Networks, Computer