Background: Differentiating malignant pleural effusion (MPE) from benign pleural effusion (BPE) is clinically challenging. In recent years, artificial intelligence (AI)-based machine learning models have been increasingly used for disease diagnosis.
Objective: This study aimed to develop and validate a diagnostic model for early differentiation between MPE and BPE based on routine laboratory data.
Design: This was a retrospective observational cohort study.
Methods: A total of 2352 patients with newly diagnosed pleural effusion (PE) between January 2008 and March 2021 were enrolled. Of these, 1435, 466, and 451 were randomly assigned to the training, validation, and testing cohorts, respectively, in an approximate ratio of 3:1:1. Clinical parameters, including age, sex, and laboratory parameters, were extracted for analysis. Based on 81 candidate laboratory variables, five machine learning models were developed: extreme gradient boosting (XGBoost), logistic regression (LR), random forest (RF), support vector machine (SVM), and multilayer perceptron (MLP). The diagnostic performance of each model for MPE was evaluated using receiver operating characteristic (ROC) curves.
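As an illustration of the cohort assignment described above, the following is a minimal sketch of a random 3:1:1 split. The randomization procedure is not described here, so the function name `split_3_1_1`, the fixed seed, and the rounding rule are all assumptions made for illustration. An exact 3:1:1 split of 2352 patients yields cohorts of 1411, 470, and 471, whereas the reported cohorts were 1435, 466, and 451, so the procedure actually used evidently differed in detail.

```python
import random


def split_3_1_1(ids, seed=0):
    """Shuffle patient IDs and split them into training, validation,
    and testing cohorts at roughly 3:1:1 (hypothetical helper)."""
    shuffled = list(ids)
    # A seeded generator keeps the assignment reproducible.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * 3 / 5)  # three fifths for training
    n_val = round(n * 1 / 5)    # one fifth for validation
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder for testing
    return train, val, test


# Placeholder IDs 0..2351 stand in for the 2352 enrolled patients.
train, val, test = split_3_1_1(range(2352))
print(len(train), len(val), len(test))
```

Every patient lands in exactly one cohort, so the testing cohort stays unseen during model development and tuning.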
Results: Among the five models, the XGBoost model exhibited the best diagnostic performance for MPE (area under the curve (AUC): 0.903, 0.918, and 0.886 in the training, validation, and testing cohorts, respectively). Additionally, the XGBoost model outperformed carcinoembryonic antigen (CEA) measured in pleural fluid (PF), CEA measured in serum, and the PF/serum CEA ratio (AUC: 0.726, 0.699, and 0.692 in the training cohort; 0.763, 0.695, and 0.731 in the validation cohort; and 0.722, 0.729, and 0.693 in the testing cohort, respectively). Furthermore, compared with CEA, the XGBoost model demonstrated greater diagnostic power and sensitivity in diagnosing lung cancer-induced MPE.
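The AUC values compared above can be read as the probability that a randomly chosen MPE case receives a higher score than a randomly chosen BPE case. The sketch below computes that quantity directly from pairwise comparisons. The function name `roc_auc` and all numbers in the example are made up for illustration and are not study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = MPE, 0 = BPE.
labels = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]   # predicted probabilities
marker_values = [12.0, 3.0, 2.0, 4.0, 1.5, 1.0]  # single-biomarker levels

model_auc = roc_auc(labels, model_scores)
marker_auc = roc_auc(labels, marker_values)
print(round(model_auc, 3), round(marker_auc, 3))
```

Because the calculation depends only on the ranking of scores, a model's predicted probability and a raw biomarker concentration can be compared on the same scale.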
Conclusion: A machine learning model built on routine laboratory biomarkers improves the ability to distinguish MPE from BPE. The XGBoost model is a valuable tool for the diagnosis of MPE.
Keywords: benign pleural effusion; carcinoembryonic antigen; differential diagnosis; machine learning model; malignant pleural effusion.