Objective: Dementia is a significant medical and social issue in most developed countries. Practical tools for predicting the progression of degenerative dementia are highly valuable. Machine learning (ML) methods facilitate the construction of effective models from real-world data, which may contain missing values and be drawn from several integrated datasets.
Method: This retrospective study analyzed data from 679 patients diagnosed with degenerative dementia at Fu Jen Catholic University Hospital, who were evaluated by neurologists and psychologists and followed for over two years. Predictive variables were categorized into demographic (D), clinical dementia rating (CDR), mini-mental state examination (MMSE), and laboratory data value (LV) groups. These groups were then combined into three feature sets (D-CDR, D-CDR-MMSE, and D-CDR-MMSE-LV). We used the extreme gradient boosting (XGB) model to rank the importance of variables and to identify the most effective feature combination via a step-wise approach.
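The abstract does not spell out the step-wise procedure, so the following is only a minimal sketch of one plausible reading: walk the variables in order of importance and keep each one only if it improves the AUC. Everything in it is an assumption made for illustration. The toy patient values, the `IMPORTANCE` scores (standing in for an XGB importance ranking), the function names, and the additive stand-in scorer in `evaluate` are all hypothetical; in a real pipeline `evaluate` would fit an XGB model on the chosen variables and return a cross-validated AUC.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive case outscores a random negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stepwise_select(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Forward selection along an importance ranking.

    A variable is kept only if adding it raises the score by more than min_gain.
    """
    selected: List[str] = []
    best = float("-inf")
    for name in ranked:
        score = evaluate(selected + [name])
        if score > best + min_gain:
            selected.append(name)
            best = score
    return selected, best


# Hypothetical toy data: 1 = progressed, 0 = stable. Not taken from the study.
LABELS = [1, 1, 1, 0, 0, 0]
FEATURES: Dict[str, List[float]] = {
    "cdr": [2.0, 1.0, 1.0, 1.0, 0.5, 0.5],
    "mmse_deficit": [5.0, 8.0, 6.0, 2.0, 3.0, 1.0],
    "age": [70.0, 70.0, 70.0, 70.0, 70.0, 70.0],
}
# Hypothetical importance scores standing in for an XGB ranking.
IMPORTANCE = {"cdr": 0.6, "mmse_deficit": 0.3, "age": 0.1}


def evaluate(names: List[str]) -> float:
    """Stand-in for 'fit a model and report AUC': sum the chosen columns as a risk score."""
    scores = [sum(FEATURES[n][i] for n in names) for i in range(len(LABELS))]
    return auc_score(LABELS, scores)


ranked = sorted(IMPORTANCE, key=IMPORTANCE.get, reverse=True)
selected, best = stepwise_select(ranked, evaluate)
print(selected, round(best, 3))
```

On this toy data the loop keeps `cdr` and `mmse_deficit` and drops `age`, because a constant column cannot change the ranking of patients and so adds no AUC. A positive `min_gain` would make the selection stricter and favor a shorter variable list.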
Result: The D-CDR-MMSE-LV combination showed robust performance, with an excellent area under the receiver operating characteristic curve (AUC) and the highest sensitivity (84.66). Using both demographic and neuropsychiatric variables, our prediction model achieved an AUC of 83.74. By adding clinical information from laboratory data and applying our proposed feature selection strategy, we constructed an XGB model based on eight variables that achieved an AUC of 85.12.
Conclusion: We established a machine learning model to predict the progression of dementia using a limited, real-world clinical dataset. The XGB technique identified eight critical variables across our integrated datasets, potentially providing clinicians with valuable guidance.
Keywords: Dementia; Extreme gradient boosting; Machine learning; Prediction model.
© 2024. The Author(s).