Identification of COVID-19 Clinical Phenotypes by Principal Component Analysis-Based Cluster Analysis

Front Med (Lausanne). 2020 Nov 12:7:570614. doi: 10.3389/fmed.2020.570614. eCollection 2020.

Abstract

Background: COVID-19 has spread rapidly and is a serious public health threat. Identifying clinical phenotypes is important for predicting disease severity and designing individualized treatment.

Methods: We collected data from 213 COVID-19 patients at Wuhan Pulmonary Hospital from January 1 to March 30, 2020. Principal component analysis (PCA) and cluster analysis were used to classify patients.

Results: We identified three distinct subgroups of COVID-19 patients. Cluster 1 was the largest group (52.6%) and was characterized by the oldest age and the lowest cellular immune function and albumin levels. Cluster 2 comprised 38.5% of subjects, and most of its laboratory results fell between those of Clusters 1 and 3. Cluster 3 was the smallest cluster (8.9%) and was characterized by the youngest age and the highest cellular immune function. The incidence of respiratory failure, acute respiratory distress syndrome (ARDS), and heart failure, and the use of non-invasive mechanical ventilation, were significantly higher in Cluster 1 than in the other clusters (P < 0.05). Cluster 1 also had the highest death rate, 30.4% (P = 0.005). Although age differed significantly between Clusters 2 and 3 (P < 0.001), there was no difference between them in demand for medical resources.

Conclusions: We identified three distinct clusters of COVID-19 patients. The results show that age alone cannot be used to assess a patient's condition. Specifically, management of albumin and immune function is important in reducing disease severity.
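The PCA-then-cluster workflow named in the Methods can be illustrated with a minimal sketch. The abstract does not state which variables, how many components, or which clustering algorithm the authors used, so everything here is an assumption made for illustration: the data are synthetic, the three feature columns (age, albumin, lymphocyte count) and the group sizes are hypothetical, and k-means stands in for the unspecified cluster analysis.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca_scores(X, n_components):
    """Project standardized data onto its leading principal components."""
    Z = standardize(X)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)
    return Z @ Vt[:n_components].T, explained[:n_components]

def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in data (NOT the study data): three hypothetical patient
# groups with columns age (years), albumin (g/L), lymphocyte count (10^9/L).
rng = np.random.default_rng(0)
group_sizes = [60, 40, 10]
group_means = [(70.0, 30.0, 0.6), (55.0, 36.0, 1.2), (35.0, 42.0, 2.0)]
noise_sd = (3.0, 1.0, 0.1)
X = np.vstack([
    rng.normal(mean, noise_sd, size=(n, 3))
    for mean, n in zip(group_means, group_sizes)
])
true_group = np.repeat(np.arange(3), group_sizes)

# Reduce to two components, then cluster the component scores.
scores, explained = pca_scores(X, n_components=2)
labels, centers = kmeans(scores, k=3)

print("variance explained by the two components:", explained)
print("cluster sizes:", np.bincount(labels, minlength=3))
```

Standardizing before PCA keeps variables measured on large scales, such as age, from dominating the components. Clustering on the component scores rather than on the raw variables is the general idea behind PCA-based cluster analysis. The choice of two components and three clusters is fixed here only to keep the sketch short.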

Keywords: COVID-19; cluster analysis; phenotype; principal component analysis; treatment.