scAMAC: self-supervised clustering of scRNA-seq data based on adaptive multi-scale autoencoder

Brief Bioinform. 2024 Jan 22;25(2):bbae068. doi: 10.1093/bib/bbae068.

Abstract

Cluster assignment is vital for analyzing single-cell RNA sequencing (scRNA-seq) data and understanding high-level biological processes. Deep learning-based clustering methods have recently been widely used in scRNA-seq data analysis. However, existing deep models often overlook the interconnections and interactions among network layers, which leads to a loss of structural information within those layers. Herein, we develop a new self-supervised clustering method based on an adaptive multi-scale autoencoder, called scAMAC. The self-supervised clustering network uses a multi-scale attention mechanism to fuse the feature information from the encoder, hidden and decoder layers of the multi-scale autoencoder, which enables it to explore cellular correlations within the same scale and capture deep features across different scales. The network then calculates a membership matrix from the fused latent features and is optimized based on that matrix. scAMAC employs an adaptive feedback mechanism to supervise the parameter updates of the multi-scale autoencoder, yielding a more effective representation of cell features. scAMAC not only clusters cells but also reconstructs the data through the decoding layer. Through extensive experiments, we demonstrate that scAMAC outperforms several advanced clustering and imputation methods in both data clustering and reconstruction. In addition, scAMAC benefits downstream analysis, such as cell trajectory inference. The scAMAC code is freely available at https://github.com/yancy2024/scAMAC.
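
To make the fusion and membership steps described above concrete, the following is a minimal NumPy sketch, not the scAMAC implementation. It assumes that the encoder, hidden and decoder features have already been projected to a common dimension, that the attention scores are fixed numbers (in a trained model they would be learned), and that the membership matrix follows the standard fuzzy c-means form with fuzzifier m = 2. The function names `fuse_features`, `fuzzy_membership` and `update_centers` are hypothetical, and the data are synthetic.

```python
import numpy as np

def fuse_features(features, scores):
    """Softmax-weighted sum of same-shaped feature matrices (attention-style fusion)."""
    weights = np.exp(scores - np.max(scores))
    weights /= weights.sum()
    return sum(w * f for w, f in zip(weights, features))

def fuzzy_membership(z, centers, m=2.0, eps=1e-12):
    """Fuzzy c-means membership: u[i, k] proportional to d(i, k)^(-2/(m-1)); rows sum to 1."""
    # Squared Euclidean distance from every cell to every cluster center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def update_centers(z, u, m=2.0):
    """Membership-weighted mean of the latent features for each cluster."""
    w = u ** m
    return (w.T @ z) / w.sum(axis=0)[:, None]

# Synthetic stand-in for latent cell features: two well-separated groups of 20 cells.
rng = np.random.default_rng(0)
base = np.vstack([rng.normal(0.0, 0.1, (20, 4)), rng.normal(3.0, 0.1, (20, 4))])

# Assumption: three per-scale views of the same cells, already in a shared dimension.
views = [base + rng.normal(0.0, 0.05, base.shape) for _ in range(3)]
scores = np.array([0.5, 1.0, 0.5])  # hypothetical fixed attention scores

fused = fuse_features(views, scores)

# Initialise one center from each end of the data, then alternate the two updates.
centers = fused[[0, -1]].copy()
for _ in range(10):
    u = fuzzy_membership(fused, centers)
    centers = update_centers(fused, u)

labels = u.argmax(axis=1)  # hard assignment read off the soft membership matrix
```

Each row of `u` holds one cell's soft assignment over the clusters, so a hard label is simply the column with the largest membership. In the method described above, the membership matrix also drives the optimization of the clustering network; this sketch only alternates the classical membership and center updates.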

Keywords: attention mechanism; fuzzy clustering; multi-scale autoencoder; self-supervised clustering; single-cell sequencing.

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Data Analysis*
  • Gene Expression Profiling
  • Sequence Analysis, RNA
  • Single-Cell Gene Expression Analysis*