U-SPDNet: An SPD manifold learning-based neural network for visual classification

Rui Wang; Xiao-Jun Wu; Tianyang Xu; Cong Hu; Josef Kittler

doi:10.1016/j.neunet.2022.11.030


Neural Netw. 2023 Apr:161:382-396. doi: 10.1016/j.neunet.2022.11.030. Epub 2022 Dec 14.

Authors

Rui Wang¹, Xiao-Jun Wu², Tianyang Xu¹, Cong Hu¹, Josef Kittler³

Affiliations

¹ School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China.
² School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China. Electronic address: wu_xiaojun@jiangnan.edu.cn.
³ School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, Guildford GU2 7XH, UK.

PMID: 36780861
DOI: 10.1016/j.neunet.2022.11.030

Abstract

With the development of neural network techniques, several architectures for symmetric positive definite (SPD) matrix learning have recently been put forward in the computer vision and pattern recognition (CV&PR) community for mining fine-grained geometric features. However, the degradation of structural information during multi-stage feature transformation limits their capacity. To cope with this issue, this paper develops a U-shaped neural network on SPD manifolds (U-SPDNet) for visual classification. The designed U-SPDNet contains two subsystems. The first is a shrinking path (encoder) made up of a prevailing SPD manifold neural network (SPDNet (Huang and Van Gool, 2017)) for capturing compact representations from the input data. The second is a symmetric expanding path (decoder) that upsamples the encoded features and is trained with a reconstruction error term. With this design, the degradation problem is gradually alleviated during training. To enhance the representational capacity of U-SPDNet, we also append skip connections from encoder to decoder, realized by manifold-valued geometric operations, namely Riemannian barycenter and Riemannian optimization. On the MDSD, Virus, FPHA, and UAV-Human datasets, the accuracy achieved by our method is 6.92%, 8.67%, 1.57%, and 1.08% higher than that of SPDNet, respectively, demonstrating its effectiveness.
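The encoder, decoder, and skip-connection idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation, and several choices here are assumptions. The BiMap and ReEig layers follow the standard SPDNet definitions. The decoder upsampling (a bilinear map with a tall weight matrix followed by ReEig), the use of the Log-Euclidean mean as the Riemannian barycenter, the fixed random weights, the layer sizes (8, 6, 4), and the function names `semi_orthogonal` and `toy_forward` are all illustrative. Training with Riemannian optimization is omitted.

```python
import numpy as np

def sym(a):
    """Symmetrize to remove floating-point asymmetry."""
    return 0.5 * (a + a.T)

def bimap(x, w):
    """BiMap-style bilinear mapping W X W^T, with w of shape (d_out, d_in)."""
    return sym(w @ x @ w.T)

def reeig(x, eps=1e-4):
    """ReEig-style rectification: clamp eigenvalues from below at eps."""
    vals, vecs = np.linalg.eigh(x)
    return sym((vecs * np.maximum(vals, eps)) @ vecs.T)

def spd_log(x):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(x)
    return sym((vecs * np.log(vals)) @ vecs.T)

def spd_exp(s):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(s)
    return sym((vecs * np.exp(vals)) @ vecs.T)

def log_euclidean_barycenter(a, b, t=0.5):
    """Weighted Log-Euclidean mean of two SPD matrices (assumed metric)."""
    return spd_exp((1.0 - t) * spd_log(a) + t * spd_log(b))

def semi_orthogonal(rows, cols, rng):
    """Random matrix with orthonormal rows (rows <= cols) or columns (rows > cols)."""
    q, _ = np.linalg.qr(rng.standard_normal((max(rows, cols), min(rows, cols))))
    return q.T if rows <= cols else q

def toy_forward(seed=0):
    """One forward pass through a toy 8 -> 6 -> 4 -> 6 -> 8 U-shaped SPD network."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((8, 20))
    x0 = a @ a.T / 20 + 1e-3 * np.eye(8)            # 8x8 SPD input

    # Encoder (shrinking path): BiMap + ReEig, as in SPDNet.
    e1 = reeig(bimap(x0, semi_orthogonal(6, 8, rng)))   # 6x6
    e2 = reeig(bimap(e1, semi_orthogonal(4, 6, rng)))   # 4x4 compact code

    # Decoder (expanding path): assumed upsampling by a tall weight matrix.
    # The bilinear map alone is rank-deficient; ReEig restores positive definiteness.
    d1 = reeig(bimap(e2, semi_orthogonal(6, 4, rng)))   # 6x6

    # Skip connection: fuse same-sized encoder and decoder features on the manifold.
    fused = log_euclidean_barycenter(e1, d1)

    d2 = reeig(bimap(fused, semi_orthogonal(8, 6, rng)))  # 8x8 reconstruction

    # Reconstruction error term (squared Frobenius norm, an assumed choice).
    recon_error = np.linalg.norm(x0 - d2, "fro") ** 2
    return {"x0": x0, "e1": e1, "e2": e2, "fused": fused, "d2": d2,
            "recon_error": recon_error}

if __name__ == "__main__":
    out = toy_forward()
    print("fused shape:", out["fused"].shape)
    print("reconstruction shape:", out["d2"].shape)
```

In this sketch the fused feature stays on the SPD manifold because the Log-Euclidean mean averages in the matrix-logarithm domain and maps back with the matrix exponential; a plain Euclidean average would also remain SPD but would ignore the manifold geometry the abstract emphasizes.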

Keywords: Neural network; Riemannian barycenter; Riemannian optimization; SPD manifold; Skip connection; Visual classification.

MeSH terms

Algorithms*
Artificial Intelligence
Humans
Neural Networks, Computer*