Accelerated optimization in deep learning with a proportional-integral-derivative controller

Nat Commun. 2024 Nov 26;15(1):10263. doi: 10.1038/s41467-024-54451-3.

Abstract

High-performance optimization algorithms are essential in deep learning. However, understanding the behavior of optimization (i.e., the learning process) remains challenging due to the instability and weak interpretability of these algorithms. Since gradient-based optimization can be interpreted as a continuous-time dynamical system, applying feedback control to the dynamical systems that model optimizers may offer another perspective for developing more robust, accurate, and explainable optimization algorithms. In this study, we present an optimization framework called the controlled heavy-ball optimizer. By employing a proportional-integral-derivative (PID) controller in the optimizer, we develop a deterministic continuous-time optimizer called the Proportional-Integral-Derivative Accelerated Optimizer (PIDAO) and provide a theoretical convergence analysis of PIDAO for unconstrained (non-)convex optimization. As a byproduct, we derive PIDAO-family schemes for training deep neural networks by applying specific discretization methods. Compared with classical optimizers, PIDAO empirically exhibits a more aggressive capacity to explore the loss landscape at lower computational cost, owing to the properties of the PID controller. Experimental evaluations demonstrate that PIDAO can accelerate convergence and enhance the accuracy of deep learning, achieving state-of-the-art performance compared with advanced algorithms.
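
To make the idea concrete, the sketch below shows one way a PID feedback law can be turned into a discrete parameter update: the current gradient plays the role of the proportional error, its running sum the integral term, and its step-to-step difference the derivative term. This is a minimal illustration under an assumed forward-Euler-style discretization; the class name PIDSketchOptimizer and the gains kp, ki, kd are hypothetical placeholders for exposition and do not reproduce the PIDAO-family schemes derived in the paper.

```python
import torch
from torch.optim import Optimizer


class PIDSketchOptimizer(Optimizer):
    """Illustrative PID-style update (hypothetical discretization, not PIDAO).

    P term: the gradient itself.
    I term: a running sum of past gradients.
    D term: the finite difference of consecutive gradients.
    """

    def __init__(self, params, lr=1e-3, kp=1.0, ki=0.1, kd=0.1):
        defaults = dict(lr=lr, kp=kp, ki=ki, kd=kd)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            lr, kp, ki, kd = group["lr"], group["kp"], group["ki"], group["kd"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if not state:  # lazily initialize per-parameter buffers
                    state["integral"] = torch.zeros_like(p)
                    state["prev_grad"] = torch.zeros_like(p)
                state["integral"].add_(g)            # I: accumulate gradient history
                d = g - state["prev_grad"]           # D: rate of change of the gradient
                state["prev_grad"].copy_(g)
                # Combined PID control signal applied as the descent direction
                p.add_(kp * g + ki * state["integral"] + kd * d, alpha=-lr)
        return loss
```

One common reading in this line of work is that plain gradient descent acts as a pure P controller and momentum adds an I-like memory, so the D term is the main extra ingredient a full PID controller supplies; this framing is an interpretation offered here for intuition, not a claim made in the abstract itself.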