Comparison of Empirical and Reinforcement Learning (RL)-Based Control Based on Proximal Policy Optimization (PPO) for Walking Assistance: Does AI Always Win?

Biomimetics (Basel). 2024 Nov 1;9(11):665. doi: 10.3390/biomimetics9110665.

Abstract

The use of wearable assistive devices is growing in both industrial and medical fields. Combining human expertise and artificial intelligence (AI), e.g., in human-in-the-loop optimization, is gaining popularity for adapting assistance to individuals. Amidst prevailing assertions that AI could surpass human capabilities in customizing every facet of support for human needs, our study serves as an initial step towards testing such claims within the context of human walking assistance. We investigated the efficacy of the Biarticular Thigh Exosuit, a device designed to aid human locomotion by mimicking the action of the hamstrings and rectus femoris muscles using Serial Elastic Actuators. Two control strategies were tested: an empirical controller based on human gait knowledge and empirical data, and a controller optimized using Reinforcement Learning (RL) on a neuromuscular model. The performance of these controllers was assessed by comparing muscle activation in two assisted and two unassisted walking modes. Results showed that both controllers reduced hamstring muscle activation and improved the preferred walking speed, with the empirical controller also decreasing gastrocnemius muscle activity. However, the RL-based controller increased muscle activity in the vastus and rectus femoris muscles, indicating that RL-based enhancements may not always improve assistance without solid empirical support.
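The abstract does not detail the PPO setup used to optimize the controller on the neuromuscular model. As a generic, hedged illustration only (not the authors' implementation; the function name and inputs are hypothetical), the clipped surrogate objective at the core of PPO can be sketched as:

```python
def ppo_clip_loss(ratios, advantages, eps=0.2):
    """Minimal sketch of PPO's clipped surrogate loss.

    ratios: probability ratios pi_new(a|s) / pi_old(a|s) per sample
    advantages: advantage estimates per sample
    eps: clipping parameter (0.2 is the value commonly used in the PPO paper)
    Returns the negated mean clipped objective, suitable for minimization.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the ratio into [1 - eps, 1 + eps] to discourage large policy updates
        clipped = min(max(r, 1.0 - eps), 1.0 + eps)
        # Pessimistic bound: take the smaller of the unclipped and clipped terms
        total += min(r * a, clipped * a)
    return -total / len(ratios)
```

For a ratio of 1.5 with a positive advantage of 1.0 and eps = 0.2, the ratio is clipped to 1.2, so the sample contributes 1.2 to the objective; this clipping is what keeps each PPO update close to the previous policy.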

Keywords: PPO; exo control; exosuit; reinforcement learning; wearable assistive device.