Lunds Tekniska Högskola


MSc presentation by Olhager: Robust Reinforcement Learning Control of a Furuta Pendulum

A rotary inverted pendulum


Time: 2021-10-22 09:30 to 10:15
Location: Seminar Room KC 3N27
Contact: johan [dot] gronqvist [at] control [dot] lth [dot] se

Abstract: Safety-critical systems are an interesting application for Reinforcement Learning-based controllers. In such systems, robustness is of high importance, as are ways to guarantee and certify a level of robustness. This project investigates Projected Gradient Descent (PGD) as an adversary for robust Deep Reinforcement Learning, which has not been done before. The Lipschitz constant of the controller's neural network is also evaluated as a measure of the controller's robustness. To this end, an agent is trained to perform swing-up and balancing of a Furuta pendulum and is then trained further against a PGD adversary of varying magnitudes. The agents are evaluated by their robustness to normally distributed measurement noise and by their estimated Lipschitz constants. The results show that naively training with PGD does not increase robustness. Only one out of 30 agents outperformed the baseline agent, which suggests that the approach may show some promise if the training process is fine-tuned further. Furthermore, the Lipschitz constant did not correlate with robustness performance, indicating that it may not be an ideal measure of a neural network's robustness.
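
For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch only, not the method used in the thesis. It assumes a hypothetical two-layer tanh policy with random weights, a 4-dimensional observation, an L-infinity PGD adversary that perturbs the observation to maximise the change in the action, and the product of layer spectral norms as a crude upper bound on the Lipschitz constant. All names (`policy`, `pgd_observation_attack`, `lipschitz_upper_bound`) and all parameter values are illustrative assumptions.

```python
import numpy as np

def policy(obs, W1, W2):
    """Hypothetical two-layer tanh policy: observation -> action (an assumption)."""
    return W2 @ np.tanh(W1 @ obs)

def lipschitz_upper_bound(weights):
    """Product of spectral norms; an upper bound when activations are 1-Lipschitz."""
    return float(np.prod([np.linalg.norm(W, 2) for W in weights]))

def pgd_observation_attack(obs, W1, W2, eps, step, n_steps, rng):
    """L-infinity PGD that tries to maximise the deviation from the clean action."""
    clean = policy(obs, W1, W2)

    def loss(delta):
        return float(np.sum((policy(obs + delta, W1, W2) - clean) ** 2))

    # Random start inside the eps-ball (the gradient is zero at delta = 0).
    delta = rng.uniform(-eps, eps, size=obs.shape)
    h = 1e-5
    for _ in range(n_steps):
        # Central finite-difference gradient; autodiff would normally be used.
        grad = np.array([
            (loss(delta + h * e) - loss(delta - h * e)) / (2 * h)
            for e in np.eye(obs.size)
        ])
        # Signed ascent step, then projection back onto the eps-ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return delta

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # illustrative random weights, not a trained agent
W2 = rng.normal(size=(1, 8))
obs = rng.normal(size=4)       # assumed 4-dimensional observation
eps = 0.05

delta = pgd_observation_attack(obs, W1, W2, eps=eps, step=0.01, n_steps=20, rng=rng)
L = lipschitz_upper_bound([W1, W2])
deviation = float(np.linalg.norm(policy(obs + delta, W1, W2) - policy(obs, W1, W2)))
print(f"action deviation {deviation:.4f}, Lipschitz bound x |delta| {L * np.linalg.norm(delta):.4f}")
```

The bound guarantees only that the action deviation cannot exceed `L` times the size of the perturbation; it can be very loose in practice.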

The presentation will be live on Zoom through:
                  Meeting ID: 619 0082 7229
                  Password: 923765