Seminar 1: BrIAS Fellow Prof. Yilin Mo
Title: Data-Driven Learning of a Verifiable Controller Inspired by MPC
Abstract: Recent years have witnessed the development of learning-based control methods, many of which utilize general neural networks, such as the Multi-Layer Perceptron (MLP), as the entirety or a part of the control policy. Despite their remarkable empirical performance, the presence of even a moderately sized neural network makes it almost impossible to certify stability or provide performance guarantees. In this talk, we introduce a new class of learnable controllers, drawing inspiration from Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver of a linear MPC problem and is differentiable with respect to its parameters, which enables the calculation of policy gradients and the use of Deep Reinforcement Learning (DRL) to train the parameters, instead of deriving them from a predictive model as in MPC. Due to the structure imposed on the QP-based controller, one can verify its properties, such as persistent feasibility and asymptotic stability, using the same procedure as in the verification of MPC. On the other hand, numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance and has superior robustness against modeling uncertainty and noise. Real-world experiments on a vehicle drift-maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.
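To illustrate the general idea of a QP-shaped, differentiable controller (this is only a minimal sketch and not the speaker's implementation), the following Python/PyTorch snippet solves a small box-constrained QP by unrolled projected gradient steps, so the control output is differentiable with respect to the learnable QP parameters and can be trained by policy gradients; all names and hyper-parameters (L, F, u_max, n_steps) are assumptions made for this example.

import torch
import torch.nn as nn

class QPController(nn.Module):
    """Illustrative QP-based controller: u(x) = argmin_u 0.5 u'Hu + (Fx)'u, |u| <= u_max."""
    def __init__(self, n_x, n_u, u_max=1.0, n_steps=30):
        super().__init__()
        # Learnable QP data: L parametrizes H = L L' + eps*I (kept positive definite),
        # F maps the state into the linear term of the QP cost.
        self.L = nn.Parameter(0.1 * torch.randn(n_u, n_u))
        self.F = nn.Parameter(0.1 * torch.randn(n_u, n_x))
        self.u_max, self.n_steps = u_max, n_steps

    def forward(self, x):
        H = self.L @ self.L.T + 1e-3 * torch.eye(self.L.shape[0])
        q = self.F @ x
        step = 1.0 / torch.linalg.norm(H)            # conservative step size
        u = torch.zeros_like(q)
        for _ in range(self.n_steps):                # unrolled projected gradient descent
            u = u - step * (H @ u + q)
            u = u.clamp(-self.u_max, self.u_max)     # projection onto input constraints
        return u

# Any RL loss can now backpropagate through the solver into L and F,
# instead of deriving them from a predictive model as in MPC.
controller = QPController(n_x=4, n_u=2)
u = controller(torch.randn(4))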
Seminar 2: BrIAS Junior Fellow Dr. Yailen Martinez Jimenez
Title: Application of Reinforcement Learning in different scheduling scenarios
Abstract: Manufacturing scheduling is an optimization process that allocates limited manufacturing resources over time among parallel and sequential manufacturing activities. Customer orders have to be executed, and each order is composed of a number of operations that have to be processed on the available resources. Each order can have release and due dates associated with it, and typical objective functions involve minimizing the tardiness or the makespan.
In real-world scheduling problems, the environment is so dynamic that all of this information is usually not known beforehand. For example, manufacturing scheduling is subject to constant uncertainty: machines break down, orders take longer than expected, and these unexpected events cause the original schedule to fail. That is why companies prefer robust schedules over optimal ones, and a key issue is to find a proper balance between these two performance measures. In this seminar, we discuss a generic multi-agent reinforcement learning approach that can easily be adapted to different scheduling settings and objective functions. The approach also allows the user to specify certain parameters involved in the solution construction process, in order to define the balance between robustness and optimality, as in the sketch below.
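As a rough flavour of how a reinforcement learning dispatcher can be set up (this is only an illustrative sketch, not the speaker's system), the Python snippet below shows a single tabular Q-learning agent that repeatedly chooses which action to take when a machine becomes free; the state and action encodings, reward, and hyper-parameters are assumptions made for this example.

import random
from collections import defaultdict

class DispatchAgent:
    """Illustrative tabular Q-learning agent for dispatching decisions on one machine."""
    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.Q = defaultdict(float)        # Q[(state, action)] -> estimated value
        self.actions = actions             # e.g. dispatching rules or queued operations
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:                            # explore
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.Q[(state, a)])    # exploit

    def update(self, state, action, reward, next_state):
        best_next = max(self.Q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.Q[(state, action)] += self.alpha * (td_target - self.Q[(state, action)])

# In a multi-agent setting, one such agent per machine could learn from a reward
# that trades off makespan/tardiness (optimality) against schedule slack (robustness).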