Blending Controllers via Multi-Objective Bandits

Parham Gohari, Franck Djeumou, Abraham P. Vinod, Ufuk Topcu

2022 American Control Conference (ACC), 2022

Abstract
Performance and safety are often two competing objectives in decision-making problems. We study the problem of integrating a collection of controllers with different safety and performance levels into one that takes a middle-ground position amongst them. In the first contribution, we formulate the problem of blending controllers using the framework of constrained Markov decision processes and contextual multi-objective bandits. We use the reward function and the auxiliary costs of the Markov decision process to measure the performance and the safety of a controller, respectively. We subsequently use these measures to form the feedback of a bandit whose arms are the input controllers. The blending algorithm must interact with the bandit and minimize a regret term that measures the suboptimality of the pulled arms with respect to an expert whose choice of arms is Pareto optimal. In the second contribution, we design a blending algorithm and show that its average regret converges to zero. We also derive an upper bound on the algorithm's suboptimality in performance and safety and we show that its computation imposes no additional computational complexity. We empirically demonstrate the algorithm's success in blending a safe and a performant controller in a variety of Safety Gym environments. The results reflect the following key takeaway: the blended controller shows a strict improvement in performance compared to the safe controller and is safer than the performant controller.
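The bandit view described in the abstract can be illustrated with a small sketch. It is not the paper's blending algorithm: it swaps in a plain epsilon-greedy rule, ignores context, and uses hypothetical names (`ParetoBlender`, `pareto_front`, `dominates`). Each arm stands for one input controller, the feedback is a pair (performance reward, safety cost), and the learner prefers arms whose estimated means are Pareto optimal.

```python
import random


def dominates(u, v):
    """True if u is at least as good as v in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(vectors):
    """Indices of the vectors that no other vector dominates."""
    return [
        i for i, u in enumerate(vectors)
        if not any(dominates(v, u) for j, v in enumerate(vectors) if j != i)
    ]


class ParetoBlender:
    """Hypothetical epsilon-greedy learner over a fixed set of controllers (arms).

    Assumption: this is an illustrative stand-in, not the algorithm from the paper.
    """

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * n_arms
        # Running mean of (performance, safety) per arm; larger is better in both.
        self.means = [(0.0, 0.0)] * n_arms

    def select(self):
        """Pick the controller to run next."""
        untried = [i for i, c in enumerate(self.counts) if c == 0]
        if untried:
            return untried[0]  # try every controller once first
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)  # explore
        return self.rng.choice(pareto_front(self.means))  # exploit a Pareto-optimal arm

    def update(self, arm, reward, cost):
        """Fold one (reward, cost) observation into the arm's running means."""
        self.counts[arm] += 1
        n = self.counts[arm]
        perf, safe = self.means[arm]
        # Safety is stored as negative cost so both objectives are maximized.
        self.means[arm] = (perf + (reward - perf) / n, safe + (-cost - safe) / n)
```

In a loop, one would call `select()`, run the chosen controller for an episode, and pass the observed reward and auxiliary cost to `update()`. Storing safety as negative cost keeps the dominance check uniform across objectives.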
Key words
input controllers, blending algorithm, performant controller, blended controller, safe controller, blending controllers, competing objectives, decision-making problems, middle-ground position, constrained Markov decision processes, contextual multiobjective bandits, Safety Gym environments