
Efficient Multitask Reinforcement Learning Without Performance Loss.

IEEE Transactions on Neural Networks and Learning Systems (2023)

Abstract
We propose an iterative sparse Bayesian policy optimization (ISBPO) scheme as an efficient multitask reinforcement learning (RL) method for industrial control applications that require both high performance and cost-effective implementation. Under continual learning scenarios in which multiple control tasks are learned sequentially, the proposed ISBPO scheme preserves previously learned knowledge without performance loss, uses network resources efficiently, and improves the sample efficiency of learning new tasks. Specifically, ISBPO continually adds new tasks to a single policy neural network while completely preserving the control performance of previously learned tasks through an iterative pruning method. To create free-weight space for new tasks, each task is learned through a pruning-aware policy optimization method called sparse Bayesian policy optimization (SBPO), which allocates the limited resources of the policy network efficiently across multiple tasks. Furthermore, the weights allocated to previous tasks are shared and reused when learning new tasks, thereby improving both the sample efficiency and the performance of new-task learning. Simulations and practical experiments demonstrate that the proposed ISBPO scheme is highly suitable for sequentially learning multiple tasks in terms of performance preservation, efficient resource use, and sample efficiency.
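Since the abstract only names the mechanism, the Python sketch below illustrates how a single policy network can be partitioned across sequentially learned tasks by iterative pruning. It is a rough analogue under stated assumptions, not the paper's method: magnitude-based pruning stands in for SBPO's sparse Bayesian pruning criterion, and every identifier (MaskedPolicyLayer, keep_fraction, and so on) is invented for illustration.

```python
# Minimal sketch (not the authors' implementation) of mask-based iterative
# pruning: each task claims a subset of weights in one shared policy network,
# and weights owned by earlier tasks are frozen, so earlier tasks' behavior
# is preserved exactly.
import numpy as np

class MaskedPolicyLayer:
    """One dense layer whose weights are partitioned across tasks by pruning."""

    def __init__(self, n_in, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, size=(n_in, n_out))
        self.owner = -np.ones_like(self.W, dtype=int)  # -1 marks a free weight

    def trainable_mask(self, task_id):
        # Free weights (and weights this task already claimed) may be updated;
        # weights owned by earlier tasks stay frozen -> no performance loss.
        return (self.owner == -1) | (self.owner == task_id)

    def apply_gradient(self, grad, task_id, lr=1e-2):
        # A policy-gradient step restricted to this task's trainable weights.
        self.W -= lr * grad * self.trainable_mask(task_id)

    def prune_and_commit(self, task_id, keep_fraction=0.5):
        # After the task converges, keep only its largest-magnitude weights
        # (a stand-in for the sparse Bayesian criterion); the rest are zeroed
        # and returned to the free pool for future tasks.
        mask = self.trainable_mask(task_id)
        scores = np.abs(self.W) * mask
        k = max(1, int(keep_fraction * mask.sum()))
        threshold = np.partition(scores[mask], -k)[-k]
        keep = mask & (scores >= threshold)
        self.W[mask & ~keep] = 0.0   # freed capacity for the next task
        self.owner[keep] = task_id   # permanently claimed (frozen hereafter)

    def forward(self, x, task_id):
        # Inference for task t reuses the weights of tasks 0..t, mirroring the
        # abstract's sharing of previously allocated weights with new tasks.
        usable = (self.owner >= 0) & (self.owner <= task_id)
        return x @ (self.W * usable)

# Sequentially "learn" three tasks in the single shared layer.
layer = MaskedPolicyLayer(n_in=4, n_out=3)
rng = np.random.default_rng(1)
for task_id in range(3):
    for _ in range(20):  # stand-in for SBPO policy-optimization updates
        grad = rng.normal(size=layer.W.shape)
        layer.apply_gradient(grad, task_id)
    layer.prune_and_commit(task_id, keep_fraction=0.4)

print(layer.forward(np.ones((1, 4)), task_id=1))  # uses tasks 0 and 1 only
```

The design choice that makes the "without performance loss" claim exact is that committed weights are never touched again: a new task can read them in its forward pass but cannot update them, so earlier policies are preserved bit for bit. What this sketch necessarily omits is the sparse Bayesian learning machinery that decides which weights to prune, which is the paper's core contribution.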
Key words
Deep reinforcement learning (RL), robotic control system, sequential multitask learning, sparse Bayesian learning (SBL)