Transformer-Based Imitative Reinforcement Learning for Multirobot Path Planning

IEEE Transactions on Industrial Informatics (2023)

Abstract
Multirobot path planning leads multiple robots from start positions to designated goal positions by generating efficient and collision-free paths. Coordination through decentralized path planning is essential for large-scale multirobot systems. State-of-the-art decentralized methods use imitation learning and reinforcement learning to train fully decentralized policies, which substantially improves their performance. However, without communication between robots, these methods do not enable robots to complete tasks efficiently in relatively dense environments. We introduce the transformer structure into policy neural networks for the first time, dramatically enhancing the ability of these networks to extract features that facilitate collaboration between robots. This work mainly focuses on improving policy performance in relatively dense multirobot environments where robots do not communicate with each other. Furthermore, we propose a novel imitative reinforcement learning framework that combines contrastive learning with a double deep Q-network, addressing the difficulty of training policy neural networks once the transformer structure is introduced. We present results in a simulation environment and compare the resulting policy against advanced multirobot path-planning methods in terms of success rate. Simulation results show that our policy achieves state-of-the-art performance when there is no communication between robots. Finally, we test the policy in a real-world case with three robots in our robotics laboratory.
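
The abstract names a double deep Q-network as one component of the training framework but does not spell it out. As a minimal sketch of the standard Double DQN bootstrap target only (not the paper's full framework, which also involves contrastive learning and a transformer-based policy), the function below uses hypothetical names (`double_dqn_target`, `q_online_next`, `q_target_next`) chosen for illustration; they do not come from the paper.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Standard Double DQN target for a single transition.

    q_online_next: Q-values of the next state from the online network.
    q_target_next: Q-values of the next state from the target network.
    All names here are illustrative assumptions, not the paper's API.
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # The online network selects the greedy next action...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ...and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


# Example: the online network prefers action 1; the target network scores it 0.5.
y = double_dqn_target(1.0, False, 0.9, [0.2, 0.8, 0.5], [1.0, 0.5, 2.0])
```

Decoupling action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from plain DQN, which would take the maximum over `q_target_next` directly.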
Key words
Feature extraction, imitation learning, multirobot path planning (MRPP), reinforcement learning, robot learning, supervision contrastive learning