
Multi-Reward Architecture Based Reinforcement Learning For Highway Driving Policies

2019 IEEE Intelligent Transportation Systems Conference (ITSC), 2019

Abstract
A safe and efficient driving policy is essential for future autonomous highway driving. However, driving policies are hard to model because of the diversity of scenes and the uncertainty of interactions with surrounding vehicles. State-of-the-art deep reinforcement learning methods are unable to learn good domain knowledge for highway driving policies when they use a single-reward architecture. This paper proposes a reinforcement learning method for highway driving policies based on a Multi-Reward Architecture (MRA). The single reward function is decomposed into multiple reward functions to better represent multi-dimensional driving policies. Apart from a large penalty for collision, the overall reward is decomposed into three component rewards: a reward for speed, a reward for overtaking, and a reward for lane changes. Each reward then trains one branch of the Q-network to capture the corresponding domain knowledge. The Q-network is divided into two parts: a low-level network shared by all three branches, and three high-level branch networks, each of which approximates the Q-values for its own reward function. The agent car chooses its action based on the sum of the Q-vectors from the three branches. Experiments are conducted on a simulation platform that simulates highway driving and in which the agent car provides commonly used sensor data: images and point clouds. The experimental results show that the proposed method performs better than DQN with a single-reward architecture on three measures: higher speed, fewer lane changes, and more overtakes, which makes it more efficient and safer for future autonomous highway driving.
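The branch structure described in the abstract (a shared low-level network, three high-level branches, and action selection on the summed Q-vectors) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the action set, layer sizes, random weights, and the names `MultiRewardQNet`, `q_values`, and `act` are all assumptions, and training and the collision penalty are left out.

```python
import random

# Hypothetical discrete action set; the abstract does not list the actions.
ACTIONS = ["keep_lane", "change_left", "change_right", "accelerate", "decelerate"]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases (placeholder for trained parameters)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiRewardQNet:
    """Shared low-level layer feeding one Q-value head per reward."""

    BRANCHES = ("speed", "overtake", "lane_change")

    def __init__(self, state_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # Low-level network shared by all branches (a single layer here, by assumption).
        self.shared = init_layer(rng, hidden_dim, state_dim)
        # One high-level head per decomposed reward.
        self.heads = {name: init_layer(rng, n_actions, hidden_dim) for name in self.BRANCHES}

    def q_values(self, state):
        """Return one Q-vector per branch, keyed by reward name."""
        features = relu(linear(*self.shared, state))
        return {name: linear(*head, features) for name, head in self.heads.items()}

    def act(self, state):
        """Pick the action that maximizes the sum of the branch Q-vectors."""
        per_branch = self.q_values(state)
        total = [sum(qs) for qs in zip(*per_branch.values())]
        best = max(range(len(total)), key=total.__getitem__)
        return best, total


net = MultiRewardQNet(state_dim=4, hidden_dim=8, n_actions=len(ACTIONS))
action_index, summed_q = net.act([0.1, 0.2, 0.3, 0.4])
print(ACTIONS[action_index], summed_q)
```

In such a setup each head would be trained against its own reward signal while gradients from all three heads flow into the shared layer; how the paper weights or schedules those updates is not stated in the abstract.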
Keywords
highway driving policies, safe driving policy, autonomous highway driving, deep reinforcement learning method, single reward architecture, Multi-Reward Architecture based reinforcement, single reward function, multi-reward functions, multi-dimensional driving policies, reward functions, highway driving process, three dimensional rewards