Chrome Extension
WeChat Mini Program
Use on ChatGLM

A Self-Adaptive Reinforcement-Exploration Q-Learning Algorithm

SYMMETRY-BASEL(2021)

Cited 13|Views6
No score
Abstract
Directing at various problems of the traditional Q-Learning algorithm, such as heavy repetition and disequilibrium of explorations, the reinforcement-exploration strategy was used to replace the decayed epsilon-greedy strategy in the traditional Q-Learning algorithm, and thus a novel self-adaptive reinforcement-exploration Q-Learning (SARE-Q) algorithm was proposed. First, the concept of behavior utility trace was introduced in the proposed algorithm, and the probability for each action to be chosen was adjusted according to the behavior utility trace, so as to improve the efficiency of exploration. Second, the attenuation process of exploration factor epsilon was designed into two phases, where the first phase centered on the exploration and the second one transited the focus from the exploration into utilization, and the exploration rate was dynamically adjusted according to the success rate. Finally, by establishing a list of state access times, the exploration factor of the current state is adaptively adjusted according to the number of times the state is accessed. The symmetric grid map environment was established via OpenAI Gym platform to carry out the symmetrical simulation experiments on the Q-Learning algorithm, self-adaptive Q-Learning (SA-Q) algorithm and SARE-Q algorithm. The experimental results show that the proposed algorithm has obvious advantages over the first two algorithms in the average number of turning times, average inside success rate, and number of times with the shortest planned route.
More
Translated text
Key words
reinforcement learning,Q-Learning algorithm,self-adaptive Q-Learning algorithm,self-adaptive reinforcement-exploration strategy,path planning
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined