Minimizing the Cost of Spatiotemporal Searches Based on Reinforcement Learning with Probabilistic States

Wireless Communications & Mobile Computing (2022)

Abstract
Portraying the trajectories of certain vehicles effectively is of great significance for urban public safety. Specifically, we aim to determine the location of a vehicle at a specific past moment. In some situations, the waypoints of the vehicle's trajectory are not directly available, but the vehicle's image may be contained in massive camera video records. Since these records are indexed only by location and moment, rather than by contents such as license plate numbers, finding the vehicle in these records is a time-consuming task. To minimize the cost of spatiotemporal search (a spatiotemporal search is the effort to check whether the vehicle appears at a specified location at a specified moment), this paper proposes a reinforcement learning algorithm called Quasi-Dynamic Programming (QDP), an improved form of Q-learning. QDP selects the searching moment iteratively based on known past locations, considering both the cost efficiency of the current action and its potential impact on subsequent actions. Unlike traditional Q-learning, QDP has probabilistic states during training. To address the problem of probabilistic states, we make the following contributions: 1) we replace the single next state with multiple possible states drawn from a probability distribution; 2) we estimate the expected cost of subsequent actions to compute the value function; 3) we create a random state and action in each loop to train the value function progressively. Finally, experiments are conducted on real-world vehicle trajectories, and the results show that the proposed QDP outperforms previous greedy-based algorithms and other baselines.
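The three contributions above can be illustrated with a minimal sketch. This is not the paper's implementation: the state encoding, cost values, and transition distribution below are hypothetical placeholders; the sketch only shows the general shape of a Q-learning-style update where the single observed next state is replaced by an expectation over a distribution of possible next states, with random state-action pairs generated each loop.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical action set (e.g., candidate searching moments)

def qdp_update(Q, state, action, cost, next_state_dist, alpha=0.1, gamma=0.9):
    """One value-function update with a probabilistic next state.

    next_state_dist maps each possible next state to its probability,
    so the Bellman backup uses the *expected* cost of subsequent actions
    rather than a single sampled transition.
    """
    expected_future = sum(
        p * min(Q[(s2, a2)] for a2 in ACTIONS)  # best (lowest-cost) follow-up action
        for s2, p in next_state_dist.items()
    )
    target = cost + gamma * expected_future
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]

Q = defaultdict(float)
random.seed(0)
# Toy training loop: a random state and action are created in each iteration,
# mirroring contribution (3); the transition distribution is assumed.
for _ in range(200):
    s = random.randint(0, 4)
    a = random.choice(ACTIONS)
    dist = {(s + 1) % 5: 0.7, s: 0.3}  # assumed probabilistic next states
    qdp_update(Q, s, a, cost=1.0, next_state_dist=dist)
```

Because the search cost here is strictly positive, repeated updates drive every visited state-action value above zero, converging toward the expected cumulative search cost under this toy model.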
Keywords
spatiotemporal searches, reinforcement learning, cost