Real-Time Rideshare Driver Supply Values Using Online Reinforcement Learning

Benjamin Han, Hyungjun Lee, Sébastien Martin

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022)

Abstract

In this paper, we present Online Supply Values (OSV), a system that estimates the expected return of available rideshare drivers and uses those estimates to match drivers to ride requests at Lyft. Because a driver's future state can be accurately predicted from a request's destination, the expected action value of assigning a ride request to an available driver can be estimated by modeling dispatch as a Markov Decision Process and applying the Bellman Equation. These estimates are updated using temporal-difference learning and are shown to adapt to changing marketplace conditions in real time. While reinforcement learning has been studied for rideshare dispatch, fully online approaches without offline priors or other guardrails had never been evaluated in the real world. This work presents the algorithmic changes needed to bridge this gap. OSV is now deployed globally as a core component of Lyft's dispatch matching system. Our A/B user experiments in major US cities measure a +(0.96±0.53)% increase in the request fulfillment rate and a +(0.73±0.22)% increase in profit per passenger session over the previous algorithm.
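The abstract does not spell out the update rule, so the following is only a minimal sketch of a generic tabular TD(0) update of the kind the abstract alludes to, not the OSV algorithm itself. The class name `SupplyValueTable`, the state labels, the reward values, the learning rate, and the discount factor are all hypothetical choices made for illustration; the deployed system includes algorithmic changes that are not reproduced here.

```python
from collections import defaultdict


class SupplyValueTable:
    """Illustrative tabular TD(0) estimate of driver state values.

    Hypothetical sketch: states are plain labels (for example a coarse
    location bucket), not the state representation used in the paper.
    """

    def __init__(self, learning_rate=0.1, discount=0.9):
        self.learning_rate = learning_rate
        self.discount = discount
        self.values = defaultdict(float)  # unseen states start at 0.0

    def action_value(self, reward, destination_state):
        # Bellman backup: immediate reward plus the discounted value of the
        # state the driver is predicted to reach (the request destination).
        return reward + self.discount * self.values[destination_state]

    def update(self, driver_state, reward, destination_state):
        # TD(0): move the current estimate toward the one-step target.
        target = self.action_value(reward, destination_state)
        td_error = target - self.values[driver_state]
        self.values[driver_state] += self.learning_rate * td_error
        return td_error


# Two observed trips from "downtown" to "airport", each with reward 10.0.
table = SupplyValueTable(learning_rate=0.5, discount=0.9)
first_error = table.update("downtown", reward=10.0, destination_state="airport")
second_error = table.update("downtown", reward=10.0, destination_state="airport")
```

Because every update is applied as soon as a trip is observed, the estimate for a state drifts toward recent outcomes, which is the sense in which such values can track changing marketplace conditions. A dispatcher could then rank candidate assignments with `action_value`, although how OSV combines these scores in matching is not described in the abstract.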

Keywords

online reinforcement learning, driver, values, real-time
