A Primal-Dual Online Algorithm For Online Matching Problem In Dynamic Environments

THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE(2021)

引用 1|浏览91
暂无评分
摘要
Recently, the online matching problem has attracted much attention due to its wide application on real-world decision-making scenarios. In stationary environments, by adopting the stochastic user arrival model, existing methods are proposed to learn dual optimal prices and are shown to achieve a fast regret bound. However, the stochastic model is no longer a proper assumption when the environment is changing, leading to an optimistic method that may suffer poor performance. In this paper, we study the online matching problem in dynamic environments in which the dual optimal prices are allowed to vary over time. We bound the dynamic regret of online matching problem by the sum of two quantities, including a regret of online max-min problem and a dynamic regret of online convex optimization (OCO) problem. Then we propose a novel online approach named Primal-Dual Online Algorithm (PDOA) to minimize both quantities. In particular, PDOA adopts the primal-dual framework by optimizing dual prices with the online gradient descent (OGD) algorithm to eliminate the online max-min problem's regret. Moreover, it maintains a set of OGD experts and combines them via an expert-tracking algorithm, which gives a sublinear dynamic regret bound for the OCO problem. We show that PDOA achieves an O(K root T(1 + P-T)) dynamic regret where K is the number of resources, T is the number of iterations and P-T is the path-length of any potential dual price sequence that reflects the dynamic environment. Finally, experiments on real applications exhibit the superiority of our approach.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要