Optimal Algorithms for Online Convex Optimization with Adversarial Constraints
arXiv (2023)
Abstract
A well-studied generalization of the standard online convex optimization
(OCO) is constrained online convex optimization (COCO). In COCO, on every
round, a convex cost function and a convex constraint function are revealed to
the learner after the action for that round is chosen. The objective is to
design an online policy that simultaneously achieves a small regret while
ensuring a small cumulative constraint violation (CCV) against an adaptive
adversary interacting over a horizon of length T. A long-standing open
question in COCO is whether an online policy can simultaneously achieve
O(√(T)) regret and O(√(T)) CCV without any restrictive assumptions.
For the first time, we answer this in the affirmative and show that an online
policy can simultaneously achieve O(√(T)) regret and
Õ(√(T)) CCV. Furthermore, in the case of strongly convex cost and
convex constraint functions, the regret guarantee can be improved to O(log
T) while keeping the CCV bound the same as above. We establish these results
by effectively combining the adaptive regret bound of the AdaGrad algorithm
with Lyapunov optimization, a classic tool from control theory. Surprisingly,
the analysis is short and elegant.
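The combination described above can be illustrated with a minimal sketch. The code below is a hypothetical drift-plus-penalty loop, not the paper's exact algorithm: a virtual queue `Q` accumulates constraint violation (the Lyapunov component), the played point is updated along a surrogate gradient that weights the constraint gradient by `Q`, and the step size shrinks adaptively with the accumulated squared gradient norms in the style of AdaGrad. All function names and the projection onto a norm ball are illustrative assumptions.

```python
import numpy as np

def coco_sketch(cost_grads, cons_funcs, cons_grads, T, d, radius=1.0):
    """Hypothetical COCO sketch (not the paper's algorithm):
    AdaGrad-style steps on a drift-plus-penalty surrogate.

    cost_grads[t](x) -> gradient of the round-t cost at x
    cons_funcs[t](x) -> round-t constraint value g_t(x) (feasible iff <= 0)
    cons_grads[t](x) -> gradient of g_t at x
    """
    x = np.zeros(d)          # first action
    Q = 0.0                  # virtual queue tracking cumulative violation
    G2 = 1e-8                # accumulated squared surrogate-gradient norms
    actions = []
    for t in range(T):
        actions.append(x.copy())
        gf = cost_grads[t](x)            # cost gradient at the played point
        gv = cons_grads[t](x)            # constraint gradient
        surrogate = gf + Q * gv          # drift-plus-penalty direction
        G2 += surrogate @ surrogate
        eta = radius / np.sqrt(G2)       # adaptive (AdaGrad-style) step size
        x = x - eta * surrogate
        nrm = np.linalg.norm(x)          # project back onto the norm ball
        if nrm > radius:
            x *= radius / nrm
        Q = max(0.0, Q + cons_funcs[t](x))   # queue grows with violation
    return actions
```

On a toy instance with a fixed quadratic cost and the constraint x[0] <= 0.5, the queue pushes the iterates back toward the feasible half-space whenever the cost pulls them out, which is the intuition behind bounding regret and CCV simultaneously.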