Accelerated Gradient Descent via Long Steps

arXiv (Cornell University)(2023)

引用 0|浏览4
暂无评分
摘要
Recently Grimmer [1] showed for smooth convex optimization by utilizing longer steps periodically, gradient descent's state-of-the-art O(1/T) convergence guarantees can be improved by constant factors, conjecturing an accelerated rate strictly faster than O(1/T) could be possible. Here we prove such a big-O gain, establishing gradient descent's first accelerated convergence rate in this setting. Namely, we prove a O(1/T^{1.02449}) rate for smooth convex minimization by utilizing a nonconstant nonperiodic sequence of increasingly large stepsizes. It remains open if one can achieve the O(1/T^{1.178}) rate conjectured by Das Gupta et. al. [2] or the optimal gradient method rate of O(1/T^2). Big-O convergence rate accelerations from long steps follow from our theory for strongly convex optimization, similar to but somewhat weaker than those concurrently developed by Altschuler and Parrilo [3].
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要