Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
CoRR (2024)
Abstract
Despite the success of the Adam optimizer in practice, the theoretical
understanding of its algorithmic components remains limited. In
particular, most existing analyses of Adam establish convergence rates that
can also be achieved by non-adaptive algorithms such as SGD. In this work, we provide
a different perspective based on online learning that underscores the
importance of Adam's algorithmic components. Inspired by Cutkosky et al.
(2023), we consider the framework called online learning of updates, where we
choose the updates of an optimizer based on an online learner. With this
framework, the design of a good optimizer is reduced to the design of a good
online learner. Our main observation is that Adam corresponds to a principled
online learning framework called Follow-the-Regularized-Leader (FTRL). Building
on this observation, we study the benefits of its algorithmic components from
the online learning perspective.
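To make the correspondence concrete, the following is a minimal sketch (in Python/NumPy, not taken from the paper) of the online-learning-of-updates view: the iterate moves by increments delta_t chosen by a discounted FTRL learner on linear losses <g_t, delta>. With a quadratic regularizer scaled by the square root of the discounted squared-gradient sum, the closed-form FTRL minimizer coincides with Adam's update (bias correction omitted for brevity). The function name and hyperparameter defaults are illustrative assumptions.

```python
import numpy as np

def discounted_ftrl_updates(grads, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Sketch: online learning of updates via discounted FTRL.

    At each round the learner picks the next update delta_t by minimizing
    the beta1-discounted sum of linear losses <g_s, delta> plus a quadratic
    regularizer scaled (coordinatewise) by sqrt(v_t), where v_t is the
    beta2-discounted sum of squared gradients. The closed-form minimizer
    reproduces Adam's update -lr * m_t / (sqrt(v_t) + eps), without bias
    correction. Hypothetical helper, for illustration only.
    """
    m = np.zeros_like(grads[0])  # discounted gradient sum (Adam's first moment)
    v = np.zeros_like(grads[0])  # discounted squared-gradient sum (second moment)
    deltas = []
    for g in grads:
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        # FTRL closed form: argmin_d <m, d> + (sqrt(v) / (2 * lr)) * ||d||^2
        deltas.append(-lr * m / (np.sqrt(v) + eps))
    return deltas
```

On any stream of gradients, these increments match Adam's (unbias-corrected) steps coordinate by coordinate, which is the sense in which designing the optimizer reduces to designing the online learner.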