Markov chain stochastic DCA and applications in deep learning with PDEs regularization

NEURAL NETWORKS(2024)

引用 0|浏览1
暂无评分
摘要
This paper addresses a large class of nonsmooth nonconvex stochastic DC (difference-of-convex functions) programs where endogenous uncertainty is involved and i.i.d. (independent and identically distributed) samples are not available. Instead, we assume that it is only possible to access Markov chains whose sequences of distributions converge to the target distributions. This setting is legitimate as Markovian noise arises in many contexts including Bayesian inference, reinforcement learning, and stochastic optimization in high-dimensional or combinatorial spaces. We then design a stochastic algorithm named Markov chain stochastic DCA (MCSDCA) based on DCA (DC algorithm) -a well-known method for nonconvex optimization. We establish the convergence analysis in both asymptotic and nonasymptotic senses. The MCSDCA is then applied to deep learning via PDEs (partial differential equations) regularization, where two realizations of MCSDCA are constructed, namely MCSDCA-odLD and MCSDCA-udLD, based on overdamped and underdamped Langevin dynamics, respectively. Numerical experiments on time series prediction and image classification problems with a variety of neural network topologies show the merits of the proposed methods.
更多
查看译文
关键词
DC programming and DCA,Markov chain Monte Carlo,Deep learning,Partial differential equations
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要