Online Partial Least Square Optimization: Dropping Convexity for Better Efficiency

arXiv preprint arXiv:1702.08134 (2017)

Abstract
Multiview representation learning is widely used for latent factor analysis. It arises naturally in many data analysis, machine learning, and information retrieval applications to model dependence structures among multiple data sources. For computational convenience, existing approaches usually formulate multiview representation learning as a convex optimization problem, whose global optimum can be obtained by certain algorithms in polynomial time. However, considerable empirical evidence shows that heuristic nonconvex approaches also achieve good computational performance and convergence to the global optimum, although theoretical justification is lacking. This gap between theory and practice motivates us to study a nonconvex formulation for multiview representation learning, which can be efficiently solved by two stochastic gradient descent (SGD) algorithms. Theoretically, by analyzing the dynamics of the algorithms via diffusion processes, we establish global rates of convergence to the global optimum with high probability. Numerical experiments are provided to support our theory.
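To make the nonconvex formulation concrete, the sketch below shows an Oja-style stochastic gradient scheme for the leading partial least squares pair: it maximizes u^T E[xy^T] v over unit vectors u and v, i.e., it estimates the top singular-vector pair of the cross-covariance between the two views from streaming samples. This is a minimal illustration of the general approach, not necessarily the exact algorithms analyzed in the paper; the step size, stream interface, and toy data generator are assumptions for the example.

```python
import numpy as np

def online_pls_sgd(stream, d_x, d_y, eta=0.01, steps=20000, seed=0):
    """Oja-style SGD sketch for the leading PLS pair.

    Maximizes u^T E[x y^T] v over unit vectors u, v, i.e., the top
    singular-vector pair of the cross-covariance matrix of the two views.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=d_x); u /= np.linalg.norm(u)
    v = rng.normal(size=d_y); v /= np.linalg.norm(v)
    for _ in range(steps):
        x, y = next(stream)        # one paired sample from the two views
        s_u = y @ v                # scalar coefficients of the stochastic gradient
        s_v = x @ u
        u += eta * s_u * x         # gradient step on u for u^T (x y^T) v
        v += eta * s_v * y         # gradient step on v
        u /= np.linalg.norm(u)     # project back to the unit sphere
        v /= np.linalg.norm(v)
    return u, v

# Toy usage (hypothetical data): two views sharing one latent factor.
def toy_stream(d_x=20, d_y=30, seed=1):
    rng = np.random.default_rng(seed)
    a = rng.normal(size=d_x); a /= np.linalg.norm(a)
    b = rng.normal(size=d_y); b /= np.linalg.norm(b)
    while True:
        z = rng.normal()
        yield (z * a + 0.1 * rng.normal(size=d_x),
               z * b + 0.1 * rng.normal(size=d_y))

u, v = online_pls_sgd(toy_stream(), d_x=20, d_y=30)
```

The per-step normalization keeps the iterates on the sphere, which is the standard way to handle the unit-norm constraints in this kind of nonconvex stochastic approximation.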