Title:

Sub-sampled Second-order Newton Methods for Large-scale Machine Learning

Abstract:

A major challenge for large-scale machine learning, and one that will only grow in importance as models become increasingly domain-informed, is to move beyond high-variance first-order optimization methods. Here, we consider the problem of minimizing the sum of a large number of functions over a convex constraint set, a problem that arises in many data analysis and machine learning applications, as well as in many more traditional scientific applications. For this class of problems, we establish improved bounds for algorithms that incorporate sub-sampling as a way to improve computational efficiency while maintaining their original convergence properties. These methods exploit recent results from Randomized Linear Algebra on approximate matrix multiplication. Within the context of second-order methods, our bounds yield quantitative convergence results for variants of Newton's method in which the Hessian and/or the gradient is uniformly or non-uniformly sub-sampled, under much weaker assumptions than in prior work.
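
Illustration (not part of the paper's analysis): the problem considered is of the form min_{x in C} F(x) = (1/n) * sum_{i=1}^n f_i(x) for a convex set C, and the basic algorithmic idea is to replace the exact Hessian (and possibly the gradient) in a Newton step with an estimate built from a small sub-sample of the f_i. The following is a minimal sketch of a uniformly sub-sampled Newton iteration for an unconstrained least-squares instance, written in NumPy; the sample size, step size, and ridge regularizer are illustrative assumptions, not the parameter settings or sampling schemes analyzed in the paper.

# Minimal sketch of a sub-sampled Newton step for the unconstrained
# finite-sum least-squares problem F(x) = (1/n) * sum_i 0.5*(a_i^T x - b_i)^2.
# Sample size, step size, and regularizer below are illustrative choices only.
import numpy as np

def subsampled_newton(A, b, num_iters=20, sample_size=200, alpha=1.0, reg=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(num_iters):
        # Full gradient: (1/n) * A^T (A x - b).
        grad = A.T @ (A @ x - b) / n
        # Uniformly sub-sample rows to form an approximate Hessian
        # H_S = (1/|S|) * A_S^T A_S in place of the full (1/n) * A^T A.
        idx = rng.choice(n, size=min(sample_size, n), replace=False)
        A_S = A[idx]
        H_S = A_S.T @ A_S / len(idx)
        # Newton-type step with a small ridge term for numerical stability.
        step = np.linalg.solve(H_S + reg * np.eye(d), grad)
        x = x - alpha * step
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((5000, 50))
    x_true = rng.standard_normal(50)
    b = A @ x_true + 0.01 * rng.standard_normal(5000)
    x_hat = subsampled_newton(A, b)
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))

The cost saving comes from forming the Hessian estimate from sample_size rows rather than all n; non-uniform variants would instead draw rows with probabilities reflecting their importance (e.g., row norms), which is one of the sampling schemes the abstract alludes to.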