Analyzing non-convex optimization: matrix completion and linear residual networks
Date: Wednesday, November 30, 2016
Time: 4:00 PM to 5:00 PM Note: all times are in the Eastern Time Zone
Host: Aleksander Madry
Contact: Aleksander Madry, email@example.com
Speaker URL: None
TALK: Analyzing non-convex optimization: matrix completion and linear residual networks
Non-convex optimization is the main algorithmic technique behind
many state-of-the-art results in machine learning. It has been conjectured that
the landscape of many training objectives has the nice geometric property
that all local minima are global minima, and thus admits efficient
optimization by local search algorithms such as gradient descent.
In the first part of the talk, I will show that the optimization landscape
of matrix completion, a famous problem in ML with wide applications in
recommender systems and collaborative filtering, does have this property.
This implies that (stochastic) gradient descent from arbitrary
initialization can solve matrix completion in polynomial time.
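As a toy illustration of this claim (my own sketch, not the speaker's construction), the snippet below runs plain gradient descent from small random initialization on the non-convex matrix-completion objective f(U, V) = sum over observed (i, j) of ((U V^T)_{ij} - M_{ij})^2. All problem sizes, the observation rate, and the step size are illustrative assumptions:

```python
import numpy as np

# Illustrative sketch: gradient descent from random initialization on the
# non-convex matrix-completion objective. Dimensions, rank, observation
# rate, and step size below are arbitrary choices for demonstration.
rng = np.random.default_rng(0)
n, m, r = 30, 30, 3

# Ground-truth rank-r matrix M and a random mask of observed entries.
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
mask = rng.random((n, m)) < 0.5  # observe roughly half the entries

# Small random initialization of the low-rank factors.
U = 0.1 * rng.standard_normal((n, r))
V = 0.1 * rng.standard_normal((m, r))

lr = 0.01
for _ in range(5000):
    R = mask * (U @ V.T - M)   # residual on observed entries only
    gU, gV = R @ V, R.T @ U    # gradients of the objective w.r.t. U, V
    U -= lr * gU
    V -= lr * gV

# Recovery error measured on ALL entries, including unobserved ones.
err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
print(f"relative recovery error: {err:.4f}")
```

If the "no spurious local minima" picture holds, the relative error should be small even though we never observed half the entries and initialized at random.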
Next, we will discuss linear residual networks, as a simplified model
towards the first-cut understanding of residual networks. We will give a
strikingly simple proof that arbitrarily deep linear residual networks have
no spurious critical points (=points with vanishing gradient that are not
global minima). In contrast, the landscape of standard linear neural
networks does have spurious critical points. This demonstrates that
re-parameterization using the identity shortcut connection can make the
optimization landscape significantly better behaved.
Based on joint work with Rong Ge and Jason D. Lee
(https://arxiv.org/abs/1605.07272) and with Moritz Hardt
(https://arxiv.org/abs/1611.04231).
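A quick numerical check of the contrast described above (my own toy example, not the paper's proof): for a deep plain linear network y = W_L ... W_1 x with depth at least 2, the all-zeros point is a critical point of the least-squares loss but clearly not a global minimum, i.e. a spurious critical point; with the identity shortcut, y = (I + A_L) ... (I + A_1) x, the gradient at all-zeros is generically non-zero. Dimensions and data below are arbitrary assumptions:

```python
import numpy as np

# Toy check of the landscape contrast: numerically estimate the gradient
# norm at the all-zeros point for a plain deep linear net vs. a linear
# residual net, on a random least-squares regression task.
rng = np.random.default_rng(1)
d, L, n = 4, 3, 50
X = rng.standard_normal((d, n))
Y = rng.standard_normal((d, d)) @ X  # targets from a random linear map

def loss_plain(Ws):
    """0.5 * ||W_L ... W_1 X - Y||_F^2 for a plain linear network."""
    P = np.eye(d)
    for W in Ws:
        P = W @ P
    return 0.5 * np.linalg.norm(P @ X - Y) ** 2

def loss_res(As):
    """Same loss, but with identity shortcuts: (I + A_L) ... (I + A_1)."""
    P = np.eye(d)
    for A in As:
        P = (np.eye(d) + A) @ P
    return 0.5 * np.linalg.norm(P @ X - Y) ** 2

def grad_norm(f, params, eps=1e-5):
    """Gradient norm via central finite differences over all entries."""
    g2 = 0.0
    for k in range(len(params)):
        for idx in np.ndindex(params[k].shape):
            plus = [p.copy() for p in params]
            minus = [p.copy() for p in params]
            plus[k][idx] += eps
            minus[k][idx] -= eps
            g2 += ((f(plus) - f(minus)) / (2 * eps)) ** 2
    return np.sqrt(g2)

zeros = [np.zeros((d, d)) for _ in range(L)]
print("plain net    |grad| at 0:", grad_norm(loss_plain, zeros))
print("residual net |grad| at 0:", grad_norm(loss_res, zeros))
```

The plain network's gradient vanishes at zero (every product of weight matrices contains another zero factor), while the residual parameterization sees a non-zero gradient there, matching the claim that the identity shortcut removes this spurious critical point.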
Created by Aleksander Madry at Friday, November 25, 2016 at 10:43 PM.