Analyzing non-convex optimization: matrix completion and linear residual networks

Speaker: Tengyu Ma , Princeton

Date: Wednesday, November 30, 2016

Time: 4:00 PM to 5:00 PM Note: all times are in the Eastern Time Zone

Public: Yes

Location: G575

Event Type:

Room Description:

Host: Aleksander Madry

Contact: Aleksander Madry,

Relevant URL:

Speaker URL: None

Speaker Photo:

Reminders to:

Reminder Subject: TALK: Analyzing non-convex optimization: matrix completion and linear residual networks

Non-convex optimization is the main algorithmic technique behind
many state-of-art results in machine learning. It has been conjectured that
the landscape of many training objectives has the nice geometric property
that “all local minima are global minima,” and thus admits efficient
optimization algorithms.

In the first part of the talk, I will show that the optimization landscape
of matrix completion — a famous problem in ML with wide applications in
recommender system and collaborative filtering — does have this property.
This implies that (stochastic) gradient descent from arbitrary
initialization can solve matrix completion in polynomial time.

Next, we will discuss linear residual networks, as a simplified model
towards the first-cut understanding of residual networks. We will give a
strikingly simple proof that arbitrarily deep linear residual networks have
no spurious critical points (=points with vanishing gradient that are not
global minima). In contrast, the landscape of standard linear neural
networks does have spurious critical points. This demonstrates that
re-parameterization using the identity shortcut connection can make the
optimization easier.

Based on joint works with Rong Ge and
Jason D. Lee, and with Moritz Hardt.

Research Areas:

Impact Areas:

See other events that are part of the Algorithms and Complexity Seminar Series 2016/2017.

Created by Aleksander Madry Email at Friday, November 25, 2016 at 10:43 PM.