Beyond Convolutional Neural Networks: Differentiable Visual Computing

Speaker: Tzu-Mao Li, UC Berkeley

Date: Wednesday, February 26, 2020

Time: 3:00 PM to 4:15 PM (all times are in the Eastern Time Zone)

Public: Yes

Location: 32-G882

Event Type: Seminar

Room Description: 32-G882

Host: Michael Carbin

Contact: Nathan Higgins, nhiggins@csail.mit.edu

Reminders to: seminars@csail.mit.edu

Reminder Subject: TALK: Beyond Convolutional Neural Networks: Differentiable Visual Computing

Abstract:
While convolutional neural networks have become powerful tools for processing visual data, their inflexibility raises several challenges. Firstly, most modern architectures work in 2D, and it is difficult to embed 3D knowledge. Secondly, neural networks are by design over-parametrized and have millions or billions of parameters. It is challenging to make the networks run fast for high-resolution images and videos on mobile devices. Finally, neural networks are difficult to debug and control as their behaviors are mostly governed by their parameters and the training data.

On the other hand, classical graphics algorithms that explicitly model the computation are less impacted by these issues. Still, they often do not apply as broadly as modern data-driven methods. I will talk about my research on connecting classical graphics algorithms with modern data-driven methods, by making graphics algorithms differentiable to enable optimization. This involves challenges in both algorithms and systems. Many graphics algorithms, such as 3D rendering, include discontinuities that must be handled carefully when the algorithms are differentiated. On top of this, writing and deriving efficient derivative code for graphics algorithms is tedious and error-prone. Deep learning frameworks that are designed for a small number of high-throughput neural network layers such as convolution or matrix multiplication are not sufficient for complex graphics pipelines. I develop differentiable systems for 3D rendering, image processing, and physical simulation that address these challenges.

Bio:
Tzu-Mao Li is a postdoc at the EECS department of UC Berkeley, working with Jonathan Ragan-Kelley. His research focuses on the interactions between three domains: visual computing, statistical learning, and programming systems. He connects classical graphics and imaging algorithms with modern data-driven methods to facilitate physical understanding. He uses mathematical tools from statistics and machine learning that broadly apply to graphics, vision, or even compiler problems. He also develops programming systems that simplify the efficient implementation and mathematical derivations of learnable visual computing algorithms. He did his Ph.D. in the computer graphics group at MIT CSAIL, advised by Frédo Durand. He received his B.S. and M.S. degrees in computer science and information engineering from National Taiwan University in 2011 and 2013, respectively. During his time at National Taiwan University, he was a member of the graphics group at Communication and Multimedia Lab, where he worked with Yung-Yu Chuang.

This event is not part of a series.

Created by Nathan Higgins on Tuesday, February 25, 2020 at 2:47 PM.