Beyond Convolutional Neural Networks: Differentiable Visual Computing
Tzu-Mao Li, UC Berkeley
Date: Wednesday, February 26, 2020
Time: 3:00 PM to 4:15 PM (Note: all times are in the Eastern Time Zone)
Event Type: Seminar
Room Description: 32-G882
Host: Michael Carbin
Contact: Nathan Higgins, firstname.lastname@example.org
Speaker URL: None
TALK: Beyond Convolutional Neural Networks: Differentiable Visual Computing
While convolutional neural networks have become powerful tools for processing visual data, their inflexibility raises several challenges. First, most modern architectures operate on 2D data, and it is difficult to embed 3D knowledge in them. Second, neural networks are over-parametrized by design, with millions or billions of parameters, so it is challenging to make them run fast on high-resolution images and videos, particularly on mobile devices. Finally, neural networks are difficult to debug and control, as their behavior is mostly governed by their parameters and the training data.
On the other hand, classical graphics algorithms that explicitly model the computation are less affected by these issues. Still, they often do not apply as broadly as modern data-driven methods. I will talk about my research on connecting classical graphics algorithms with modern data-driven methods by making graphics algorithms differentiable to enable optimization. This involves challenges in both algorithms and systems. Many graphics algorithms, such as 3D rendering, contain discontinuities that need to be handled with care when differentiated. On top of this, deriving and writing efficient derivative code for graphics algorithms is tedious and error-prone. Deep learning frameworks, which are designed for a small number of high-throughput neural network layers such as convolution or matrix multiplication, are not sufficient for complex graphics pipelines. I develop differentiable systems for 3D rendering, image processing, and physical simulation that address these challenges.
Tzu-Mao Li is a postdoc in the EECS department at UC Berkeley, working with Jonathan Ragan-Kelley. His research focuses on the interactions between three domains: visual computing, statistical learning, and programming systems. He connects classical graphics and imaging algorithms with modern data-driven methods to facilitate physical understanding. He uses mathematical tools from statistics and machine learning that broadly apply to graphics, vision, and even compiler problems. He also develops programming systems that simplify the efficient implementation and mathematical derivation of learnable visual computing algorithms. He received his Ph.D. from the computer graphics group at MIT CSAIL, advised by Frédo Durand. He received his B.S. and M.S. degrees in computer science and information engineering from National Taiwan University in 2011 and 2013, respectively. During his time at National Taiwan University, he was a member of the graphics group at the Communication and Multimedia Lab, where he worked with Yung-Yu Chuang.
Created by Nathan Higgins on Tuesday, February 25, 2020 at 2:47 PM.