FOUNDATIONS FOR LEARNING IN THE AGE OF BIG DATA
Maria Florina Balcan, Georgia Tech
Date: Tuesday, April 08, 2014
Time: 4:15 PM to 5:15 PM (all times are in the Eastern Time Zone)
Refreshments: 3:45 PM
Host: Ankur Moitra, TOC, CSAIL, MIT
Contact: Holly A Jones, email@example.com
Relevant URL: http://toc.csail.mit.edu/node/493
Speaker URL: None
TALK: FOUNDATIONS FOR LEARNING IN THE AGE OF BIG DATA
ABSTRACT: With the variety of applications of machine learning across science, engineering, and computing in the age of Big Data, re-examining the underlying foundations of the field has become imperative. In this talk, I will describe new models and algorithms for important emerging paradigms, specifically, interactive learning and distributed learning.
For active learning, where the algorithm itself can ask for labels of carefully chosen examples from a large pool of unannotated data with the goal of minimizing human labeling effort, I will present computationally efficient algorithms with optimal label complexity. I will also discuss learning with more general forms of interaction, as well as unexpected implications of these results for classic supervised learning paradigms.
For distributed learning, I will discuss a model that for the first time addresses the core question of what the fundamental communication requirements are for achieving accurate learning. Broadly, we consider a framework where massive amounts of data are distributed among several locations, and our goal is to learn a low-error predictor with respect to the overall distribution of data using as little communication, and as few rounds of interaction, as possible. We provide general upper and lower bounds on the amount of communication needed to learn a given class, as well as broadly applicable techniques for achieving communication-efficient learning.
Created by Holly A Jones on Friday, April 04, 2014 at 10:33 AM.