Talk: Sepideh Mahabadi: Composable Core-sets for Diversity and Coverage Maximization, and Its Application in Diverse Near Neighbor Problem

Speaker: Sepideh Mahabadi, MIT

Date: Wednesday, May 07, 2014

Time: 4:00 PM to 5:00 PM (note: all times are in the Eastern Time Zone)

Public: Yes

Location: 32-G575

Host: Ilya Razenshteyn

Contact: Rebecca Yadegar, ryadegar@csail.mit.edu

Relevant URL: http://www.ilyaraz.org/acseminar/

Reminders to: theory-seminars@csail.mit.edu, seminars@csail.mit.edu

Reminder Subject: TALK: Sepideh Mahabadi: Composable Core-sets for Diversity and Coverage Maximization, and Its Application in Diverse Near Neighbor Problem

Abstract: This talk consists of two parts.

In the first part, we consider efficient construction of "composable core-sets" for basic diversity and coverage maximization problems. A core-set for a point-set in a metric space is a subset of the point-set with the property that an approximate solution to the whole point-set can be obtained given the core-set alone. A composable core-set has the property that, for a collection of sets, the approximate solution to the union of the sets in the collection can be obtained given the union of the composable core-sets for the point-sets in the collection. Using composable core-sets, one can obtain efficient solutions to a wide variety of massive data processing applications, including nearest neighbor search, streaming algorithms and map-reduce computation.

Our main results are algorithms for constructing composable core-sets for several notions of "diversity objective functions", a topic that has attracted a significant amount of research over the last few years. The composable core-sets we construct are small and accurate: their approximation factor matches that of the best "off-line" algorithms for the relevant optimization problems up to a constant factor. Moreover, we show applications of our results to diverse nearest neighbor search, streaming algorithms and map-reduce computation. Finally, we show that for an alternative notion of diversity maximization, based on the maximum coverage problem, small composable core-sets do not exist.
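To make the composability property concrete, here is a minimal Python sketch. It assumes the diversity objective is the minimum pairwise distance and uses a greedy farthest-point heuristic to build each core-set; that heuristic and the function names (greedy_farthest, composed_solution) are illustrative assumptions, not necessarily the construction presented in the talk. Each part of the data is summarized independently, and the final solution is computed from the union of the summaries alone.

```python
from itertools import combinations


def hamming(a, b):
    """Number of coordinates in which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_farthest(points, k, dist=hamming):
    """Pick up to k points, repeatedly adding the one farthest from those chosen."""
    if not points or k <= 0:
        return []
    chosen = [0]
    chosen_set = {0}
    # nearest[j] = distance from points[j] to its closest chosen point
    nearest = [dist(p, points[0]) for p in points]
    while len(chosen) < min(k, len(points)):
        i = max((j for j in range(len(points)) if j not in chosen_set),
                key=lambda j: nearest[j])
        chosen.append(i)
        chosen_set.add(i)
        nearest = [min(nearest[j], dist(points[j], points[i]))
                   for j in range(len(points))]
    return [points[i] for i in chosen]


def diversity(points, dist=hamming):
    """Minimum pairwise distance of a set with at least two points."""
    return min(dist(a, b) for a, b in combinations(points, 2))


def composed_solution(parts, k):
    """Summarize each part on its own, then solve on the union of summaries."""
    core_union = [p for part in parts for p in greedy_farthest(part, k)]
    return greedy_farthest(core_union, k)


# Hypothetical data split across two machines (or two stream chunks).
parts = [
    [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1)],
    [(1, 1, 0, 0), (1, 1, 0, 1), (0, 0, 1, 1)],
]
solution = composed_solution(parts, k=2)
```

In a map-reduce setting, greedy_farthest would run once per machine and composed_solution's final step would run on a single reducer, which only ever sees k points per part.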

In the second part, motivated by the recent research on diversity-aware search, we investigate the k-diverse near neighbor reporting problem. The problem is defined as follows: given a query point q, report the "maximum diversity" set S of k points in the ball of radius r around q. The diversity of a set S is measured by the minimum distance between any pair of points in S (the higher, the better). We present two approximation algorithms for the case where the points live in a d-dimensional Hamming space. Our algorithms guarantee query times that are sub-linear in n and only polynomial in the diversity parameter k, as well as the dimension d.

For low values of k, our algorithms achieve sub-linear query times even if the number of points within distance r from a query q is linear in n. To the best of our knowledge, these are the first known algorithms of this type that offer provable guarantees.
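As a point of reference for the problem definition, the following Python sketch is a brute-force baseline, not one of the sub-linear algorithms from the talk. It scans all points, keeps those within Hamming distance r of the query, and tries every k-subset, so it takes time linear in n and exponential in k. The function name and the bit-string encoding of points are assumptions made for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def k_diverse_near_neighbors(points, q, r, k):
    """Exact k-diverse near neighbor reporting by exhaustive search.

    Returns a set of k points from the ball of radius r around q that
    maximizes the minimum pairwise distance. If the ball holds fewer
    than k points, all of them are returned.
    """
    ball = [p for p in points if hamming(p, q) <= r]
    if len(ball) <= k or k < 2:
        return ball[:max(k, 0)]
    best = max(
        combinations(ball, k),
        key=lambda S: min(hamming(a, b) for a, b in combinations(S, 2)),
    )
    return list(best)


# Hypothetical 4-dimensional Hamming-space data set.
points = ["0000", "0001", "0011", "1111", "1100"]
result = k_diverse_near_neighbors(points, q="0000", r=2, k=2)
```

Here "1111" lies outside the ball, and among the remaining points the pair "0011" and "1100" is the most diverse, at distance 4 from each other.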

The first part is joint work with Piotr Indyk, Mohammad Mahdian, and Vahab S. Mirrokni; the second part is joint work with Sofiane Abbar, Sihem Amer-Yahia, Piotr Indyk, and Kasturi Varadarajan.

See other events that are part of the Algorithms and Complexity Series Fall 2013 / Spring 2014.

Created by Rebecca Yadegar on Monday, May 05, 2014 at 2:36 PM.