- Harry Lang: Coresets for k-...
- Edit Event
- Cancel Event
- Preview Reminder
- Send Reminder
- Other events happening in July 2017

## Harry Lang: Coresets for k-Means-Type Problems on Streams

**Speaker:**
Harry Lang
, Johns Hopkins University

**Date:**
Wednesday, July 05, 2017

**Time:**
4:00 PM to 5:00 PM
** Note: all times are in the Eastern Time Zone**

**Public:**
Yes

**Location:**
32-G575

**Event Type:**

**Room Description:**

**Host:**
Pritish Kamath and Akshay Degwekar, CSAIL MIT

**Contact:**
Rebecca Yadegar, ryadegar@csail.mit.edu

**Speaker URL:**
None

**Speaker Photo:**

None

**Reminders to:**
seminars@csail.mit.edu, theory-seminars@csail.mit.edu

**Reminder Subject:**
TALK: RESCHEDULED: July 5, 2017: Harry Lang: Coresets for k-Means-Type Problems on Streams

Abstract: Let f be a non-negative symmetric dissimilarity measure. Given sets of points P (the input data) and C (a "query"), we define F(P,C) = \sum_{p \in P} \min_{c \in C} f(p,c). An example that fits into this framework is k-means clustering where f = distance^2 and we restrict |C| = k.

An $\epsilon$-coreset for P is a set Q such that F(Q,C) is within a factor of $(1 + \epsilon)$ of F(P,C) for any query C. For data reduction, the goal is to construct a coreset Q of minimal cardinality. We introduce improved coreset constructions for a wide class of functions that obey a weak form of the triangle inequality: this includes k-means, k-median, Tukey's bi-weight function, and the Huber loss function.

In the streaming model, the input P (a set of n points) arrives sequentially and algorithms attempt to maintain a coreset using a minimal amount of memory. We introduce a new technique for maintaining coresets in the streaming model with O(1) overhead. For example, the previous state-of-the-art for maintaining an $\epsilon$-coreset for metric k-means on a stream required the storage of $O(\epsilon^{-4} k \log k \log^6 n)$ points whereas our method requires $O(\epsilon^{-2} k \log k \log n)$ points.

Joint work with Vladimir Braverman and Dan Feldman.

**Research Areas:**

**Impact Areas:**

Created by Rebecca Yadegar at Monday, June 05, 2017 at 4:50 PM.