- Sampling Sketches for Conca...
- Edit Event
- Cancel Event
- Preview Reminder
- Send Reminder
- Other events happening in May 2019

## Sampling Sketches for Concave Sublinear Functions of Frequencies

**Speaker:**
Ofir Geri
, Stanford

**Date:**
Wednesday, May 15, 2019

**Time:**
3:00 PM to 4:00 PM

**Public:**
Yes

**Location:**
32-G575

**Event Type:**
Seminar

**Room Description:**

**Host:**
Govind Ramnarayan, Quanquan Liu, Sitan Chen

**Contact:**
Rebecca Yadegar, ryadegar@csail.mit.edu

**Relevant URL:**

**Speaker URL:**
None

**Speaker Photo:**

None

**Reminders to:**
seminars@csail.mit.edu, theory-seminars@csail.mit.edu

**Reminder Subject:**
TALK: Ofir Geri: Sampling Sketches for Concave Sublinear Functions of Frequencies

We consider massive distributed datasets that consist of elements that are key-value pairs. Our goal is to compute estimates of statistics or aggregates over the data, where the contribution of each key is weighted by a function of its frequency (sum of values of its elements). This fundamental problem has a wealth of applications in data analytics and machine learning.

A common approach is to maintain a sample of keys and estimate statistics from the sample. Ideally, to obtain low-variance estimates we sample keys with probabilities proportional to their contributions. One simple way to do so is to first aggregate the raw data to produce a table of keys and their frequencies, apply our function to the frequency values, and then apply a weighted sampling scheme. This aggregation however requires data structures of size proportional to the number of distinct keys and is too costly when the number is very large. Our main contribution is the design of composable sampling sketches that can be tailored to any concave sublinear function of the frequencies (including log, the moments x^p for 0

**Research Areas:**

Algorithms & Theory

**Impact Areas:**

Created by Rebecca Yadegar at Thursday, May 02, 2019 at 4:23 PM.