- Approximate Nearest Neighbo...
- Edit Event
- Cancel Event
- Preview Reminder
- Send Reminder
- Other events happening in September 2022
Approximate Nearest Neighbor Search algorithms for web-scale search and recommendation
Speaker:
Harsha Simhadri
, Microsoft Research
Date: Wednesday, September 28, 2022
Time: 4:00 PM to 5:00 PM Note: all times are in the Eastern Time Zone
Public: Yes
Location: https://mit.zoom.us/meeting/register/tJMtdOqgrTwsGNV_k0Nk6JjLMk-G6GFzTRyk
Event Type: Seminar
Room Description: https://mit.zoom.us/meeting/register/tJMtdOqgrTwsGNV_k0Nk6JjLMk-G6GFzTRyk
Host: Julian Shun, MIT CSAIL
Contact: Linda Lynch, 715-2459, lindalynch@csail.mit.edu
Relevant URL: http://fast-code.csail.mit.edu/
Speaker URL: https://harsha-simhadri.org/
Speaker Photo:
None
Reminders to:
fast-code-seminar@lists.csail.mit.edu, seminars@csail.mit.edu, pl@csail.mit.edu, commit@lists.csail.mit.edu, mitml@mit.edu
Reminder Subject:
TALK: Approximate Nearest Neighbor Search algorithms for web-scale search and recommendation
******************IMPORTANT NOTE ABOUT REGISTRATION******************
- The registration link for Fall 2022 is the same as the link from Spring 2022.
- Please save the Zoom link that you receive after you register. This link will stay the same for all subsequent Fast Code seminars.
- Zoom does not recognize a second registration, and will not send out the link a second time. The organizers will not be notified of any second registration.
- If you have any problems with registration, please contact lindalynch@csail.mit.edu by 3pm on the day of the seminar, so that we can try to resolve it before the seminar begins.
*********************************************************************
Abstract: Web-scale search and recommendation scenarios increasingly use Approximate Nearest Neighbor Search (ANNS) algorithms to index and retrieve objects based on the similarity of their learnt representations in a geometric space. Since these scenarios often span billions or trillions of objects, efficient and scalable ANNS algorithms are critical to making these systems practical. However, most algorithms studied in literature either focus on million-scale datasets or do not have features necessary for practical indices, e.g., support for real-time updates.
In this talk we discuss empirical progress on this problem. Specifically, we present DiskANN, the first external memory ANNS algorithm that can index a billion points and serve queries at interactive latencies (few milliseconds) on a commodity machine. This represents an order of magnitude more points indexed per machine than previous work. In addition, the index allows real-time updates and its in-memory performance compares well with other state of the art indices.
We will conclude with some open problems in this space, e.g., linearizability for updates, and scaling to trillion-point indices.
Joint work with Ravishankar Krishnaswamy, Sujas J Subramanya, Aditi Singh, Rohan Kadekodi, Devvrit, Shikhar Jaiswal, Magdalen Dobson, Siddharth Gollapudi, Neel Karia, Varun Sivasankaran.
Bio: Harsha Simhadri is a Principal Researcher at Microsoft Research. He enjoys developing new algorithms with a view towards future platforms and practical systems. Recent examples include algorithms for web-scale nearest-neighbor search deployed in various Microsoft search and recommendation scenarios, and new ML operators and architectures for tiny IoT and edge devices. He previously worked on parallel algorithms and run-times with provable guarantees for multi-core processors for his PhD thesis at Carnegie Mellon University.
Research Areas:
Algorithms & Theory, AI & Machine Learning, Programming Languages & Software
Impact Areas:
Big Data
Created by Julian J. Shun at Friday, September 23, 2022 at 2:23 PM.