Bridging Theory and Practice in Parallel Subgraph Computations

Speaker: Jessica Shi , MIT CSAIL

Date: Wednesday, April 20, 2022

Time: 4:00 PM to 5:00 PM Note: all times are in the Eastern Time Zone

Public: Yes

Location: https://mit.zoom.us/meeting/register/tJMtdOqgrTwsGNV_k0Nk6JjLMk-G6GFzTRyk

Event Type: Seminar

Room Description: https://mit.zoom.us/meeting/register/tJMtdOqgrTwsGNV_k0Nk6JjLMk-G6GFzTRyk

Host: Julian Shun, MIT CSAIL

Contact: Linda Lynch, lindalynch@csail.mit.edu

Relevant URL:

Speaker URL: http://web.mit.edu/jeshi/www/

Speaker Photo:
None

Reminders to: seminars@csail.mit.edu, fast-code-seminar@lists.csail.mit.edu, pl@csail.mit.edu, commit@lists.csail.mit.edu, toc@csail.mit.edu

Reminder Subject: TALK: Bridging Theory and Practice in Parallel Subgraph Computations

******************IMPORTANT NOTE ABOUT REGISTRATION******************
- The registration link for Spring 2022 is the same as the link from Fall 2021.
- Please save the Zoom link that you receive after you register. This link will stay the same for all subsequent Fast Code seminars.
- Zoom does not recognize a second registration, and will not send out the link a second time. The organizers will not be notified of any second registration.
- If you have any problems with registration, please contact lindalynch@csail.mit.edu by 3:30pm on the day of the seminar, so that we can try to resolve it before the seminar begins.
*********************************************************************

Abstract: Discovering dense substructures in graphs is a fundamental topic in graph mining, and has been studied across many areas including computational biology, spam and fraud detection, and large-scale network analysis. In particular, k-cliques are the core building blocks of dense substructures, and finding k-cliques in a graph is a classic problem with a long history of study in both theory and practice. In this work, we present new parallel algorithms for k-clique counting and listing, which we then use in new parallel subgraph decomposition algorithms that capture higher-order structures in networks. Specifically, we design new parallel algorithms for the k-clique densest subgraph problem and the nucleus decomposition problem, which generalizes the notions of k-cores and k-trusses.

We show that our parallel k-clique counting / listing algorithm has polylogarithmic span (parallel time) and is work-efficient (matches the work of the best sequential algorithm) for sparse graphs. We also design two new parallel work-efficient algorithms for approximating the k-clique densest subgraph. Furthermore, we present a novel work-efficient parallel algorithm with low span for the nucleus decomposition problem, which significantly improves upon the only existing parallel nucleus decomposition algorithm (Sariyuce et al., PVLDB 2018). In addition to our theoretical results, we implement these algorithms, and we introduce several new practical optimizations, including tradeoffs between performance and space complexity, and a new multi-level hash table structure to store information on cliques space-efficiently and a technique for traversing this structure cache-efficiently. We show that our implementations are fast in practice and outperform the existing state-of-the-art implementations.

Based on joint work with Laxman Dhulipala and Julian Shun.
https://arxiv.org/abs/2002.10047 (ACDA 2021)
https://arxiv.org/abs/2111.10980 (VLDB 2022)

Bio: Jessica Shi is a 4th-year PhD student at MIT in the EECS department, where she is advised by Julian Shun. She is also a Student Researcher at Google on the Graph Mining team, where she is mentored by Jakub Łącki. Jessica’s current research interests include developing shared-memory parallel graph algorithms with provable theoretical guarantees and efficient scalable implementations, with a focus on subgraph decomposition and clustering algorithms. She is supported by a 2018 NSF Graduate Research Fellowship, and she has previously received her S.M. in computer science from MIT, and her A.B. in mathematics from Princeton University.

Research Areas:
Algorithms & Theory, Programming Languages & Software

Impact Areas:
Big Data

This event is not part of a series.

Created by Julian J. Shun Email at Wednesday, April 13, 2022 at 1:01 AM.