Software-Defined Far Memory in Warehouse-Scale Computers

Speaker: Junwhan Ahn, Google

Date: Friday, April 19, 2019

Time: 2:00 PM to 3:00 PM

Public: Yes

Location: 32-G449 - KIVA

Event Type: Seminar

Host: Professor Daniel Sanchez, CSG - CSAIL - MIT

Contact: Sally O. Lee, 3-6837, sally@csail.mit.edu

Reminders to: seminars@csail.mit.edu

Reminder Subject: TALK: Software-Defined Far Memory in Warehouse-Scale Computers

Increasing memory demand and the slowdown in technology scaling pose important challenges to the total cost of ownership (TCO) of warehouse-scale computers (WSCs). One promising idea to reduce the memory TCO is to add a cheaper, but slower, "far memory" tier and use it to store infrequently accessed (or cold) data. However, introducing a far memory tier brings new challenges around dynamically responding to workload diversity and churn, minimizing stranding of capacity, and addressing brownfield (legacy) deployments.

We present a novel software-defined approach to far memory that proactively compresses cold memory pages to effectively create a far memory tier in software. Our end-to-end system design encompasses new methods to define performance service-level objectives (SLOs), a mechanism to identify cold memory pages while meeting the SLO, and our implementation in the OS kernel and node agent. Additionally, we design learning-based autotuning to periodically adapt our design to fleet-wide changes without a human in the loop. Our system has been successfully deployed across Google’s WSC since 2016, serving thousands of production services. Our software-defined far memory is significantly cheaper (67% or higher memory cost reduction) at relatively good access speeds (6 µs) and allows us to store a significant fraction of infrequently accessed data (on average, 20%), translating to significant TCO savings at warehouse scale.

This paper will be presented at ASPLOS 2019, Providence, RI, on April 15. A YouTube lightning-talk video for this paper is available at https://youtu.be/aKddds6jn1s.

*Bio*: Junwhan Ahn is a Senior Software Engineer on the Platforms team at Google. He received his Ph.D. in electrical engineering and computer science from Seoul National University in 2017. His past research focused on memory system design, emerging memory technologies, and processing in memory. His current research interests include optimizing memory/storage system design for datacenter workloads and using machine learning to optimize systems.

** Refreshments at 1:45 PM

Research Areas:
Systems & Networking

This event is not part of a series.

Created by Sally O. Lee on Tuesday, April 16, 2019 at 2:41 PM.