Comprehensive Static Instrumentation for Dynamic-Analysis Tools

Speaker: I-Ting Angelina Lee, Washington University in St. Louis

Date: Friday, December 09, 2016

Time: 10:00 AM to 12:00 PM (Note: all times are in the Eastern Time Zone)

Public: Yes

Location: 32-G575

Event Type:

Room Description:

Host: Charles Leiserson

Contact: TB Schardl, neboat@csail.mit.edu

Relevant URL:

Speaker URL: None

Reminders to: sys-fac@lists.csail.mit.edu, toc-faculty@csail.mit.edu

Reminder Subject: TALK: Comprehensive Static Instrumentation for Dynamic-Analysis Tools

ABSTRACT:

The CSI framework provides comprehensive static instrumentation that a compiler inserts into a program-under-test so that dynamic-analysis tools --- memory checkers, race detectors, cache simulators, performance profilers, code-coverage analyzers, etc. --- can observe and investigate runtime behavior. Heretofore, tools based on compiler instrumentation would each separately modify the compiler to insert their own instrumentation. In contrast, CSI inserts a standard collection of instrumentation hooks into the program-under-test. Each CSI-tool is implemented as a library that defines relevant hooks, and the remaining hooks are "nulled" out and elided during either compile-time or link-time optimization, resulting in instrumented runtimes on par with custom instrumentation. CSI allows many compiler-based tools to be written as simple libraries without modifying the compiler, greatly lowering the bar for developing dynamic-analysis tools.

We have defined a standard API for CSI and modified LLVM to insert CSI hooks into the compiler's internal representation (IR) of the program. The API organizes IR objects --- such as functions, basic blocks, and memory accesses --- into flat and compact ID spaces, which not only greatly simplifies the building of tools, but enables significantly faster maintenance of IR-object data than do traditional hash tables. CSI hooks contain a "property" parameter that allows tools to customize behavior based on static information without introducing overhead. CSI provides "forensic" tables that tools can use to relate IR objects to source-code locations and IR objects to each other.

To evaluate the efficacy of CSI, we implemented several CSI-tools. One of our studies shows that compiling with CSI and linking with the "null" CSI-tool produces a tool-instrumented executable that is as fast as the original uninstrumented code. Another study with a CSI port of Google's ThreadSanitizer shows that the CSI-tool approaches the performance of Google's custom compiler-based implementation.

In this talk, I will describe the design and implementation of CSI, present preliminary data to demonstrate the effectiveness and simplicity of CSI, and involve the audience with a hands-on demo to build a simple CSI tool.
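
To make the abstract's description of flat ID spaces and the "property" parameter concrete, here is a minimal sketch of what a tool written as a library might look like. Every name in it (tool_init, hook_func_entry, hook_before_load, func_prop_t, is_library_code) is a hypothetical stand-in chosen for illustration and is not the actual CSI API. It assumes the compiler assigns function IDs densely from 0, so the tool can keep per-function data in a plain array instead of a hash table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* All names below are hypothetical and chosen for illustration;
   they are not the actual CSI API. */
typedef int64_t obj_id_t;

/* Hypothetical static property: a fact known at compile time
   that the instrumentation passes to the hook. */
typedef struct {
    unsigned is_library_code : 1;
} func_prop_t;

static uint64_t *entry_counts = NULL; /* one slot per function ID */
static obj_id_t num_funcs = 0;        /* size of the function ID space */

/* Called once, before any hook, with the number of function IDs. */
void tool_init(obj_id_t func_count) {
    assert(func_count > 0);
    entry_counts = calloc((size_t)func_count, sizeof *entry_counts);
    if (entry_counts == NULL) abort();
    num_funcs = func_count;
}

/* A hook this tool defines. Because IDs are dense, the per-function
   record is found by array indexing rather than a hash lookup. */
void hook_func_entry(obj_id_t func_id, func_prop_t prop) {
    assert(func_id >= 0 && func_id < num_funcs);
    if (prop.is_library_code) return; /* decision based on static info */
    entry_counts[func_id]++;
}

/* A hook this tool does not need. An empty body like this is the kind
   of "null" hook an optimizer can inline away. */
void hook_before_load(obj_id_t load_id, const void *addr) {
    (void)load_id;
    (void)addr;
}

/* Accessor so a report routine can read the collected counts. */
uint64_t tool_entry_count(obj_id_t func_id) {
    assert(func_id >= 0 && func_id < num_funcs);
    return entry_counts[func_id];
}
```

In an instrumented program, calls to such hooks would be inserted by the compiler at function entries and before loads. In this sketch a driver would call tool_init once, invoke the hooks by hand, and then read the totals back with tool_entry_count.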

Materials: To follow along with the demo, please bring a laptop with CSI already set up. You can download and install CSI onto a Linux x86-64 machine as follows:

0. Make sure that you have the following prerequisites: the gold linker, binutils-devel, and cmake.
To see if you have the gold linker, run:
> whereis ld
You should see an entry for ld.gold in the output.
If you invoke it with -v, you should see output similar to:

> /usr/bin/ld.gold -v
GNU gold (version 2.23.52.0.1-55.el7 20130226) 1.11

1. Create a directory where you want the CSI-LLVM compiler to live
> mkdir CSI-LLVM
> cd CSI-LLVM

2. Download and build the CSI version of LLVM:
> git clone --recursive git@github.com:csi-llvm/llvm.git
> cd llvm/csi
> ./build.sh
The build script may fail if your gold linker and plugin-api.h (from binutils-devel) are installed in locations other than the ones it expects. If so, update the build script to reflect their correct locations.

BIO:
I-Ting Angelina Lee has been an Assistant Professor in the Department of Computer Science and Engineering at Washington University in St. Louis since Fall 2014. Prior to that, she worked with the Supertech research group at the Massachusetts Institute of Technology, led by Professor Charles Leiserson, for her graduate study and subsequently as a postdoctoral associate.
Dr. Lee's research focuses on advancing software technologies for parallel computing. She is interested in many aspects of parallel computing, including designing programming models and linguistic constructs to simplify parallel programming, developing runtime and operating system support to execute multithreaded programs efficiently, and building software tools to aid debugging and performance engineering of multithreaded code. She received the Best Paper Award at the Symposium on Parallelism in Algorithms and Architectures (SPAA) in 2012.

Research Areas:

Impact Areas:

This event is not part of a series.

Created by Cree Bruins on Thursday, December 08, 2016 at 11:01 AM.