Learning representations of protein from sequence, structure, and network

Speaker: Jian Peng, University of Illinois, Urbana-Champaign

Date: Wednesday, March 22, 2017

Time: 11:30 AM to 1:00 PM (Eastern Time)

Refreshments: 11:15 AM

Public: Yes

Location: 32-G575

Host: Bonnie Berger

Contact: Patrice Macaluso, 617-253-3037, macaluso@csail.mit.edu

Reminders to: bergerlab-core@mit.edu, seminars@csail.mit.edu, bioinfo-seminar@lists.csail.mit.edu

Understanding protein structure and protein-protein interactions is crucial for studying molecular pathways and gaining insight into biochemical processes. Data-driven approaches for predicting protein structure and interaction have recently improved, partly due to advances in machine learning. The success of machine learning algorithms often depends on data representation, which encodes the explanatory factors of variation behind the data. Although prior knowledge from protein science can help in designing good representations of proteins, powerful techniques capable of identifying protein patterns and sharing insights across diverse datasets are still needed. In this talk, I will discuss three recent works on learning protein representations from sequence, structure, and network data. First, I will introduce DeepContact, a deep convolutional neural network (CNN)-based approach that identifies conserved structural motifs, automatically and effectively leveraging patterns of residue-residue contacts to enable accurate inference of contact probabilities. Second, I will discuss DeepFold, another CNN-based approach, which extracts structural motifs from protein structures to enable accurate and efficient alignment-free structure search. Lastly, I will present Mashup, a feature learning algorithm that integrates protein-protein interaction networks for functional inference. In addition to achieving state-of-the-art performance, we expect these representation learning algorithms to provide deep, biologically meaningful insights into the organizational structure of protein folds and interaction networks.

See other events that are part of the Bioinformatics Seminar Series 2017.

Created by Patrice Macaluso on Monday, March 20, 2017 at 10:53 AM.