Sequence-structure-function modeling for DNA

Speaker: Katie Pollard, University of California San Francisco

Date: Wednesday, March 10, 2021

Time: 11:30 AM to 1:00 PM (Note: all times are in the Eastern Time Zone)

Public: Yes

Location: Zoom:

Event Type: Seminar

Host: Bonnie Berger, CSAIL & Mathematics

Contact: Patrice Macaluso, 617-253-3037

The human genome sequence folds in three dimensions (3D) into a rich variety of locus-specific contact patterns. Despite growing appreciation for the importance of 3D genome folding in evolution and disease, we lack models for relating mutations in genome sequences to changes in genome structure and function. Towards that goal, we discovered that the organization of gene regulatory domains within chromosomes and the specific sequences that sit at boundaries between domains are under strong negative selection in the human population and over primate evolution. Motivated by this signature of functional importance, we developed a deep convolutional neural network, called Akita, that accurately predicts genome folding from DNA sequence alone. Representations learned by Akita underscore the importance of the structural protein CTCF but also reveal a complex grammar beyond CTCF binding sites that underlies genome folding. Akita enabled rapid in silico predictions for effects of sequence mutagenesis on the 3D genome, including differences in genome folding across species and in disease cohorts, which we are validating with CRISPR-edited genomes. This prediction-first strategy exemplifies my vision for a more proactive, rather than reactive, role for data science in biomedical research.

See other events that are part of the Bioinformatics Seminar Series 2021.

Created by Patrice Macaluso on Wednesday, March 03, 2021 at 2:09 PM.