Doctoral Thesis: Scalable methods for storage, processing and analysis of sequencing datasets

Speaker: Deniz Yorukoglu, MIT

Date: Thursday, January 19, 2017

Time: 2:00 PM to 3:00 PM Note: all times are in the Eastern Time Zone

Refreshments: 12:00 AM

Public: Yes

Location: 32-G575

Event Type:

Room Description:

Host: Bonnie Berger, MIT/Math

Contact: Patrice Macaluso, 617-253-3037

Relevant URL:

Speaker URL: None

Speaker Photo:

Reminders to:

Reminder Subject: TALK: Doctoral Thesis

Massive amounts of next-generation sequencing (NGS) reads generated by sequencing machines around the world have revolutionized biotechnology, enabling wide-scale disease and variation studies and personalized medicine, and helping us understand our evolutionary history. However, the amount of sequencing data generated every day is increasing at an exponential rate, creating a pressing need for smart algorithmic solutions that can handle massive sequencing datasets and efficiently extract the useful knowledge within them. This thesis consists of four research contributions on these two fronts. First, we present a computational framework that leverages the redundancy within large genomic datasets to perform faster read mapping while improving sensitivity. Second, we describe a lossy compression method for quality scores within sequencing datasets that strikingly improves the downstream accuracy of genotyping. Third, we introduce a Bayesian framework for accurate diploid and polyploid haplotype reconstruction of an individual genome using NGS datasets. Lastly, we extend this haplotype reconstruction framework to high-throughput transcriptome sequencing datasets.
Thesis Supervisor: Bonnie Berger
Title: Professor of Applied Mathematics

Research Areas:

Impact Areas:

This event is not part of a series.

Created by Patrice Macaluso on Thursday, January 19, 2017 at 10:18 AM.