Accelerating Genomic Sequence Compression with Graphics Processors

MPhil Thesis Defence


Title: "Accelerating Genomic Sequence Compression with Graphics Processors"

By

Miss Yuwei Tan


Abstract

A modern sequencing instrument can generate hundreds of millions of 
short reads of genomic data per day. As a result, there is an 
urgent need to develop fast algorithms that can efficiently handle, store, 
compress, access, and decompress these data. This thesis focuses on 
specialized schemes that quickly compress and decompress large genomic 
data sets. Specifically, we develop lightweight compression schemes 
for FASTQ/FASTA format data, as well as for sequence alignment output 
data. Furthermore, we leverage the Graphics Processing Unit's (GPU's) 
massively parallel architecture, high density of arithmetic logic units, 
and superior memory bandwidth to significantly accelerate compression and 
decompression. We demonstrate that our GPU-powered custom compression 
schemes achieve compression ratios similar to or better than those of 
general-purpose compression algorithms on sequence data. Finally, we 
integrate our compression techniques into state-of-the-art alignment 
tools and improve their overall speed by an order of magnitude, mainly 
because of the reduced I/O cost.


Date:			Friday, 25 May 2012

Time:			2:00pm – 4:00pm

Venue:			Room 1504
 			Lifts 25/26

Committee Members:	Dr. Qiong Luo (Supervisor)
 			Prof. Frederick Lochovsky (Chairperson)
 			Dr. Raymond Wong (ECE)


**** ALL are Welcome ****