University of Houston

Department of Computer Science

In Partial Fulfillment of the Requirements for the Degree of
Master of Science

Rui Zhu

Will defend her thesis


Algorithms and Data Structures to Detect Oncoviruses in Human Cancer Using Next Generation Sequencing Data

Abstract

Evidence suggests that human cancer can be induced by viruses. One way to test this hypothesis is to look for viral sequences in the human cancer genome. Next Generation Sequencing (NGS) technology can sequence a whole human genome in a short period, which opens the door to a systematic analysis of the human genome and a thorough search for oncogenic viral sequences in cancer. However, the huge number of sequencing reads generated by NGS poses a great computational challenge for data analysis in terms of computing speed and memory usage. Data structures such as hashes and trees are widely used to improve the performance of computing algorithms. Here I describe both data structures as they have been developed in our center and compare their performance. The hash-based structure outperformed the tree-based one when mapping reads to a small reference sequence database. Subsequently, real human cancer data were analyzed using the hash-based mapper, and different oncoviral sequences were found in different cancers.

Date: Thursday, November 15, 2012
Time: 1:00 PM
Place: 4018-SERC

Faculty, students, and the general public are invited.
Advisor: Prof. Yuriy Fofanov