Dissertation Proposal
In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Mohammad Tanvir Rahman
will defend his dissertation proposal
Performance Models for Parallel Applications Under Failures
Abstract
Due to the growing size of compute clusters, large-scale parallel applications increasingly have to deal with hardware malfunctions and other failure scenarios during execution. The overall goal of this research is to achieve good performance for parallel applications despite failures. This dissertation introduces mathematical models and techniques to improve the resilience of parallel applications.
The first part derives a mathematical model that minimizes job completion time for interdependent parallel processes running in a volunteer environment by finding the optimal checkpoint interval. Validation is performed with a sample real-world application running on a pool of distributed volunteer nodes.
The second part evaluates the performance of Hadoop MapReduce applications with different execution parameters and under different failure scenarios. The dissertation also presents new techniques for injecting failures into MapReduce applications to simulate real-world failures. The final goal is to develop performance models for MapReduce applications that account for node and process failures, which makes it possible to determine optimal parameter values for the Hadoop execution environment.
Date: Friday, July 21, 2017
Time: 1:30 PM
Place: PGH 501D
Advisors: Dr. Edgar Gabriel, Dr. Jaspal Subhlok
Faculty, students, and the general public are invited.