In Partial Fulfillment of the Requirements for the Degree of

Master of Science

Will defend his thesis

Scientific kernels such as FFT, BLAS, and stencils have been subjects of active research for decades. The performance of a numerical algorithm is commonly reported in GFlops, the number of floating-point arithmetic operations (in billions) completed per second on the hardware under consideration. Because these results expose the actual throughput of the underlying hardware, such algorithms are often used as benchmarks to evaluate the performance of a machine. A careful study of applications and their electrical energy consumption under varying input conditions is required to achieve power efficiency at the exascale level. Power-efficient processor designs and dynamic control of processor frequency have been employed to limit machine energy consumption. However, there is also a need to introduce electrical energy consumption as an evaluation metric, in order to have a clear idea of an application's performance per watt or performance per dollar spent on energy. This is especially important given that supercomputing facilities are known to spend millions of dollars every year on electricity.

This thesis provides insights for designing processors that are efficient with respect to energy consumption, and that allow compilers to apply economical scheduling techniques to applications running in a heterogeneous computing environment. It discusses the energy efficiency of kernels such as FFT, DGEMM, stencils, and pseudo-random number generators, which are widely used across disciplines of high-performance computing. A power analyzer was used to measure the electrical power usage of the multi-GPU node under inspection. An API was written to interface with the analyzer remotely and obtain instantaneous power readings. The results show that the power and energy behavior of different application kernels reflects their computation-communication patterns. Conversely, it is possible to provide a reasonable estimate of the power and energy characteristics of a given application if its computation and communication overhead can be determined.

**Date:** Friday, April 20, 2012

**Time:** 10:00 AM

**Place:** 550-PGH

Faculty, students, and the general public are invited.

Advisor: Prof. Barbara Chapman