Computer Science Seminar - University of Houston

Computer Science Seminar

High Performance Computing at Extreme Scale: Resilience, Energy Efficiency, and Scalability

When: Friday, March 3, 2017
Where: PGH 563
Time: 11:00 AM – Noon

Speaker: Panruo Wu, University of California, Riverside

Host: Dr. Gopal Pandurangan

The advancement of science, technology, and society increasingly depends on large-scale computation to simulate physical, chemical, biological, and social processes and to analyze huge amounts of data at high speed. Effective exploitation of parallelism is essential for high performance, which is demanded by the insatiable desire for ever more accurate simulations in scientific computing and by ever larger data and more complex analytics. However, extreme-scale computing brings new challenges, among them resilience, energy efficiency, and scalability. To address these challenges, in this talk I present an application and system co-design approach that exploits algorithmic insights to form better synergies between applications and systems. First, I will present my recent explorations into exploiting the algorithmic structure of matrix operations to devise numerical linear algebra libraries that are resilient to soft/hard errors and are energy efficient. Second, I will show a case of designing a hybrid non-volatile/volatile main memory system that exposes data (re)placement mechanisms to applications. Guided by the access patterns of numerical kernels, the system achieves significant improvements in performance, energy efficiency, and resilience compared to system-managed data (re)placement approaches. The talk concludes with research questions on the computational aspects of high performance computing for scientific computing and big data analytics.
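
A well-known technique in the area of resilient matrix operations is algorithm-based fault tolerance (ABFT): checksum rows and columns are appended to the input matrices so that errors in the computed result can be detected from the algebraic structure itself. The minimal NumPy sketch below illustrates only that generic checksum idea; the function names are hypothetical and the code is not drawn from the speaker's libraries.

import numpy as np

def abft_matmul(A, B):
    # Hypothetical illustration of the generic ABFT checksum idea.
    # Append a column-checksum row to A and a row-checksum column to B;
    # the product of the encoded matrices then carries checksums that
    # allow corruption of the data block to be detected.
    A_enc = np.vstack([A, A.sum(axis=0)])
    B_enc = np.hstack([B, B.sum(axis=1, keepdims=True)])
    return A_enc @ B_enc  # the useful result is the [:-1, :-1] block

def checksums_hold(C_enc, tol=1e-8):
    # Verify that the checksum row/column still match the data block.
    C = C_enc[:-1, :-1]
    row_ok = np.allclose(C.sum(axis=0), C_enc[-1, :-1], atol=tol)
    col_ok = np.allclose(C.sum(axis=1), C_enc[:-1, -1], atol=tol)
    return row_ok and col_ok

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B = rng.standard_normal((4, 3)), rng.standard_normal((3, 5))

    C_enc = abft_matmul(A, B)
    print("clean result passes check:", checksums_hold(C_enc))

    C_enc[1, 2] += 10.0  # inject a soft error into one data entry
    print("corrupted result passes check:", checksums_hold(C_enc))

With additional bookkeeping the mismatched row and column checksums can also locate, and in some cases correct, the faulty entry; this sketch stops at detection.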

Bio:

Dr. Panruo Wu recently received his PhD in Computer Science from the University of California, Riverside. His research interests include high performance computing (HPC), big data, and parallel/distributed systems. He has collaborated with several national laboratories and universities and has published in HPC and parallel computing venues such as ACM/IEEE SC, ACM HPDC, ACM PPoPP, IEEE Transactions on Signal Processing, and IEEE Transactions on Parallel and Distributed Systems. He has interned at Los Alamos National Laboratory, Oak Ridge National Laboratory, and Amazon.

Besides playing with computers that can talk with each other, he enjoys playing basketball, hiking, and reading.