Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs

Title: Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs
Publication Type: Journal Article
Year of Publication: 2018
Authors: Anzt, H., M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra
Journal: The International Journal of High Performance Computing Applications
Volume: 32
Number: 2
Pagination: 220–230
Date Published: 03-2018
Keywords: co-design, gpu, Induced dimension reduction (IDR), kernel fusion, kernel overlap, roofline performance model
Abstract

In this paper, we present an optimized GPU implementation for the induced dimension reduction algorithm. We improve data locality, combine it with an efficient sparse matrix vector kernel, and investigate the potential of overlapping computation with communication as well as the possibility of concurrent kernel execution. A comprehensive performance evaluation is conducted using a suitable performance model. The analysis reveals efficiency of up to 90%, which indicates that the implementation achieves performance close to the theoretically attainable bound.
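
The efficiency figure in the abstract is stated relative to a bound from a performance model, and the keywords name the roofline model. As a rough illustration only, the sketch below shows how a roofline bound and the resulting efficiency are commonly computed. The function names and every number in it are hypothetical placeholders, not values or code from the paper, and the paper's actual model may be more detailed.

```python
def roofline_bound(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Attainable performance in GFLOP/s under the basic roofline model:
    the lower of the compute peak and the memory-bandwidth ceiling."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


def efficiency(measured_gflops, bound_gflops):
    """Fraction of the roofline bound that a measured run achieves."""
    return measured_gflops / bound_gflops


# Hypothetical device and kernel parameters (assumptions, not from the paper).
peak = 1000.0      # compute peak, GFLOP/s
bandwidth = 200.0  # memory bandwidth, GB/s
intensity = 0.25   # arithmetic intensity, FLOP per byte moved

bound = roofline_bound(peak, bandwidth, intensity)
measured = 45.0    # hypothetical measured performance, GFLOP/s
eff = efficiency(measured, bound)
print(f"bound = {bound} GFLOP/s, efficiency = {eff:.0%}")
```

With a low arithmetic intensity, as is typical for sparse iterative solvers, the bandwidth ceiling rather than the compute peak sets the bound, so efficiency is measured against what the memory system can deliver.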

DOI: 10.1177/1094342016646844