Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs

Title: Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs
Publication Type: Journal Article
Year of Publication: 2015
Authors: Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra
Journal: Concurrency and Computation: Practice and Experience
Volume: 27
Issue: 17
Pagination: 5096 - 5113
Date Published: Oct-12-2015
Keywords: Autotuning, energy efficiency, hardware accelerators, matrix multiplication, power
Abstract: In this paper, we report extensive results and analysis of autotuning the computationally intensive graphics processing units kernel for dense matrix–matrix multiplication in double precision. In contrast to traditional autotuning and/or optimization for runtime performance only, we also take the energy efficiency into account. For kernels achieving equal performance, we show significant differences in their energy balance. We also identify the memory throughput as the most influential metric that trades off performance and energy efficiency. As a result, the performance optimal case ends up not being the most efficient kernel in overall resource use.
URL: http://doi.wiley.com/10.1002/cpe.3516 https://api.wiley.com/onlinelibrary/tdm/v1/articles/10.1002%2Fcpe.3516
DOI: 10.1002/cpe.3516
Short Title: Concurrency Computat.: Pract. Exper.