Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures

Title: Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures
Publication Type: Conference Paper
Year of Publication: 2015
Authors: Kabir, K., A. Haidar, S. Tomov, and J. Dongarra
Conference Name: The Spring Simulation Multi-Conference 2015 (SpringSim'15), Best Paper Award
Date Published: 04-2015
Conference Location: Alexandria, VA
Keywords: Eigenvalues problem, Hessenberg reduction, Multi/Many-core, Stabilized Elementary Transformations
Abstract: The solution of nonsymmetric eigenvalue problems, Ax = λx, can be accelerated substantially by first reducing A to an upper Hessenberg matrix H that has the same eigenvalues as A. This can be done using Householder orthogonal transformations, which are the well-established standard, or stabilized elementary transformations. The latter approach, although requiring half the flops of the former, has been used less in practice, e.g., on computer architectures with well-developed hierarchical memories, because of its memory-bound operations and the complexity of stabilizing it. In this paper we revisit the stabilized elementary transformations approach in the context of new architectures: both multicore CPUs and Xeon Phi coprocessors. We derive for the first time a blocked version of the algorithm. The blocked version reduces the memory-bound operations, and we analyze its performance. A performance model is developed that shows the limitations of both approaches. The competitiveness of using stabilized elementary transformations has been quantified, highlighting that it can be 20 to 30% faster on current high-end multicore CPUs and Xeon Phi coprocessors.
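For readers unfamiliar with the technique, the following is a minimal sketch of the classical unblocked reduction by stabilized elementary transformations (Gaussian elimination with row/column pivoting applied as a similarity transformation, in the spirit of EISPACK's ELMHES). It is not the blocked algorithm derived in the paper; the function name `hessenberg_elementary` and the sample matrix are illustrative assumptions.

```python
def hessenberg_elementary(a):
    """Return an upper Hessenberg matrix similar to the square matrix `a`.

    Unblocked sketch: each step picks the largest entry on or below the
    subdiagonal as pivot (the "stabilization"), then eliminates the
    entries beneath it with elementary row operations, applying the
    inverse column operations so that eigenvalues are preserved.
    """
    n = len(a)
    h = [[float(x) for x in row] for row in a]  # work on a copy
    for m in range(1, n - 1):
        k = m - 1  # column being reduced
        # Pivot search among rows m..n-1 of column k.
        p = max(range(m, n), key=lambda i: abs(h[i][k]))
        if h[p][k] == 0.0:
            continue  # column is already in Hessenberg form
        if p != m:
            # Swap rows and the matching columns (a similarity transform).
            h[m], h[p] = h[p], h[m]
            for row in h:
                row[m], row[p] = row[p], row[m]
        for i in range(m + 1, n):
            y = h[i][k] / h[m][k]  # multiplier, |y| <= 1 due to pivoting
            if y == 0.0:
                continue
            # Row operation: row_i -= y * row_m
            for j in range(k, n):
                h[i][j] -= y * h[m][j]
            h[i][k] = 0.0  # exact zero below the subdiagonal
            # Inverse column operation: col_m += y * col_i
            for r in range(n):
                h[r][m] += y * h[r][i]
    return h


# Hypothetical 4x4 example matrix.
A = [[4, 1, 2, 3],
     [2, 3, 1, 0],
     [6, 1, 5, 2],
     [1, 2, 0, 7]]
H = hessenberg_elementary(A)
```

Because every step is a similarity transformation, `H` has the same eigenvalues as `A`, and all entries below its first subdiagonal are zero. The row and column updates in this sketch are vector operations, which is the memory-bound behavior that the blocked version described in the abstract aims to reduce.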