%0 Journal Article
%J Parallel Computing
%D 2017
%T Preconditioned Krylov Solvers on GPUs
%A Hartwig Anzt
%A Mark Gates
%A Jack Dongarra
%A Moritz Kreutzer
%A Gerhard Wellein
%A Martin Kohler
%K gpu
%K ILU
%K Jacobi
%K Krylov solvers
%K Preconditioning
%X In this paper, we study the effect of enhancing GPU-accelerated Krylov solvers with preconditioners. We consider the BiCGSTAB, CGS, QMR, and IDR(s) Krylov solvers. For a large set of test matrices, we assess the impact of Jacobi and incomplete factorization preconditioning on the solvers’ numerical stability and time-to-solution performance. We also analyze how the use of a preconditioner impacts the choice of the fastest solver.
%B Parallel Computing
%8 06-2017
%G eng
%U http://www.sciencedirect.com/science/article/pii/S0167819117300777
%! Parallel Computing
%R 10.1016/j.parco.2017.05.006
%0 Conference Proceedings
%B 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
%D 2016
%T Efficiency of General Krylov Methods on GPUs – An Experimental Study
%A Hartwig Anzt
%A Jack Dongarra
%A Moritz Kreutzer
%A Gerhard Wellein
%A Martin Kohler
%K algorithmic bombardment
%K BiCGSTAB
%K CGS
%K Convergence
%K Electric breakdown
%K gpu
%K graphics processing units
%K Hardware
%K IDR(s)
%K Krylov solver
%K Libraries
%K linear systems
%K QMR
%K Sparse matrices
%X This paper compares different Krylov methods based on short recurrences with respect to their efficiency when implemented on GPUs. The comparison includes BiCGSTAB, CGS, QMR, and IDR using different shadow space dimensions. These methods are known for their good convergence characteristics. For a large set of test matrices taken from the University of Florida Matrix Collection, we evaluate the methods' performance against different target metrics: convergence, number of sparse matrix-vector multiplications, and execution time. We also analyze whether the methods are "orthogonal" in terms of problem suitability. We propose best practices for choosing methods in a "black box" scenario, where no information about the optimal solver is available.
%B 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
%P 683-691
%8 05-2016
%G eng
%R 10.1109/IPDPSW.2016.45
%0 Conference Paper
%B The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES)
%D 2016
%T Efficiency of General Krylov Methods on GPUs – An Experimental Study
%A Hartwig Anzt
%A Jack Dongarra
%A Moritz Kreutzer
%A Gerhard Wellein
%A Martin Kohler
%K algorithmic bombardment
%K BiCGSTAB
%K CGS
%K gpu
%K IDR(s)
%K Krylov solver
%K QMR
%X This paper compares different Krylov methods based on short recurrences with respect to their efficiency when implemented on GPUs. The comparison includes BiCGSTAB, CGS, QMR, and IDR using different shadow space dimensions. These methods are known for their good convergence characteristics. For a large set of test matrices taken from the University of Florida Matrix Collection, we evaluate the methods’ performance against different target metrics: convergence, number of sparse matrix-vector multiplications, and execution time. We also analyze whether the methods are “orthogonal” in terms of problem suitability. We propose best practices for choosing methods in a “black box” scenario, where no information about the optimal solver is available.
%B The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES)
%I IEEE
%C Chicago, IL
%8 05-2016
%G eng
%R 10.1109/IPDPSW.2016.45