%0 Journal Article
%J VECPAR
%D 2016
%T Accelerating the Conjugate Gradient Algorithm with GPU in CFD Simulations
%A Hartwig Anzt
%A Marc Baboulin
%A Jack Dongarra
%A Yvan Fournier
%A Frank Hulsemann
%A Amal Khabou
%A Yushan Wang
%X This paper illustrates how GPU computing can be used to accelerate computational fluid dynamics (CFD) simulations. For sparse linear systems arising from finite volume discretization, we evaluate and optimize the performance of Conjugate Gradient (CG) routines designed for manycore accelerators and compare against an industrial CPU-based implementation. We also investigate how the recent advances in preconditioning, such as iterative Incomplete Cholesky (IC, as symmetric case of ILU) preconditioning, match the requirements for solving real world problems.
%B VECPAR
%G eng
%U http://hgpu.org/?p=16264
%0 Conference Paper
%B International Conference on Computational Science (ICCS 2013)
%D 2013
%T A Parallel Solver for Incompressible Fluid Flows
%A Yushan Wang
%A Marc Baboulin
%A Joël Falcou
%A Yann Fraigneau
%A Olivier Le Maître
%K ADI
%K Navier-Stokes equations
%K Parallel computing
%K Partial diagonalization
%K Prediction-projection
%K SIMD
%X The Navier-Stokes equations describe a large class of fluid flows but are difficult to solve analytically because of their nonlin- earity. We present in this paper a parallel solver for the 3-D Navier-Stokes equations of incompressible unsteady flows with constant coefficients, discretized by the finite difference method. We apply the prediction-projection method which transforms the Navier-Stokes equations into three Helmholtz equations and one Poisson equation. For each Helmholtz system, we apply the Alternating Direction Implicit (ADI) method resulting in three tridiagonal systems. The Poisson equation is solved using partial diagonalization which transforms the Laplacian operator into a tridiagonal one. We describe an implementation based on MPI where the computations are performed on each subdomain and information is exchanged on the interfaces, and where the tridiagonal system solutions are accelerated using vectorization techniques. We present performance results on a current multicore system.
%B International Conference on Computational Science (ICCS 2013)
%I Elsevier B.V.
%C Barcelona, Spain
%8 2013-06
%G eng
%R DOI: 10.1016/j.procs.2013.05.207