Export 122 results:
Filters: Author is Azzam Haidar [Clear All Filters]
Comparing Hybrid CPU-GPU and Native GPU-only Acceleration for Linear Algebra,” 2015 SIAM Conference on Applied Linear Algebra, Atlanta, GA, SIAM, October 2015.“
Cholesky Factorization on Batches of Matrices with Fixed and Variable Sizes , San Jose, CA, GPU Technology Conference (GTC16), Poster, April 2016.
Cholesky Across Accelerators,” 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Elizabeth, NJ, IEEE, August 2015.“
C++ API for Batch BLAS,” SLATE Working Notes, no. 4, ICL-UT-17-12: University of Tennessee, December 2017.“
Batched One-Sided Factorizations of Tiny Matrices Using GPUs: Challenges and Countermeasures,” Journal of Computational Science, vol. 26, pp. 226–236, May 2018. DOI: 10.1016/j.jocs.2018.01.005“
Batched Matrix Computations on Hardware Accelerators Based on GPUs,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.“
Batched matrix computations on hardware accelerators based on GPUs,” International Journal of High Performance Computing Applications, February 2015. DOI: 10.1177/1094342014567546“
Batched Matrix Computations on Hardware Accelerators,” EuroMPI/Asia 2015 Workshop, Bordeaux, France, September 2015.“
Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-666, (also Lawn 243), 00 2011.“
Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,” Submitted to Concurrency and Computations: Practice and Experience, November 2010.“
Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018. DOI: 10.1109/TPDS.2018.2842785“
Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-09: Innovative Computing Laboratory, University of Tennessee, September 2018.“
Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019. DOI: 10.1016/j.parco.2018.10.003“
Algebraic Schwarz Preconditioning for the Schur Complement: Application to the Time-Harmonic Maxwell Equations Discretized by a Discontinuous Galerkin Method.,” The Twentieth International Conference on Domain Decomposition Methods, La Jolla, California, February 2011.“
Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018. DOI: 10.1016/j.jocs.2018.01.007“
Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched , Atlanta, GA, SIAM Conference on Computer Science and Engineering (SIAM CSE17), Presentation, March 2017.
Accelerating Tensor Contractions for High-Order FEM on CPUs, GPUs, and KNLs , Gatlinburg, TN, moky Mountains Computational Sciences and Engineering Conference (SMC16), Poster, September 2016.
Accelerating Numerical Dense Linear Algebra Calculations with GPUs,” Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014. DOI: 10.1007/978-3-319-06548-9_1“
Accelerating Linear Algebra with MAGMA , Knoxville, TN, ECP Annual Meeting 2018, Tutorial, February 2018.
Accelerating Eigenvector Computation in the Nonsymmetric Eigenvalue Problem,” VECPAR 2014, Eugene, OR, June 2014.“
3-D parallel frequency-domain visco-acoustic wave modelling based on a hybrid direct/iterative solver,” 73rd EAGE Conference & Exhibition incorporating SPE EUROPEC 2011, Vienna, Austria, 23-26 May, 00 2011.“