Publications

Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs,” Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, New York, NY, USA, ACM, pp. 1–10, February 2017.

Anzt, H., D. Lukarski, S. Tomov, and J. Dongarra, “Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures,” VECPAR 2014, Eugene, OR, June 2014.

Anzt, H., I. Yamazaki, M. Hoemmen, E. Boman, and J. Dongarra, “Solver Interface & Performance on Cori,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-05: University of Tennessee, June 2018.

Anzt, H., J. Dongarra, and E. S. Quintana-Orti, “Tuning Stationary Iterative Solvers for Fault Resilience,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA15), Austin, TX, ACM, November 2015.

Anzt, H., T. Ribizel, G. Flegar, E. Chow, and J. Dongarra, “ParILUT – A Parallel Threshold ILU for GPUs,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.

Anzt, H., E. Chow, T. Huckle, and J. Dongarra, “Batched Generation of Incomplete Sparse Approximate Inverses on GPUs,” Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 49–56, November 2016.

Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, “Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,” ICCS 2012, Omaha, NE, June 2012.

Anzt, H., T. Huckle, J. Bräckle, and J. Dongarra, “Incomplete Sparse Approximate Inverses for Parallel Preconditioning,” Parallel Computing, vol. 71, pp. 1–22, January 2018.

Anzt, H., S. Tomov, and J. Dongarra, “Energy Efficiency and Performance Frontiers for Sparse Computations on GPU Supercomputers,” Sixth International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM '15), San Francisco, CA, ACM, February 2015.

Anzt, H., G. Flegar, T. Gruetzmacher, and E. S. Quintana-Orti, “Toward a Modular Precision Ecosystem for High-Performance Computing,” The International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1069-1078, November 2019.

Anzt, H., T. Cojean, C. Yen-Chen, J. Dongarra, G. Flegar, P. Nayak, S. Tomov, Y. M. Tsai, and W. Wang, “Load-Balancing Sparse Matrix Vector Product Kernels on GPUs,” ACM Transactions on Parallel Computing, vol. 7, issue 1, March 2020.

Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, IL, IEEE, May 2016.

Anzt, H., E. Chow, D. Szyld, and J. Dongarra, “Domain Overlap for Iterative Sparse Triangular Solves on GPUs,” Software for Exascale Computing - SPPEXA, vol. 113: Springer International Publishing, pp. 527–545, September 2016.

Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” EuroPar 2012 (also LAWN 260), Rhodes Island, Greece, August 2012.

Anzt, H., E. Chow, and J. Dongarra, “ParILUT - A New Parallel Threshold ILU,” SIAM Journal on Scientific Computing, vol. 40, issue 4: SIAM, pp. C503–C519, July 2018.

Anzt, H., E. Chow, D. Szyld, and J. Dongarra, “Random-Order Alternating Schwarz for Sparse Triangular Solves,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.

Anzt, H., and G. Flegar, “Are we Doing the Right Thing? – A Critical Analysis of the Academic HPC Community,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019.

Anzt, H., S. Tomov, and J. Dongarra, “On the performance and energy efficiency of sparse linear algebra on GPUs,” International Journal of High Performance Computing Applications, October 2016.

Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” University of Tennessee Computer Science Technical Report UT-CS-11-690 (also LAWN 260), December 2011.

Anzt, H., E. Boman, J. Dongarra, G. Flegar, M. Gates, M. Heroux, M. Hoemmen, J. Kurzak, P. Luszczek, S. Rajamanickam, et al., “MAGMA-sparse Interface Design Whitepaper,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.

Anzt, H., S. Tomov, and J. Dongarra, “Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-727: University of Tennessee, April 2014.

Anzt, H., J. Dongarra, G. Flegar, and T. Gruetzmacher, “Variable-Size Batched Condition Number Calculation on GPUs,” SBAC-PAD, Lyon, France, September 2018.

Anzt, H., T. Cojean, Y-C. Chen, F. Goebel, T. Gruetzmacher, P. Nayak, T. Ribizel, and Y-H. Tsai, “Ginkgo: A High Performance Numerical Linear Algebra Library,” Journal of Open Source Software, vol. 5, issue 52, August 2020.

Anzt, H., S. Tomov, and J. Dongarra, “Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product,” Spring Simulation Multi-Conference 2015 (SpringSim'15), Alexandria, VA, SCS, April 2015.

Anzt, H., E. Chow, J. Saak, and J. Dongarra, “Updating Incomplete Factorization Preconditioners for Model Order Reduction,” Numerical Algorithms, vol. 73, issue 3, pp. 611–630, February 2016.

Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “A Block-Asynchronous Relaxation Method for Graphics Processing Units,” Journal of Parallel and Distributed Computing, vol. 73, issue 12, pp. 1613–1626, December 2013.

Anzt, H., J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Bringing High Performance Computing to Big Data Algorithms,” Handbook of Big Data Technologies: Springer, 2017.

Anzt, H., E. Chow, and J. Dongarra, “Iterative Sparse Triangular Solves for Preconditioning,” EuroPar 2015, Vienna, Austria, Springer Berlin, August 2015.

Anzt, H., J. Dongarra, M. Gates, A. Haidar, K. Kabir, P. Luszczek, S. Tomov, and I. Yamazaki, “MAGMA MIC: Optimizing Linear Algebra for Intel Xeon Phi,” Frankfurt, Germany, ISC High Performance (ISC15), Intel Booth Presentation, June 2015.

Anzt, H., N. Beams, T. Cojean, F. Göbel, T. Grützmacher, A. Kashi, P. Nayak, T. Ribizel, and Y. M. Tsai, “Ginkgo: A Sparse Linear Algebra Library for HPC,” 2021 ECP Annual Meeting, April 2021.

Anzt, H., J. Dongarra, and E. S. Quintana-Orti, “Fine-grained Bit-Flip Protection for Relaxation Methods,” Journal of Computational Science, November 2016.

Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, “Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,” no. UT-CS-11-689, December 2011.

Anzt, H., J. Dongarra, G. Flegar, E. S. Quintana-Orti, and A. E. Thomas, “Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning,” International Conference on Computational Science (ICCS 2017), vol. 108, Zurich, Switzerland, Procedia Computer Science, pp. 1783-1792, June 2017.

Anzt, H., S. Tomov, and J. Dongarra, “Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-731: University of Tennessee, October 2014.

Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Variable-Size Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors,” Parallel Computing, vol. 81, pp. 131-146, January 2019.

Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Flexible Batched Sparse Matrix-Vector Product on GPUs,” 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '17), Denver, CO, ACM Press, November 2017.

Anzt, H., E. Ponce, G. D. Peterson, and J. Dongarra, “GPU-accelerated Co-design of Induced Dimension Reduction: Algorithmic Fusion and Kernel Overlap,” 2nd International Workshop on Hardware-Software Co-Design for High Performance Computing, Austin, TX, ACM, November 2015.

Anzt, H., M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, and Y. Wang, “Accelerating the Conjugate Gradient Algorithm with GPU in CFD Simulations,” VECPAR, 2016.

Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems,” Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (Best Paper), Rhodes Island, Greece, August 2012.

Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Flexible Batched Sparse Matrix Vector Product on GPUs,” Denver, Colorado, ScalA'17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, November 2017.

Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in autotuning matrix multiplication for energy minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096–5113, December 2015.

Anzt, H., J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Orti, “Adaptive Precision in Block-Jacobi Preconditioning for Iterative Sparse Linear System Solvers,” Concurrency and Computation: Practice and Experience, vol. 31, no. 6, pp. e4460, March 2019.

Anzt, H., T. Cojean, Y-C. Chen, F. Goebel, T. Gruetzmacher, P. Nayak, T. Ribizel, Y-H. Tsai, and J. Dongarra, “Ginkgo: A Node-Level Sparse Linear Algebra Library for HPC (Poster),” Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.

Anzt, H., E. Chow, and J. Dongarra, “On block-asynchronous execution on GPUs,” LAPACK Working Note, no. 291, November 2016.

Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “A Block-Asynchronous Relaxation Method for Graphics Processing Units,” University of Tennessee Computer Science Technical Report, no. UT-CS-11-687 / LAWN 258, November 2011.

Anzt, H., M. Gates, J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Preconditioned Krylov Solvers on GPUs,” Parallel Computing, June 2017.

Anzt, H., and E. S. Quintana-Orti, “Improving the Energy Efficiency of Sparse Linear System Solvers on Multicore and Manycore Systems,” Philosophical Transactions of the Royal Society A -- Mathematical, Physical and Engineering Sciences, vol. 372, issue 2018, July 2014.

Anzt, H., M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra, “Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs,” The International Journal of High Performance Computing Applications, vol. 32, no. 2, pp. 220–230, March 2018.

Anzt, H., J. Dongarra, and E. S. Quintana-Orti, “Adaptive Precision Solvers for Sparse Linear Systems,” 3rd International Workshop on Energy Efficient Supercomputing (E2SC '15), Austin, TX, ACM, November 2015.

Anzt, H., T. Cojean, and E. Kuhn, “Towards a New Peer Review Concept for Scientific Computing ensuring Technical Quality, Software Sustainability, and Result Reproducibility,” Proceedings in Applied Mathematics and Mechanics, vol. 19, issue 1, November 2019.
