Publications

Filters: First Letter Of Last Name is A

2024

Anzt, H., A. Huebl, and X. S. Li, “Then and Now: Improving Software Portability, Productivity, and 100× Performance,” Computing in Science & Engineering, pp. 1 - 10, April 2024.

2023

Thiyagalingam, J., G. von Laszewski, J. Yin, M. Emani, J. Papay, G. Barrett, P. Luszczek, A. Tsaris, C. Kirkpatrick, F. Wang, et al., “AI Benchmarking for Science: Efforts from the MLCommons Science Working Group,” Lecture Notes in Computer Science, vol. 13387: Springer International Publishing, pp. 47 - 64, January 2023.

Hoefler, T., B. Stevens, A. F. Prein, J. Baehr, T. Schulthess, T. F. Stocker, J. Taylor, D. Klocke, P. Manninen, P. M. Forster, et al., Earth Virtualization Engines - A Technical Perspective, September 2023.

Abdelfattah, A., S. Tomov, P. Luszczek, H. Anzt, and J. Dongarra, “GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023.

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Mixed Precision Algebraic Multigrid on GPUs,” Parallel Processing and Applied Mathematics (PPAM 2022), vol. 13826, Cham, Springer International Publishing, April 2023.

Sid-Lakhdar, W., S. Cayrols, D. Bielich, A. Abdelfattah, P. Luszczek, M. Gates, S. Tomov, H. Johansen, D. Williams-Young, T. Davis, et al., “PAQR: Pivoting Avoiding QR factorization,” 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), St. Petersburg, FL, USA, IEEE, 2023.

Ribizel, T., and H. Anzt, “Parallel Symbolic Cholesky Factorization,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023.

Aggarwal, I., P. Nayak, A. Kashi, and H. Anzt, “Preconditioners for Batched Iterative Linear Solvers on GPUs,” Smoky Mountains Computational Sciences and Engineering Conference, vol. 169075: Springer Nature Switzerland, pp. 38 - 53, January 2023.

Cao, Q., S. Abdulah, H. Ltaief, M. G. Genton, D. Keyes, and G. Bosilca, “Reducing Data Motion and Energy Consumption of Geospatial Modeling Applications Using Automated Precision Conversion,” 2023 IEEE International Conference on Cluster Computing (CLUSTER), Santa Fe, NM, USA, IEEE, November 2023.

Aliaga, J. I., H. Anzt, E. S. Quintana-Ortí, and A. E. Thomas, “Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors,” Concurrency and Computation: Practice and Experience, August 2023.

Sukkari, D., M. Gates, M. Al Farhan, H. Anzt, and J. Dongarra, “Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators,” SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023.

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Three-precision algebraic multigrid on GPUs,” Future Generation Computer Systems, July 2023.

Grützmacher, T., H. Anzt, and E. S. Quintana-Ortí, “Using Ginkgo's memory accessor for improving the accuracy of memory-bound low precision BLAS,” Software: Practice and Experience, vol. 53, issue 1, pp. 81 - 98, January 2023.

2022

Abdulah, S., Q. Cao, Y. Pei, G. Bosilca, J. Dongarra, M. G. Genton, D. E. Keyes, H. Ltaief, and Y. Sun, “Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach With PaRSEC,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, issue 4, pp. 964 - 976, April 2022.

Abdelfattah, A., P. Ghysels, W. Boukaram, S. Tomov, X. Sherry Li, and J. Dongarra, “Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers,” 2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Computer Society, pp. 354-367, November 2022.

Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, “Analysis of the Communication and Computation Cost of FFT Libraries towards Exascale,” ICL Technical Report, no. ICL-UT-22-07: Innovative Computing Laboratory, July 2022.

Anzt, H., M. Casas, C. I. Malossi, E. S. Quintana-Ortí, F. Scheidegger, and S. Zhuang, “Approximate Computing for Scientific Applications,” Approximate Computing Techniques, 322: Springer International Publishing, pp. 415 - 465, January 2022.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Batch QR Factorization on GPUs: Design, Optimization, and Tuning,” Lecture Notes in Computer Science, vol. 13350, Cham, Springer International Publishing, June 2022.

Kashi, A., P. Nayak, D. Kulkarni, A. Scheinberg, P. Lin, and H. Anzt, “Batched sparse iterative solvers on GPU for the collision operator for fusion plasma simulations,” 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Lyon, France, IEEE, July 2022.

Alomairy, R., M. Gates, S. Cayrols, D. Sukkari, K. Akbudak, A. YarKhan, P. Bagwell, and J. Dongarra, “Communication Avoiding LU with Tournament Pivoting in SLATE,” SLATE Working Notes, no. 18, ICL-UT-22-01, January 2022.

Aliaga, J. I., H. Anzt, T. Grützmacher, E. S. Quintana-Ortí, and A. E. Thomas, “Compressed basis GMRES on high-performance graphics processing units,” The International Journal of High Performance Computing Applications, May 2022.

Aliaga, J. I., H. Anzt, T. Grützmacher, E. S. Quintana-Ortí, and A. E. Thomas, “Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units,” Concurrency and Computation: Practice and Experience, vol. 34, issue 14, June 2022.

Sid-Lakhdar, W. M., M. Aznaveh, P. Luszczek, and J. Dongarra, “Deep Gaussian process with multitask and transfer learning for performance optimization,” 2022 IEEE High Performance Extreme Computing Conference (HPEC), pp. 1-7, September 2022.

Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, “FFT Benchmark Performance Experiments on Systems Targeting Exascale,” ICL Technical Report, no. ICL-UT-22-02, March 2022.

Cao, Q., R. Alomairy, Y. Pei, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “A Framework to Exploit Data Sparsity in Tile Low-Rank Cholesky Factorization,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), July 2022.

Anzt, H., T. Cojean, G. Flegar, F. Göbel, T. Grützmacher, P. Nayak, T. Ribizel, Y. Mike Tsai, and E. S. Quintana-Ortí, “Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing,” ACM Transactions on Mathematical Software, vol. 48, issue 1, article 2, pp. 1 - 33, March 2022.

Cojean, T., Y-H. Mike Tsai, and H. Anzt, “Ginkgo—A math library designed for platform portability,” Parallel Computing, vol. 111, pp. 102902, February 2022.

Cayrols, S., J. Li, G. Bosilca, S. Tomov, A. Ayala, and J. Dongarra, “Lossy all-to-all exchange for accelerating parallel 3-D FFTs on hybrid architectures with GPUs,” 2022 IEEE International Conference on Cluster Computing (CLUSTER), pp. 152-160, September 2022.

Cayrols, S., J. Li, G. Bosilca, S. Tomov, A. Ayala, and J. Dongarra, “Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,” ICL Technical Report, no. ICL-UT-22-04, May 2022.

Sid-Lakhdar, W. M., S. Cayrols, D. Bielich, A. Abdelfattah, P. Luszczek, M. Gates, S. Tomov, H. Johansen, D. Williams-Young, T. A. Davis, et al., “PAQR: Pivoting Avoiding QR factorization,” ICL Technical Report, no. ICL-UT-22-06, June 2022.

Ayala, A., S. Tomov, M. Stoyanov, A. Haidar, and J. Dongarra, “Performance Analysis of Parallel FFT on Large Multi-GPU Systems,” 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lyon, France, IEEE, August 2022.

Tsai, Y. M., T. Cojean, and H. Anzt, “Porting Sparse Linear Algebra to Intel GPUs,” Euro-Par 2021: Parallel Processing Workshops, vol. 13098, Lisbon, Portugal, Springer International Publishing, pp. 57 - 68, June 2022.

Funk, Y., M. Götz, and H. Anzt, “Prediction of Optimal Solvers for Sparse Linear Systems Using Deep Learning,” 2022 SIAM Conference on Parallel Processing for Scientific Computing (PP), Philadelphia, PA, Society for Industrial and Applied Mathematics, pp. 14 - 24.

Tsai, Y-H. M., T. Cojean, and H. Anzt, “Providing performance portable numerics for Intel GPUs,” Concurrency and Computation: Practice and Experience, vol. 17, October 2022.

Cao, Q., S. Abdulah, R. Alomairy, Y. Pei, P. Nag, G. Bosilca, J. Dongarra, M. G. Genton, D. Keyes, H. Ltaief, et al., “Reshaping Geostatistical Modeling and Prediction for Extreme-Scale Environmental Applications,” 2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Press, November 2022.

Agullo, E., M. Altenbernd, H. Anzt, L. Bautista-Gomez, T. Benacchio, L. Bonaventura, H-J. Bungartz, S. Chatterjee, F. M. Ciorba, N. DeBardeleben, et al., “Resiliency in numerical algorithm design for extreme scale simulations,” The International Journal of High Performance Computing Applications, vol. 36, issue 2, pp. 251 - 285, March 2022.

2021

Ayala, A., S. Tomov, A. Haidar, M. Stoyanov, S. Cayrols, J. Li, G. Bosilca, and J. Dongarra, Accelerating FFT towards Exascale Computing: NVIDIA GPU Technology Conference (GTC2021), 2021.

Ayala, A., S. Tomov, M. Stoyanov, A. Haidar, and J. Dongarra, “Accelerating Multi-Process Communication for Parallel 3-D FFT,” 2021 Workshop on Exascale MPI (ExaMPI), St. Louis, MO, USA, IEEE, December 2021.

Kolev, T., P. Fischer, M. Min, J. Dongarra, J. Brown, V. Dobrev, T. Warburton, S. Tomov, M. S. Shephard, A. Abdelfattah, et al., “Efficient exascale discretizations: High-order finite element methods,” The International Journal of High Performance Computing Applications, pp. 10943420211020803, 2021.
