Publications

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched, Atlanta, GA, SIAM Conference on Computational Science and Engineering (SIAM CSE17), Presentation, March 2017.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” High Performance Computing: 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings, no. 9697: Springer International Publishing, pp. 21–38, 2016.

Abdelfattah, A., M. Al Farhan, C. Brown, M. Gates, D. Sukkari, A. YarKhan, and J. Dongarra, “SLATE Port to AMD and Intel Platforms,” SLATE Working Notes, no. 16, ICL-UT-21-01, April 2021.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018.

Abdelfattah, A., H. Anzt, E. Boman, E. Carson, T. Cojean, J. Dongarra, M. Gates, T. Gruetzmacher, N. J. Higham, S. Li, et al., “A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic,” SLATE Working Notes, no. 15, ICL-UT-20-08: University of Tennessee, July 2020.

Abdelfattah, A., H. Ltaief, D. Keyes, and J. Dongarra, “Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs,” Concurrency and Computation: Practice and Experience, vol. 28, issue 12, pp. 3447–3465, May 2016.

Abdelfattah, A., K. Arturov, C. Cecka, J. Dongarra, C. Freitag, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., “C++ API for Batch BLAS,” SLATE Working Notes, no. 04, ICL-UT-17-12: University of Tennessee, December 2017.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures,” The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, Chicago, IL, IEEE, May 2016.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Springer, Cham, June 2020.

Abdelfattah, A., M. Gates, J. Kurzak, P. Luszczek, and J. Dongarra, “Implementation of the C++ API for Batch BLAS,” SLATE Working Notes, no. 07, ICL-UT-18-04: Innovative Computing Laboratory, University of Tennessee, June 2018.

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, “Small Tensor Operations on Advanced Architectures for High-Order Applications,” University of Tennessee Computer Science Technical Report, no. UT-EECS-17-749: Innovative Computing Laboratory, University of Tennessee, April 2017.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Cholesky Factorization on Batches of Matrices with Fixed and Variable Sizes, San Jose, CA, GPU Technology Conference (GTC16), Poster, April 2016.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures,” Procedia Computer Science, vol. 108, pp. 606–615, June 2017.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-739: University of Tennessee, February 2016.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” The International Supercomputing Conference (ISC High Performance 2016), Frankfurt, Germany, June 2016.

Abdelfattah, A., J. Dongarra, A. Haidar, S. Tomov, and I. Yamazaki, MATEDOR: MAtrix, TEnsor, and Deep-learning Optimized Routines, Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), Research Poster, November 2018.

Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al., “A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines,” ACM Transactions on Mathematical Software (TOMS), vol. 47, no. 3, pp. 1–23, 2021.

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “High-Performance Tensor Contractions for GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-738: University of Tennessee, January 2016.

Abdelfattah, A., J. Dongarra, D. Keyes, and H. Ltaief, “Optimizing Memory-Bound Numerical Kernels on GPU Hardware Accelerators,” VECPAR 2012, Kobe, Japan, July 2012.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Tensor Contractions using Optimized Batch GEMM Routines, San Jose, CA, GPU Technology Conference (GTC), Poster, March 2018.

Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., “PLASMA 17 Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.

Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., “PLASMA 17.1 Functionality Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.

Abalenkovs, M., A. Abdelfattah, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, I. Yamazaki, and A. YarKhan, “Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems,” Supercomputing Frontiers and Innovations, vol. 2, no. 4, October 2015.

“The Future of Supercomputing: An Interim Report,” National Research Council, Washington, D.C., The National Academies Press, January 2003.
