Publications
“Integrating Deep Learning in Domain Sciences at Exascale,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-10: University of Tennessee, August 2020.
“Integrating Deep Learning in Domain Sciences at Exascale,”
2020 Smoky Mountains Computational Sciences and Engineering Conference (SMC 2020), August 2020.
“Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs,”
International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Springer, Cham, June 2020.
DOI: 10.1007/978-3-030-50417-5_18
“Load-Balancing Sparse Matrix Vector Product Kernels on GPUs,”
ACM Transactions on Parallel Computing, vol. 7, issue 1, March 2020.
DOI: 10.1145/3380930
“MAGMA Templates for Scalable Linear Algebra on Emerging Architectures,”
The International Journal of High Performance Computing Applications, vol. 34, issue 6, pp. 645-658, November 2020.
DOI: 10.1177/1094342020938421
“MATEDOR: MAtrix, TEnsor, and Deep-learning Optimized Routines,”
2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, Seattle, WA, February 2020.
“Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions,”
Journal of Parallel and Distributed Computing, vol. 145, pp. 188-201, November 2020.
DOI: 10.1016/j.jpdc.2020.07.001
“Mixed-Precision Iterative Refinement using Tensor Cores on GPUs to Accelerate Solution of Linear Systems,”
Proceedings of the Royal Society A, vol. 476, issue 2243, November 2020.
DOI: 10.1098/rspa.2020.0110
“Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.
“Reducing the Amount of out-of-core Data Access for GPU-Accelerated Randomized SVD,”
Concurrency and Computation: Practice and Experience, April 2020.
DOI: 10.1002/cpe.5754
“A Set of Batched Basic Linear Algebra Subprograms,”
ACM Transactions on Mathematical Software, October 2020.
“A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic,”
SLATE Working Notes, no. 15, ICL-UT-20-08: University of Tennessee, July 2020.
“Translational Process: Mathematical Software Perspective,”
Journal of Computational Science, September 2020.
DOI: 10.1016/j.jocs.2020.101216
“Translational Process: Mathematical Software Perspective,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-11, August 2020.
“A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms,”
ICL Technical Report, no. ICL-UT-21-04: University of Tennessee, August 2021.
“FFT Benchmark Performance Experiments on Systems Targeting Exascale,”
ICL Technical Report, no. ICL-UT-22-02, March 2022.
“Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,”
ICL Technical Report, no. ICL-UT-22-03, May 2022.
“