Publications
Export 69 results:
Filters: Author is Mark Gates [Clear All Filters]
Computational Benefit of GPU Optimization for Atmospheric Chemistry Modeling,”
Journal of Advances in Modeling Earth Systems, vol. 10, issue 8, pp. 1952–1969, August 2018.
DOI: 10.1029/2018MS001276
(3.4 MB)
“
Comparing Hybrid CPU-GPU and Native GPU-only Acceleration for Linear Algebra,”
2015 SIAM Conference on Applied Linear Algebra, Atlanta, GA, SIAM, October 2015.
(4.7 MB)
“
Clover: Computational Libraries Optimized via Exascale Research
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(872 KB)

clMAGMA: High Performance Dense Linear Algebra with OpenCL ,”
International Workshop on OpenCL, Bristol University, England, May 2014.
(460.91 KB)
“
clMAGMA: High Performance Dense Linear Algebra with OpenCL,”
University of Tennessee Technical Report (Lawn 275), no. UT-CS-13-706: University of Tennessee, March 2013.
(526.6 KB)
“
C++ API for BLAS and LAPACK,”
SLATE Working Notes, no. 2, ICL-UT-17-03: Innovative Computing Laboratory, University of Tennessee, June 2017.
(1.12 MB)
“
C++ API for Batch BLAS,”
SLATE Working Notes, no. 4, ICL-UT-17-12: University of Tennessee, December 2017.
(1.89 MB)
“
Bringing High Performance Computing to Big Data Algorithms,”
Handbook of Big Data Technologies: Springer, 2017.
DOI: 10.1007/978-3-319-49340-4
(1.22 MB)
“
Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,”
ICCS 2012, Omaha, NE, June 2012.
(608.95 KB)
“
Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems
, no. UT-CS-11-689, December 2011.
(608.95 KB)

Batched BLAS (Basic Linear Algebra Subprograms) 2018 Specification
, July 2018.
(483.05 KB)

Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators,”
Proceedings of the IEEE, vol. 106, issue 11, pp. 2040–2055, November 2018.
DOI: 10.1109/JPROC.2018.2868961
(2.53 MB)
“
Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices,”
Parallel and Distributed Processing Symposium Workshops (IPDPSW), Orlando, FL, IEEE, June 2017.
DOI: 10.1109/IPDPSW.2017.18
“On Algorithmic Variants of Parallel Gaussian Elimination: Comparison of Implementations in Terms of Performance and Numerical Properties,”
University of Tennessee Computer Science Technical Report, no. UT-CS-13-715, July 2013, 2012.
(358.98 KB)
“
Accelerating the SVD Two Stage Bidiagonal Reduction and Divide and Conquer Using GPUs,”
Parallel Computing, vol. 74, pp. 3–18, May 2018.
DOI: 10.1016/j.parco.2017.10.004
(1.34 MB)
“
Accelerating Numerical Dense Linear Algebra Calculations with GPUs,”
Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014.
DOI: 10.1007/978-3-319-06548-9_1
(1.06 MB)
“
Accelerating Linear Algebra with MAGMA
, Knoxville, TN, ECP Annual Meeting 2018, Tutorial, February 2018.
(35.27 MB)

Accelerating Eigenvector Computation in the Nonsymmetric Eigenvalue Problem,”
VECPAR 2014, Eugene, OR, June 2014.
(199.44 KB)
“
Accelerating Collaborative Filtering for Implicit Feedback Datasets using GPUs,”
2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, IEEE, November 2015.
(1.02 MB)
“