Publications
Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, UT-CS-04-526, vol. –89-95, January 2006.
(6.42 MB)
“The 30th Anniversary of the Supercomputing Conference: Bringing the Future Closer—Supercomputing History and the Immortality of Now,”
Computer, vol. 51, issue 10, pp. 74–85, November 2018.
(1.73 MB)
“Netlib and NA-Net: Building a Scientific Computing Community,”
IEEE Annals of the History of Computing, vol. 30, no. 2, pp. 30-41, January 2008.
(352.71 KB)
“Revisiting Matrix Product on Master-Worker Platforms,”
International Journal of Foundations of Computer Science (IJFCS) (accepted), 00 2007.
(248.66 KB)
“Numerical Libraries and Tools for Scalable Parallel Cluster Computing,”
International Journal of High Performance Applications and Supercomputing, vol. 15, no. 2, pp. 175-180, January 2001.
(37.38 KB)
“The International Exascale Software Project Roadmap,”
International Journal of High Performance Computing, vol. 25, no. 1, pp. 3-60, January 2011.
(719.74 KB)
“Numerical Linear Algebra Algorithms and Software,”
Journal of Computational and Applied Mathematics, vol. 123, no. 1-2, pp. 489-514, October 1999.
(258.62 KB)
“Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modelling,”
University of Tennessee Computer Science Technical Report, no. UT-CS-10-661, October 2010.
(287.87 KB)
“Recent Advances in Parallel Virtual Machine and Message Passing Interface,”
Lecture Notes in Computer Science, vol. 2840: Springer-Verlag, Berlin, January 2003.
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, CS-89-85, January 2008.
(6.42 MB)
“Report on the Sunway TaihuLight System,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-742: University of Tennessee, June 2016.
“Report on the TianHe-2A System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-04: University of Tennessee, September 2017.
(7.15 MB)
“An Iterative Solver Benchmark,”
Scientific Programming (to appear), 00 2002.
(142.67 KB)
“Self Adapting Numerical Algorithm for Next Generation Applications,”
International Journal of High Performance Computing Applications, vol. 17, no. 2, pp. 125-132, January 2003.
(479.18 KB)
“Translational Process: Mathematical Software Perspective,”
Journal of Computational Science, September 2020.
(752.59 KB)
“How Elegant Code Evolves With Hardware: The Case Of Gaussian Elimination,”
in Beautiful Code Leading Programmers Explain How They Think (Chapter 14), pp. 243-282, January 2008.
(257 KB)
“HPC Challenge: Design, History, and Implementation Highlights,”
Contemporary High Performance Computing: From Petascale Toward Exascale, Boca Raton, FL, Taylor and Francis, 2013.
(790.01 KB)
“The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems,”
International Conference on Computational Science (ICCS 2017), Zürich, Switzerland, Elsevier, June 2017.
(446.14 KB)
“Trends in High Performance Computing,”
The Computer Journal, vol. 47, no. 4: The British Computer Society, pp. 399-403, 00 2004.
(455.96 KB)
“Revisiting the Double Checkpointing Algorithm,”
University of Tennessee Computer Science Technical Report (LAWN 274), no. ut-cs-13-705, January 2013.
(682.22 KB)
“PULSAR Users’ Guide, Parallel Ultra-Light Systolic Array Runtime,”
University of Tennessee EECS Technical Report, no. UT-EECS-14-733: University of Tennessee, November 2014.
(561.56 KB)
“Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster)
, Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, 20 2020.
(2.53 MB)
Exploiting Fine-Grain Parallelism in Recursive LU Factorization,”
Proceedings of PARCO'11, no. ICL-UT-11-04, Gent, Belgium, April 2011.
“Report on the Oak Ridge National Laboratory's Frontier System,”
ICL Technical Report, no. ICL-UT-22-05, May 2022.
(16.87 MB)
“High-Performance Conjugate-Gradient Benchmark: A New Metric for Ranking High-Performance Computing Systems,”
The International Journal of High Performance Computing Applications, 2015.
(336.19 KB)
“Recursive approach in sparse matrix LU factorization,”
Proceedings of 1st SGI Users Conference, Cracow, Poland (ACC Cyfronet UMM, 2000), pp. 409-418, January 2000.
(176.14 KB)
“Race to Exascale,”
Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.
(106.97 KB)
“Performance Instrumentation and Measurement for Terascale Systems,”
ICCS 2003 Terascale Workshop, Melbourne, Australia, Springer, Berlin, Heidelberg, June 2003.
(5.36 MB)
“An Asynchronous Algorithm on NetSolve Global Computing System,”
Future Generation Computer Systems, vol. 22, issue 3, pp. 279-290, February 2006.
(568.92 KB)
“Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators,”
Proceedings of the IEEE, vol. 106, issue 11, pp. 2040–2055, November 2018.
(2.53 MB)
“High Performance Computing Systems: Status and Outlook,”
Acta Numerica, vol. 21, Cambridge, UK, Cambridge University Press, pp. 379-474, May 2012.
(1.48 MB)
“Batched BLAS (Basic Linear Algebra Subprograms) 2018 Specification
, July 2018.
(483.05 KB)
Model-Driven One-Sided Factorizations on Multicore, Accelerated Systems,”
Supercomputing Frontiers and Innovations, vol. 1, issue 1, 2014.
(1.86 MB)
“Exploring New Architectures in Accelerating CFD for Air Force Applications,”
Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, January 2008.
(492.86 KB)
“Network-Enabled Solvers: A Step Toward Grid-Based Computing,”
SIAM News, vol. 34, no. 10, December 2001.
“Dense Linear Algebra on Accelerated Multicore Hardware,”
High Performance Scientific Computing: Algorithms and Applications, London, UK, Springer-Verlag, 00 2012.
“The HPL Benchmark: Past, Present & Future
, ISC High Performance, Frankfurt, Germany, July 2016.
(3.41 MB)
Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, January 2001.
(6.42 MB)
“High Performance Computing Trends and Self Adapting Numerial Software,”
Lecture Notes in Computer Science, High Performance Computing, 5th International Symposium ISHPC, vol. 2858, Tokyo-Odaiba, Japan, Springer-Verlag, Heidelberg, pp. 1-9, January 2003.
“High-Performance Computing,”
The Princeton Companion to Applied Mathematics, Princeton, New Jersey, Princeton University Press, pp. 839-842, 2015.
“Bi-objective Scheduling Algorithms for Optimizing Makespan and Reliability on Heterogeneous Systems,”
19th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA) (submitted), San Diego, CA, June 2007.
(223.82 KB)
“Biannual Top-500 Computer Lists Track Changing Environments for Scientific Computing,”
SIAM News, vol. 34, no. 9, October 2002.
(2.62 MB)
“A New Recursive Implementation of Sparse Cholesky Factorization,”
Proceedings of 16th IMACS World Congress 2000 on Scientific Computing, Applications Mathematics and Simulation, Lausanne, Switzerland, August 2000.
“International Exascale Software Project Roadmap v1.0,”
University of Tennessee Computer Science Technical Report, UT-CS-10-654, May 2010.
(719.74 KB)
“Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,”
PPAM 2013, Warsaw, Poland, September 2013.
(284.97 KB)
“Performance and Reliability Trade-offs for the Double Checkpointing Algorithm,”
International Journal of Networking and Computing, vol. 4, no. 1, pp. 32-41.
(859.04 KB)
“Translational process: Mathematical software perspective,”
Journal of Computational Science, vol. 52, pp. 101216, 2021.
“Message Passing Software Systems,”
Encyclopedia of Electrical and Engineering, Supplement 1: John Wiley & Sons, Inc., 00 2000.
(289.38 KB)
“The evolution of mathematical software,”
Communications of the ACM, vol. 65227, issue 12, pp. 66 - 72, December 2022.
“JLAPACK - Compiling LAPACK Fortran to Java,”
Scientific Programming, vol. 7, no. 2, pp. 111-138, October 2002.
(307.46 KB)
“