Publications
Biannual Top-500 Computer Lists Track Changing Environments for Scientific Computing,”
SIAM News, vol. 34, no. 9, October 2002.
(2.62 MB)
“A New Recursive Implementation of Sparse Cholesky Factorization,”
Proceedings of 16th IMACS World Congress 2000 on Scientific Computing, Applications Mathematics and Simulation, Lausanne, Switzerland, August 2000.
“Trends in High Performance Computing,”
The Computer Journal, vol. 47, no. 4: The British Computer Society, pp. 399-403, 00 2004.
(455.96 KB)
“Bi-objective Scheduling Algorithms for Optimizing Makespan and Reliability on Heterogeneous Systems,”
19th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA) (submitted), San Diego, CA, June 2007.
(223.82 KB)
“Dense Linear Algebra on Accelerated Multicore Hardware,”
High Performance Scientific Computing: Algorithms and Applications, London, UK, Springer-Verlag, 00 2012.
“Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster)
, Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, 20 2020.
(2.53 MB)
Message Passing Software Systems,”
Encyclopedia of Electrical and Engineering, Supplement 1: John Wiley & Sons, Inc., 00 2000.
(289.38 KB)
“Netlib and NA-Net: building a scientific computing community,”
In IEEE Annals of the History of Computing (to appear), August 2007.
(352.71 KB)
“Race to Exascale,”
Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.
(106.97 KB)
“Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,”
PPAM 2013, Warsaw, Poland, September 2013.
(284.97 KB)
“Performance of Various Computers Using Standard Linear Equations Software, (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85: University of Tennessee, June 2014.
(514.64 KB)
“The LINPACK Benchmark: Past, Present, and Future,”
Concurrency: Practice and Experience, vol. 15, pp. 803-820, 00 2008.
(94.86 KB)
“Top500 Supercomputer Sites (14th edition),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-434, November 1999.
(281.81 KB)
“Self-adapting Numerical Software for Next Generation Applications (LAPACK Working Note 157),”
ICL Technical Report, no. ICL-UT-02-07, 00 2002.
(475.94 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, no. CS-89-85, January 2000.
(354.1 KB)
“An Asynchronous Algorithm on NetSolve Global Computing System,”
Future Generation Computer Systems, vol. 22, issue 3, pp. 279-290, February 2006.
(568.92 KB)
“International Exascale Software Project Roadmap v1.0,”
University of Tennessee Computer Science Technical Report, UT-CS-10-654, May 2010.
(719.74 KB)
“POMPEI: Programming with OpenMP4 for Exascale Investigations,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.
(1.1 MB)
“Network-Enabled Solvers: A Step Toward Grid-Based Computing,”
SIAM News, vol. 34, no. 10, December 2001.
“The Problem with the Linpack Benchmark Matrix Generator,”
University of Tennessee Computer Science Technical Report, UT-CS-08-621 (also LAPACK Working Note 206), June 2008.
(136.41 KB)
“PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,”
ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.
(7.5 MB)
“High Performance Computing Trends and Self Adapting Numerial Software,”
Lecture Notes in Computer Science, High Performance Computing, 5th International Symposium ISHPC, vol. 2858, Tokyo-Odaiba, Japan, Springer-Verlag, Heidelberg, pp. 1-9, January 2003.
“The International Exascale Software Project: A Call to Cooperative Action by the Global High Performance Community,”
International Journal of High Performance Computing Applications (to appear), July 2009.
(203.04 KB)
“High Performance Development for High End Computing with Python Language Wrapper (PLW),”
International Journal for High Performance Computer Applications, vol. 21, no. 3, pp. 360-369, 00 2007.
(179.32 KB)
“HPCG Benchmark: a New Metric for Ranking High Performance Computing Systems,”
University of Tennessee Computer Science Technical Report , no. ut-eecs-15-736: University of Tennessee, January 2015.
“Disaster Survival Guide in Petascale Computing: An Algorithmic Approach,”
in Petascale Computing: Algorithms and Applications (to appear): Chapman & Hall - CRC Press, 00 2007.
(260.18 KB)
“Translational process: Mathematical software perspective,”
Journal of Computational Science, vol. 52, pp. 101216, 2021.
“High Performance Computing Today,”
FOMMS 2000: Foundations of Molecular Modeling and Simulation Conference (to appear), January 2000.
(66 KB)
“A Tribute to Gene Golub,”
Computing in Science and Engineering: IEEE, pp. 5, January 2008.
“Recursive Approach in Sparse Matrix LU Factorization,”
Scientific Programming, vol. 9, no. 1, pp. 51-60, 00 2001.
(217.16 KB)
“Introduction to the HPCChallenge Benchmark Suite,”
ICL Technical Report, no. ICL-UT-05-01, January 2005.
(124.86 KB)
“An Introduction to the MAGMA project - Acceleration of Dense Linear Algebra
: NVIDIA Webinar, June 2010.
Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems,”
Parallel Computing, vol. 39, issue 4-5, pp. 212-232, May 2013.
(1.43 MB)
“Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture,”
The 2nd International Conference on Cloud and Green Computing (submitted), Xiangtan, Hunan, China, November 2012.
(329.5 KB)
“HPCS Library Study Effort,”
University of Tennessee Computer Science Technical Report, UT-CS-08-617, January 2008.
(73.22 KB)
“Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters,”
PADTAD Workshop, IPDPS 2003, Nice, France, IEEE, April 2003.
(432.57 KB)
“Top500 Supercomputer Sites (13th edition),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-425, June 1999.
(278.51 KB)
“Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,”
Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014.
(1.96 MB)
“Translational Process: Mathematical Software Perspective,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-11, August 2020.
(752.59 KB)
“The Quest for Petascale Computing,”
Computing in Science and Engineering, vol. 3, no. 3, pp. 32-39, May 2001.
(178.3 KB)
“Self-Adapting Numerical Software and Automatic Tuning of Heuristics,”
Lecture Notes in Computer Science, vol. 2660, Melbourne, Australia, Springer Verlag, pp. 759-770, June 2003.
(45.95 KB)
“Remembering Ken Kennedy,”
SciDAC Review, vol. 5, no. 2007, 00 2007.
(519.68 KB)
“Optimized Batched Linear Algebra for Modern Architectures,”
Euro-Par 2017, Santiago de Compostela, Spain, Springer, August 2017.
(618.33 KB)
“HPC Challenge: Design, History, and Implementation Highlights,”
On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing (to appear): Chapman & Hall/CRC Press, 00 2012.
(469.92 KB)
“Iterative Solver Benchmark (LAPACK Working Note 152),”
Scientific Programming, vol. 9, no. 4, pp. 223-231, 00 2001.
(168.05 KB)
“Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,”
University of Tennessee Computer Science Technical Report (also Lawn 257), no. UT-CS-11-684, October 2011.
(405.71 KB)
“MAGMA MIC: Linear Algebra Library for Intel Xeon Phi Coprocessors
, Salt Lake City, UT, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), November 2012.
(6.4 MB)
Fault Tolerance Techniques for High-performance Computing,”
University of Tennessee Computer Science Technical Report (also LAWN 289), no. UT-EECS-15-734: University of Tennessee, May 2015.
“How Elegant Code Evolves With Hardware: The Case Of Gaussian Elimination,”
in Beautiful Code Leading Programmers Explain How They Think: O'Reilly Media, Inc., June 2007.
(257 KB)
“Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices,”
International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.
(364.95 KB)
“