Publications

Reed, D., D. Gannon, and J. Dongarra, “HPC Forecast: Cloudy and Uncertain,” Communications of the ACM, vol. 66, issue 2, pp. 82-90, January 2023.

Haidar, A., J. Dongarra, K. Kabir, M. Gates, P. Luszczek, S. Tomov, and Y. Jia, “HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,” Scientific Programming, vol. 23, issue 1, January 2015.

Ltaief, H., S. Tomov, R. Nath, and J. Dongarra, “Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators,” IEEE Transactions on Parallel and Distributed Systems (submitted), March 2010.

Agullo, E., C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst, S. Thibault, and S. Tomov, “A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs,” in GPU Computing Gems, Jade Edition, vol. 2: Elsevier, pp. 473-484, 2011.

Ma, T., A. Bouteiller, G. Bosilca, and J. Dongarra, “Impact of Kernel-Assisted MPI Communication over Scientific Applications: CPMD and FFTW,” 18th EuroMPI, Santorini, Greece, Springer, pp. 247-254, September 2011.

Dongarra, J., D. Gannon, G. Fox, and K. Kennedy, “The Impact of Multicore on Computational Science Software,” CTWatch Quarterly, vol. 3, issue 1, February 2007.

Buttari, A., J. Dongarra, J. Kurzak, J. Langou, P. Luszczek, and S. Tomov, “The Impact of Multicore on Math Software,” PARA 2006, Umea, Sweden, June 2006.

Kurzak, J., H. Anzt, M. Gates, and J. Dongarra, “Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs,” IEEE Transactions on Parallel and Distributed Systems, no. 1045-9219, November 2015.

Keller, R., G. Bosilca, G. Fagg, M. Resch, and J. Dongarra, “Implementation and Usage of the PERUSE-Interface in Open MPI,” Euro PVM/MPI 2006, Bonn, Germany, September 2006.

Kurzak, J., and J. Dongarra, “Implementation of Mixed Precision in Solving Systems of Linear Equations on the Cell Processor,” Concurrency and Computation: Practice and Experience, vol. 19, no. 10, pp. 1371-1385, July 2007.

Kurzak, J., and J. Dongarra, “Implementation of the Mixed-Precision High Performance LINPACK Benchmark on the CELL Processor,” University of Tennessee Computer Science Tech Report, no. UT-CS-06-580, LAPACK Working Note #177, September 2006.

Kurzak, J., R. Nath, P. Du, and J. Dongarra, “An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs,” Applied Parallel and Scientific Computing, vol. 7133, pp. 248-257, 2012.

Yamazaki, I., D. Becker, J. Dongarra, A. Druinsky, I. Peled, S. Toledo, G. Ballard, J. Demmel, and O. Schwartz, “Implementing a Blocked Aasen’s Algorithm with a Dynamic Scheduler on Multicore Architectures,” IPDPS 2013 (submitted), Boston, MA, 2013.

Kurzak, J., and J. Dongarra, “Implementing Linear Algebra Routines on Multi-Core Processors with Pipelining and a Look Ahead,” University of Tennessee Computer Science Tech Report, UT-CS-06-581, LAPACK Working Note #178, January 2006.

Nath, R., S. Tomov, and J. Dongarra, “An Improved MAGMA GEMM for Fermi GPUs,” International Journal of High Performance Computing Applications, vol. 24, no. 4, pp. 511-515, 2010.

Jeannot, E., K. Seymour, A. YarKhan, and J. Dongarra, “Improved Runtime and Transfer Time Prediction Mechanisms in a Network Enabled Server,” Parallel Processing Letters, vol. 17, no. 1, pp. 47-59, March 2006.

Jeannot, E., K. Seymour, A. YarKhan, and J. Dongarra, “Improved Runtime and Transfer Time Prediction Mechanisms in a Network Enabled Servers Middleware,” Parallel Processing Letters, vol. 17, no. 1, pp. 47-59, March 2007.

Anzt, H., T. Huckle, J. Bräckle, and J. Dongarra, “Incomplete Sparse Approximate Inverses for Parallel Preconditioning,” Parallel Computing, vol. 71, pp. 1–22, January 2018.

Arnold, D., H. Casanova, and J. Dongarra, “Innovations of the NetSolve Grid Computing System,” Concurrency: Practice and Experience, vol. 14, no. 13-15, pp. 1457-1479, January 2002.

Hardt, M., K. Seymour, J. Dongarra, M. Zapf, and N. Ruiter, “Interactive Grid-Access Using Gridsolve and Giggle,” Computing and Informatics, vol. 27, no. 2, pp. 233-248, ISSN 1335-9150, 2008.

Dongarra, J., P. Beckman, P. Aerts, F. Cappello, T. Lippert, S. Matsuoka, P. Messina, T. Moore, R. Stevens, A. Trefethen, et al., “The International Exascale Software Project: A Call to Cooperative Action by the Global High Performance Community,” International Journal of High Performance Computing Applications (to appear), July 2009.

Dongarra, J., P. Beckman, T. Moore, P. Aerts, G. Aloisio, J-C. Andre, D. Barkai, J-Y. Berthou, T. Boku, B. Braunschweig, et al., “The International Exascale Software Project Roadmap,” International Journal of High Performance Computing Applications, vol. 25, no. 1, pp. 3-60, January 2011.

Luszczek, P., J. Dongarra, D. Koester, R. Rabenseifner, B. Lucas, J. Kepner, J. McCalpin, D. Bailey, and D. Takahashi, Introduction to the HPC Challenge Benchmark Suite, March 2005.

Haidar, A., H. Jagode, P. Vaccaro, A. YarKhan, S. Tomov, and J. Dongarra, “Investigating Power Capping toward Energy-Efficient Scientific Applications,” Concurrency and Computation: Practice and Experience, vol. 2018, issue e4485, pp. 1-14, April 2018.

Jagode, H., S. Moore, D. Terpstra, J. Dongarra, A. Knuepfer, M. Jurenz, M. S. Mueller, and W. E. Nagel, “I/O Performance Analysis for the Petascale Simulation Code FLASH,” ISC'09, Hamburg, Germany, June 2009.

Dongarra, J., V. Eijkhout, and H. van der Vorst, “An Iterative Solver Benchmark,” Scientific Programming (to appear), 2002.

Dongarra, J., V. Eijkhout, and H. van der Vorst, “Iterative Solver Benchmark (LAPACK Working Note 152),” Scientific Programming, vol. 9, no. 4, pp. 223-231, 2001.

Doolin, D., J. Dongarra, and K. Seymour, “JLAPACK - Compiling LAPACK Fortran to Java,” Scientific Programming, vol. 7, no. 2, pp. 111-138, October 2002.

Vetter, J., R. Glassbrook, J. Dongarra, K. Schwan, B. Loftis, S. McNally, J. Meredith, J. Rogers, P. Roth, K. Spafford, et al., “Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community,” IEEE Computing in Science & Engineering, vol. 13, issue 5, pp. 90-95, August 2011.

Ma, T., G. Bosilca, A. Bouteiller, and J. Dongarra, “Kernel-assisted and topology-aware MPI collective communications on multi-core/many-core platforms,” Journal of Parallel and Distributed Computing, vol. 73, issue 7, pp. 1000-1010, July 2013.

Demmel, J., and J. Dongarra, LAPACK 2005 Prospectus: Reliable and Scalable Software for Linear Algebra Computations on High End Computers: LAPACK Working Note 164, January 2005.

Anderson, E., Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, et al., “LAPACK Users' Guide, 3rd ed.,” Philadelphia: Society for Industrial and Applied Mathematics, January 1999.

Gustavson, F. G., J. Wasniewski, J. Dongarra, J. Herrero, and J. Langou, “Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms,” ACM Transactions on Mathematical Software (TOMS), vol. 39, issue 2, February 2013.

Gustavson, F. G., J. Wasniewski, and J. Dongarra, “Level-3 Cholesky Kernel Subroutine of a Fully Portable High Performance Minimal Storage Hybrid Format Cholesky Algorithm,” ACM TOMS (submitted), also LAPACK Working Note (LAWN) 211, 2010.

Abdelfattah, A., H. Anzt, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and A. YarKhan, “Linear Algebra Software for Large-Scale Accelerated Multicore Computing,” Acta Numerica, vol. 25, pp. 1-160, May 2016.

Dongarra, J., P. Luszczek, and A. Petitet, “The LINPACK Benchmark: Past, Present, and Future,” Concurrency: Practice and Experience, vol. 15, pp. 803-820, 2008.

Dongarra, J., “LINPACK on Future Manycore and GPU Based Systems,” PARA 2010, Reykjavik, Iceland, June 2010.

Anzt, H., T. Cojean, C. Yen-Chen, J. Dongarra, G. Flegar, P. Nayak, S. Tomov, Y. M. Tsai, and W. Wang, “Load-Balancing Sparse Matrix Vector Product Kernels on GPUs,” ACM Transactions on Parallel Computing, vol. 7, issue 1, March 2020.

Beck, M., D. Arnold, A. Bassi, F. Berman, H. Casanova, J. Dongarra, T. Moore, G. Obertelli, J. Plank, M. Swany, et al., “Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication,” submitted to SC2001, Denver, Colorado, November 2001.

Beck, M., H. Casanova, J. Dongarra, T. Moore, J. Plank, F. Berman, and R. Wolski, “Logistical Quality of Service in NetSolve,” Computer Communications, vol. 22, no. 11, pp. 1034-1044, January 1999.

Bell, G., D. Bailey, A. H. Karp, J. Dongarra, and K. Walsh, “A Look Back on 30 Years of the Gordon Bell Prize,” International Journal of High Performance Computing Applications, vol. 31, issue 6, pp. 469–484, 2017.

Luszczek, P., J. Kurzak, and J. Dongarra, “Looking Back at Dense Linear Algebra Software,” Perspectives on Parallel and Distributed Processing: Looking Back and What's Ahead (to appear), 2012.

Luszczek, P., J. Kurzak, and J. Dongarra, “Looking Back at Dense Linear Algebra Software,” Journal of Parallel and Distributed Computing, vol. 74, issue 7, pp. 2548–2560, July 2014.

Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, J. Langou, H. Ltaief, and S. Tomov, “LU Factorization for Accelerator-Based Systems,” IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December 2011.

Kurzak, J., P. Luszczek, and J. Dongarra, “LU Factorization with Partial Pivoting for a Multicore System with Accelerators,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, issue 8, pp. 1613-1621, August 2013.

Farhan, M. Al, A. Abdelfattah, S. Tomov, M. Gates, D. Sukkari, A. Haidar, R. Rosenberg, and J. Dongarra, “MAGMA Templates for Scalable Linear Algebra on Emerging Architectures,” The International Journal of High Performance Computing Applications, vol. 34, issue 6, pp. 645-658, November 2020.

Strohmaier, E., J. Dongarra, H. Meuer, and H. D. Simon, “The Marketplace for High-Performance Computers,” Parallel Computing, vol. 25, no. 13-14, pp. 1517-1545, October 2002.

Agullo, E., G. Bosilca, C. Castagnède, J. Dongarra, H. Ltaief, and S. Tomov, “Matrices Over Runtime Systems at Exascale,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions,” Journal of Parallel and Distributed Computing, vol. 145, pp. 188-201, November 2020.
