Publications

Danalis, A., and H. Jagode, “Performance Application Programming Interface,” Accelerated Computing with HIP: Sun, Baruah and Kaeli, December 2022.

Danalis, A., H. Jagode, D. Barry, and J. Dongarra, Understanding Native Event Semantics , Knoxville, TN, 9th JLESC Workshop, April 2019.

(2.33 MB)

Davis, J., T. Gao, S. Chandrasekaran, H. Jagode, A. Danalis, P. Balaji, J. Dongarra, and M. Taufer, “Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI,” 2019 International Conference on Parallel Computing (ParCo2019), Prague, Czech Republic, September 2019.

Demmel, J., J. Dongarra, A. Fox, S. Williams, V. Volkov, and K. Yelick, “Accelerating Time-To-Solution for Computational Science and Engineering,” SciDAC Review, 00 2009.

(739.11 KB)

Demmel, J., J. Dongarra, B.. Parlett, W. Kahan, M. Gu, D. Bindel, Y. Hida, X. Li, O. Marques, J. E. Riedy, et al., “Prospectus for the Next LAPACK and ScaLAPACK Libraries,” PARA 2006, Umea, Sweden, June 2006.

(460.11 KB)

Demmel, J., and J. Dongarra, LAPACK 2005 Prospectus: Reliable and Scalable Software for Linear Algebra Computations on High End Computers : LAPACK Working Note 164, January 2005.

(172.59 KB)

Demmel, J., J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. Vuduc, C. Whaley, and K. Yelick, “Self Adapting Linear Algebra Algorithms and Software,” IEEE Proceedings (to appear), 00 2004.

(587.67 KB)

Demmel, J., J. Dongarra, J. Langou, J. Langou, P. Luszczek, and M. Mahoney, “Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),” LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.

(1.41 MB)

Dempsey, B., and D. Weiss, “Towards An Efficient, Scalable Replication Mechanism for the I2-DSI Project,” University of North Carolina School of Library and Information Science Technical Report, no. TR-1999-01, January 1999.

Deshmukh, S., R. Yokota, and G. Bosilca, “Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors,” ACM Transactions on Mathematical Software, vol. 49, issue 3, pp. 1 - 29, September 2023.

Deshmukh, S., R. Yokota, G. Bosilca, and Q. Ma, “O(N) distributed direct factorization of structured dense matrices using runtime systems,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, August 2023.

Dewolfs, D., J. Broeckhove, V. Sunderam, and G. Fagg, “FT-MPI, Fault-Tolerant Metacomputing and Generic Name Services: A Case Study,” Lecture Notes in Computer Science, vol. 4192, no. ICL-UT-06-14: Springer Berlin / Heidelberg, pp. 133-140, 00 2006.

(362.44 KB)

Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, May 2014.

(490.08 KB)

Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, “On Algorithmic Variants of Parallel Gaussian Elimination: Comparison of Implementations in Terms of Performance and Numerical Properties,” University of Tennessee Computer Science Technical Report, no. UT-CS-13-715, July 2013, 2012.

(358.98 KB)

Donfack, S., S. Tomov, and J. Dongarra, “Performance evaluation of LU factorization through hardware counter measurements,” University of Tennessee Computer Science Technical Report, no. ut-cs-12-700, October 2012.

(794.82 KB)

Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, “A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination,” Concurrency and Computation: Practice and Experience, vol. 27, issue 5, pp. 1292-1309, April 2015.

(783.45 KB)

Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” University of Tennessee Computer Science Technical Report, no. ut-cs-13-713, July 2013.

(659.77 KB)

Dong, T., A. Haidar, P. Luszczek, S. Tomov, A. Abdelfattah, and J. Dongarra, “MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,” Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.

(929.79 KB)

Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

(1.01 MB)

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices,” International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.

(364.95 KB)

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “A Fast Batched Cholesky Factorization on a GPU,” International Conference on Parallel Processing (ICPP-2014), Minneapolis, MN, September 2014.

(1.37 MB)

Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters,” University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, July 2013.

(866.68 KB)

Dong, T., T. Kolev, R. Rieben, V. Dobrev, S. Tomov, and J. Dongarra, “Acceleration of the BLAST Hydro Code on GPU,” Supercomputing '12 (poster), Salt Lake City, Utah, SC12, November 2012.

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018.

(2.18 MB)

Dong, T., A. Haidar, P. Luszczek, J. Harris, S. Tomov, and J. Dongarra, “LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU,” 16th IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.

(684.73 KB)

Dongarra, J., M. A. Heroux, and P. Luszczek, “HPCG Benchmark: a New Metric for Ranking High Performance Computing Systems,” University of Tennessee Computer Science Technical Report , no. ut-eecs-15-736: University of Tennessee, January 2015.

Dongarra, J., S. Moore, P. Mucci, K. Seymour, and H. You, “Accurate Cache and TLB Characterization Using Hardware Counters,” International Conference on Computational Science (ICCS 2004), Krakow, Poland, Springer, June 2004.

(167.1 KB)

Dongarra, J., H. Meuer, and E. Strohmaier, “Top500 Supercomputer Sites (15th edition),” University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-442, June 2000.

(278.88 KB)

Dongarra, J., and P. Luszczek, “How Elegant Code Evolves With Hardware: The Case Of Gaussian Elimination,” in Beautiful Code Leading Programmers Explain How They Think: O'Reilly Media, Inc., June 2007.

(257 KB)

Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., “PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.

(7.5 MB)

Dongarra, J., A. Haidar, O. Hernandez, S. Tomov, and M G. Venkata, “POMPEI: Programming with OpenMP4 for Exascale Investigations,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.

(1.1 MB)

Dongarra, J., “Performance of Various Computers Using Standard Linear Equations Software,” University of Tennessee Computer Science Technical Report, no. cs-89-85, February 2013.

(539.24 KB)

Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, “Translational Process: Mathematical Software Perspective,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-11, August 2020.

(752.59 KB)

Dongarra, J., and V. Eijkhout, “Numerical Linear Algebra,” Encyclopedia of Computer Science and Technology, eds. Kent, A., Williams, J., vol. 41, pp. 207-233, August 1999.

(262 KB)

Dongarra, J., M. Faverge, T. Herault, J. Langou, and Y. Robert, “Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,” IPDPS 2012, the 26th IEEE International Parallel and Distributed Processing Symposium, Shanghai, China, IEEE Computer Society Press, May 2012.

(405.71 KB)

Dongarra, J., and A. Lastovetsky, “An Overview of Heterogeneous High Performance and Grid Computing,” Engineering the Grid (to appear): Nova Science Publishers, Inc., 00 2004.

(199.93 KB)

Dongarra, J., and P. Beckman, “International Exascale Software Project Roadmap v1.0,” University of Tennessee Computer Science Technical Report, UT-CS-10-654, May 2010.

(719.74 KB)

Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Accelerating Numerical Dense Linear Algebra Calculations with GPUs,” Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014.

(1.06 MB)

Dongarra, J., K. London, S. Moore, P. Mucci, and D. Terpstra, “Using PAPI for Hardware Performance Monitoring on Linux Systems,” Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001.

(422.35 KB)

Dongarra, J., “A Not So Simple Matter of Software,” NCSA Access Online: NCSA, 00 2005.

(457.69 KB)

Dongarra, J., H. Meuer, H. D. Simon, and E. Strohmaier, “High Performance Computing Trends,” HERMIS, vol. 2, pp. 155-163, November 2001.

Dongarra, J., and A. Geist, “Report on the Oak Ridge National Laboratory's Frontier System,” ICL Technical Report, no. ICL-UT-22-05, May 2022.

(16.87 MB)

Dongarra, J., T. Herault, and Y. Robert, “Revisiting the Double Checkpointing Algorithm,” 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium, Boston, MA, May 2013.

(591.1 KB)

Dongarra, J., “High Performance Computing Trends, Supercomputers, Clusters, and Grids,” Information Processing Society of Japan Symposium Series, vol. 2003, no. 14, pp. 55-58, January 2003.

Dongarra, J., S. Gottlieb, and W. T. Kramer, “Race to Exascale,” Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.

(106.97 KB)

Dongarra, J., R. Graybill, W. Harrod, R. Lucas, E. Lusk, P. Luszczek, J. McMahon, A. Snavely, J. Vetter, K. Yelick, et al., “DARPA's HPCS Program: History, Models, Tools, Languages,” in Advances in Computers, vol. 72: Elsevier, January 2008.

(3.61 MB)

Dongarra, J., T. Herault, and Y. Robert, “Fault Tolerance Techniques for High-performance Computing,” University of Tennessee Computer Science Technical Report (also LAWN 289), no. UT-EECS-15-734: University of Tennessee, May 2015.

Dongarra, J., P. Kacsuk, and N.. Podhorszki, “Recent Advances in Parallel Virtual Machine and Message Passing Interface,” Lecture Notes in Computer Science: Proceedings of 7th European PVM/MPI Users' Group Meeting 2000, (Hungary: Springer Verlag), pp. V1908, January 2000.

Dongarra, J., and V. Eijkhout, “Finite-choice Algorithm Optimization in Conjugate Gradients (LAPACK Working Note 159),” University of Tennessee Computer Science Technical Report, UT-CS-03-502, January 2003.

(64.52 KB)

Dongarra, J., S. Hammarling, N. J. Higham, S. Relton, and M. Zounon, “Optimized Batched Linear Algebra for Modern Architectures,” Euro-Par 2017, Santiago de Compostela, Spain, Springer, August 2017.

(618.33 KB)

Main menu

Pages