Publications

2024

Slattery, S. A., K. A. Surjuse, C. Peterson, D. A. Penchoff, and E. Valeev, “Economical Quasi-Newton Unitary Optimization of Electronic Orbitals,” Physical Chemistry Chemical Physics, December 2023, 2024.

2023

Hoefler, T., B. Stevens, A. F. Prein, J. Baehr, T. Schulthess, T. F. Stocker, J. Taylor, D. Klocke, P. Manninen, P. M. Forster, et al., Earth Virtualization Engines - A Technical Perspective, September 2023.

Li, J., G. Bosilca, A. Bouteiller, and B. Nicolae, “Elastic deep learning through resilient collective operations,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023.

2022

Cao, Q., G. Bosilca, N. Losada, W. Wu, D. Zhong, and J. Dongarra, “Evaluating Data Redistribution in PaRSEC,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 8, pp. 1856-1872, August 2022.

Penchoff, D. A., C. C. Peterson, E. M. Wrancher, G. Bosilca, R. J. Harrison, E. F. Valeev, and P. D. Benny, “Evaluations of molecular modeling and machine learning for predictive capabilities in binding of lanthanum and actinium with carboxylic acids,” Journal of Radioanalytical and Nuclear Chemistry, December 2022.

Dongarra, J., “The evolution of mathematical software,” Communications of the ACM, vol. 65, issue 12, pp. 66-72, December 2022.

Fortenberry, A., and S. Tomov, “Extending MAGMA Portability with OneAPI,” The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22), Ninth Workshop on Accelerator Programming Using Directives (WACCPD 2022), Dallas, TX, November 2022.

Fortenberry, A., S. Tomov, and K. Wong, Extending MAGMA Portability with OneAPI, Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22), ACM Student Research Competition, November 2022.

2021

Kolev, T., P. Fischer, M. Min, J. Dongarra, J. Brown, V. Dobrev, T. Warburton, S. Tomov, M. S. Shephard, A. Abdelfattah, et al., “Efficient exascale discretizations: High-order finite element methods,” The International Journal of High Performance Computing Applications, pp. 10943420211020803, 2021.

Barry, D., A. Danalis, and H. Jagode, “Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,” Tools for High Performance Computing 2018/2019: Springer, pp. 195–218, 2021.

Gao, Y., G. Pallez, Y. Robert, and F. Vivien, “Evaluating Task Dropping Strategies for Overloaded Real-Time Systems (Work-In-Progress),” 42nd Real Time Systems Symposium (RTSS): IEEE Computer Society Press, 2021.

Iqbal, Z., S. Nooshabadi, I. Yamazaki, S. Tomov, and J. Dongarra, “Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems,” IEEE Access, 2021.

2020

Barry, D., A. Danalis, and H. Jagode, “Effortless Monitoring of Arithmetic Intensity with PAPI's Counter Analysis Toolkit,” 13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020.

Han, L., Y. Gao, J. Liu, Y. Robert, and F. Vivien, “Energy-Aware Strategies for Reliability-Oriented Real-Time Task Allocation on Heterogeneous Platforms,” 49th International Conference on Parallel Processing (ICPP 2020), Edmonton, AB, Canada, ACM Press, 2020.

Nayak, P., T. Cojean, and H. Anzt, “Evaluating Asynchronous Schwarz Solvers on GPUs,” International Journal of High Performance Computing Applications, August 2020.

Anzt, H., Y. M. Tsai, A. Abdelfattah, T. Cojean, and J. Dongarra, “Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse and Batched Computations,” 2020 IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS): IEEE, November 2020.

Jagode, H., A. Danalis, and J. Dongarra, Exa-PAPI: The Exascale Performance API with Modern C++, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.

Cao, Q., Y. Pei, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications,” Platform for Advanced Scientific Computing Conference (PASC20), Geneva, Switzerland, ACM, June 2020.

2019

YarKhan, A., J. Kurzak, A. Abdelfattah, and J. Dongarra, “An Empirical View of SLATE Algorithms on Scalable Hybrid System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-08: University of Tennessee, Knoxville, September 2019.

Lopez, M. G., W. Joubert, V. Larrea, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, “Evaluation of Directive-Based Performance Portable Programming Models,” International Journal of High Performance Computing and Networking, vol. 14, issue 2, pp. 165-182, 2019.

Pei, Y., G. Bosilca, I. Yamazaki, A. Ida, and J. Dongarra, “Evaluation of Programming Models to Address Load Imbalance on Distributed Multi-Core CPUs: A Case Study with Block Low-Rank Factorization,” PAW-ATM Workshop at SC19, Denver, CO, ACM, November 2019.

2018

Bouteiller, A., S. Pophale, S. Boehm, M. B. Baker, and M. G. Venkata, “Evaluating Contexts in OpenSHMEM-X Reference Implementation,” OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, Cham, Springer International Publishing, pp. 50–62, 2018.

Tomov, S., A. Haidar, D. Schultz, and J. Dongarra, “Evaluation and Design of FFT for Distributed Accelerated Systems,” ECP WBS 2.3.3.09 Milestone Report, no. FFT-ECP ST-MS-10-1216: Innovative Computing Laboratory, University of Tennessee, October 2018.

Jagode, H., A. Danalis, R. Hoque, M. Faverge, and J. Dongarra, “Evaluation of Dataflow Programming Models for Electronic Structure Theory,” Concurrency and Computation: Practice and Experience: Special Issue on Parallel and Distributed Algorithms, vol. 2018, issue e4490, pp. 1–20, May 2018.

2017

Zhao, Y., L. Wan, W. Wu, G. Bosilca, R. Vuduc, J. Ye, W. Tang, and Z. Xu, “Efficient Communications in Training Large Scale Neural Networks,” ACM MultiMedia Workshop 2017, Mountain View, CA, ACM, October 2017.

2016

Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 683-691, May 2016.

Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, IL, IEEE, May 2016.

2015

Benoit, A., S. K. Raina, and Y. Robert, “Efficient Checkpoint/Verification Patterns,” International Journal of High Performance Computing Applications, July 2015.

Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, “Efficient Eigensolver Algorithms on Accelerator Based Architectures,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.

Solcà, R., A. Kozhevnikov, A. Haidar, S. Tomov, T. C. Schulthess, and J. Dongarra, “Efficient Implementation Of Quantum Materials Simulations On Distributed CPU-GPU Systems,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.

Anzt, H., S. Tomov, and J. Dongarra, “Energy Efficiency and Performance Frontiers for Sparse Computations on GPU Supercomputers,” Sixth International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM '15), San Francisco, CA, ACM, February 2015.

Reed, D., and J. Dongarra, “Exascale Computing and Big Data,” Communications of the ACM, vol. 58, no. 7: ACM, pp. 56-68, July 2015.

Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096 - 5113, Oct 12, 2015.

Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in autotuning matrix multiplication for energy minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, December 2015.

2014

Benoit, A., Y. Robert, and S. K. Raina, “Efficient checkpoint/verification patterns for silent error detection,” Innovative Computing Laboratory Technical Report, no. ICL-UT-14-03: University of Tennessee, May 2014.

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,” Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.

2013

Turchenko, V., G. Bosilca, A. Bouteiller, and J. Dongarra, “Efficient Parallelization of Batch Pattern Training Algorithm on Many-core and Cluster Architectures,” 7th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems, Berlin, Germany, September 2013.

Li, Y., A. YarKhan, J. Dongarra, K. Seymour, and A. Hurault, “Enabling Workflows in GridSolve: Request Sequencing and Service Trading,” Journal of Supercomputing, vol. 64, issue 3, pp. 1133-1152, June 2013.

Bland, W., A. Bouteiller, T. Herault, J. Hursey, G. Bosilca, and J. Dongarra, “An evaluation of User-Level Failure Mitigation support in MPI,” Computing, vol. 95, issue 12, pp. 1171-1184, December 2013.

Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI,” Concurrency and Computation: Practice and Experience, July 2013.

2012

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An efficient distributed randomized solver with application to large dense linear systems,” ICL Technical Report, no. ICL-UT-12-02, July 2012.

Song, F., S. Tomov, and J. Dongarra, “Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems,” 26th ACM International Conference on Supercomputing (ICS 2012), San Servolo Island, Venice, Italy, ACM, June 2012.

Bland, W., “Enabling Application Resilience With and Without the MPI Standard,” 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Ottawa, Canada, May 2012.

Dongarra, J., H. Ltaief, P. Luszczek, and V. M. Weaver, “Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture,” The 2nd International Conference on Cloud and Green Computing (submitted), Xiangtan, Hunan, China, November 2012.

Ltaief, H., P. Luszczek, and J. Dongarra, “Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures using Tree Reduction,” Lecture Notes in Computer Science, vol. 7203, pp. 661-670, September 2012.

Bland, W., A. Bouteiller, T. Herault, J. Hursey, G. Bosilca, and J. Dongarra, “An Evaluation of User-Level Failure Mitigation Support in MPI,” Proceedings of Recent Advances in Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, Springer, September 2012.

Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Extending the Scope of the Checkpoint-on-Failure Protocol for Forward Recovery in Standard MPI,” University of Tennessee Computer Science Technical Report, no. ut-cs-12-702, 2012.

2011

Song, F., S. Tomov, and J. Dongarra, “Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-668 (also LAWN 250), June 2011.

Lively, C., X. Wu, V. Taylor, S. Moore, H-C. Chang, and K. Cameron, “Energy and performance characteristics of different parallel implementations of scientific applications on multicore systems,” International Journal of High Performance Computing Applications, vol. 25, no. 3, pp. 342-350, 2011.

Luszczek, P., E. Meek, S. Moore, D. Terpstra, V. M. Weaver, and J. Dongarra, “Evaluation of the HPC Challenge Benchmarks in Virtualized Environments,” 6th Workshop on Virtualization in High-Performance Cloud Computing, Bordeaux, France, August 2011.
