Publications

Export 877 results:
Filters: Author is Jack Dongarra  [Clear All Filters]
2020
Lopez, F., E. Chow, S. Tomov, and J. Dongarra, Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-04: University of Tennessee, Knoxville, March 2020.  (188.51 KB)
Lopez, F., E. Chow, S. Tomov, and J. Dongarra, Asynchronous SGD for DNN training on Shared-memory Parallel Architectures,” Workshop on Scalable Deep Learning over Parallel And Distributed Infrastructures (ScaDL 2020), May 2020.  (188.51 KB)
Pei, Y., Q. Cao, G. Bosilca, P. Luszczek, V. Eijkhout, and J. Dongarra, Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime,” 21st IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2020), New Orleans, LA, IEEE, May 2020.  (1.33 MB)
Cao, Q., Y. Pei, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications,” Platform for Advanced Scientific Computing Conference (PASC20), Geneva, Switzerland, ACM, June 2020. DOI: 10.1145/3394277.3401846  (2.71 MB)
Tomov, S., A. Ayala, A. Haidar, and J. Dongarra, FFT-ECP API and High-Performance Library Prototype for 2-D and 3-D FFTs on Large-Scale Heterogeneous Systems with GPUs , no. FFT-ECP STML13-27: Innovative Computing Laboratory, University of Tennessee, January 2020.  (9.71 MB)
Jagode, H., A. Danalis, and J. Dongarra, Formulation of Requirements for new PAPI++ Software Package: Part I: Survey Results,” PAPI++ Working Notes, no. No. 1, ICL-UT-20-02: Innovative Computing Laboratory, University of Tennessee Knoxville, January 2020.  (1.49 MB)
Beckman, P., J. Dongarra, N. Ferrier, G. Fox, T. Moore, D. Reed, and M. Beck, Harnessing the Computing Continuum for Programming Our World,” Fog Computing: Theory and Practice: John Wiley & Sons, Inc., 2020. DOI: 10.1002/9781119551713.ch7  (1.4 MB)
Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, heFFTe: Highly Efficient FFT for Exascale,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020. DOI: 10.1007/978-3-030-50371-0_19  (2.62 MB)
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, hipMAGMA v1.0 : Zenodo, March 2020. DOI: 10.5281/zenodo.3908549
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, hipMAGMA v2.0 : Zenodo, July 2020. DOI: 10.5281/zenodo.3928667
Lindquist, N., P. Luszczek, and J. Dongarra, Improving the Performance of the GMRES method using Mixed-Precision Techniques,” Smoky Mountains Computational Sciences & Engineering Conference (SMC2020), August 2020.
Abdelfattah, A., S. Tomov, and J. Dongarra, Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Springer, Cham, June 2020. DOI: 10.1007/978-3-030-50417-5_18  (702.38 KB)
Anzt, H., Y-C. Chen, T. Cojean, J. Dongarra, G. Flegar, R. Nayak, E. S. Quintana-Orti, Y. Tsai, and W. Wang, Load-balancing Sparse Matrix Vector Product Kernels on GPUs,” ACM Transactions on Parallel Computing, issue 2, March 2020. DOI: 10.1145/3380930  (5.64 MB)
Haidar, A., H. Bayraktar, S. Tomov, J. Dongarra, and N. J. Higham, Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.  (1.03 MB)
Dongarra, J., L. Grigori, and N. J. Higham, Numerical Algorithms for High-Performance Computational Science,” Philosophical Transactions of the Royal Society A, vol. 378, issue 2166, 2020. DOI: 10.1098/rsta.2019.0066  (724.37 KB)
Gates, M., A. Charara, A. YarKhan, D. Sukkari, M. Al Farhan, and J. Dongarra, Performance Tuning SLATE,” SLATE Working Notes, no. 14, ICL-UT-20-01: Innovative Computing Laboratory, University of Tennessee, January 2020.  (1.29 MB)
Wong, K., S. Tomov, and J. Dongarra, Project-Based Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning,” The Journal of Computational Science Education, vol. 11, issue 1, pp. 36-44, January 2020. DOI: 10.22369/issn.2153-4136/11/1/7  (4.4 MB)
Lu, Y., I. Yamazaki, F. Ino, Y. Matsushita, S. Tomov, and J. Dongarra, Reducing the Amount of out-of-core Data Access for GPU-Accelerated Randomized SVD,” Concurrency and Computation: Practice and Experience, April 2020. DOI: 10.1002/cpe.5754  (1.43 MB)
Dongarra, J., Report on the Fujitsu Fugaku System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-06: University of Tennessee, June 2020.  (3.3 MB)
Gates, M., J. Kurzak, A. YarKhan, A. Charara, J. Finney, D. Sukkari, M. Al Farhan, I. Yamazaki, P. Wu, and J. Dongarra, SLATE Tutorial , Houston, TX, 2020 ECP Annual Meeting, February 2020.  (12.14 MB)
Gates, M., A. Charara, J. Kurzak, A. YarKhan, M. Al Farhan, D. Sukkari, and J. Dongarra, SLATE Users' Guide,” SLATE Working Notes, no. 10, ICL-UT-19-01: Innovative Computing Laboratory, University of Tennessee, July 2020.  (2.35 MB)
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, Twenty Years of Computational Science,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.  (149.66 KB)
Zhong, D., P. Shamis, Q. Cao, G. Bosilca, and J. Dongarra, Using Arm Scalable Vector Extension to optimize Open MPI,” 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020), Melbourne, Australia, IEEE/ACM, May 2020.
2019
Anzt, H., J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Orti, Adaptive Precision in Block-Jacobi Preconditioning for Iterative Sparse Linear System Solvers,” Concurrency and Computation: Practice and Experience, vol. 31, no. 6, pp. e4460, March 2019. DOI: 10.1002/cpe.4460  (341.54 KB)
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019. DOI: 10.1016/j.parco.2018.10.003  (3.27 MB)
Tomov, S., A. Abdelfattah, V. Barra, N. Beams, J. Brown, J-S. Camier, V. Dobrev, J. Dongarra, Y. Dudouit, P. Fischer, et al., CEED ECP Milestone Report: Performance Tuning of CEED Software and 1st and 2nd Wave Apps : Zenodo, October 2019. DOI: 10.5281/zenodo.3477618  (8.31 MB)
Davis, J., T. Gao, S. Chandrasekaran, H. Jagode, A. Danalis, P. Balaji, J. Dongarra, and M. Taufer, Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI,” 2019 International Conference on Parallel Computing (ParCo2019), Prague, Czech Republic, September 2019.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, Checkpointing Strategies for Shared High-Performance Computing Platforms,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 28–52, 2019.  (490.5 KB)
Le Fèvre, V., T. Herault, Y. Robert, A. Bouteiller, A. Hori, G. Bosilca, and J. Dongarra, Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,” Parallel Computing, vol. 85, pp. 1–12, July 2019. DOI: 10.1016/j.parco.2019.02.002  (865.18 KB)
Danalis, A., H. Jagode, H. Hanumantharayappa, S. Ragate, and J. Dongarra, Counter Inspection Toolkit: Making Sense out of Hardware Performance Events,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, February 2019. DOI: 10.1007/978-3-030-11987-4_2  (216.39 KB)
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, Design and Implementation for FFT-ECP on Distributed Accelerated Systems,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-05: University of Tennessee, April 2019.  (3.19 MB)
Yamazaki, I., A. Ida, R. Yokota, and J. Dongarra, Distributed-Memory Lattice H-Matrix Factorization,” The International Journal of High Performance Computing Applications, vol. 33, issue 5, pp. 1046–1063, August 2019. DOI: 10.1177/1094342019861139  (1.14 MB)
Danalis, A., H. Jagode, and J. Dongarra, Does your tool support PAPI SDEs yet? , Tahoe City, CA, 13th Scalable Tools Workshop, July 2019.  (3.09 MB)
YarKhan, A., J. Kurzak, A. Abdelfattah, and J. Dongarra, An Empirical View of SLATE Algorithms on Scalable Hybrid System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-08: University of Tennessee, Knoxville, September 2019.  (441.16 KB)
Lopez, M. G., W. Joubert, V. Larrea, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, Evaluation of Directive-Based Performance Portable Programming Models,” International Journal of High Performance Computing and Networking, vol. 14, issue 2, pp. 165-182. DOI: http://dx.doi.org/10.1504/IJHPCN.2017.10009064  (1.12 MB)
Pei, Y., G. Bosilca, I. Yamazaki, A. Ida, and J. Dongarra, Evaluation of Programming Models to Address Load Imbalance on Distributed Multi-Core CPUs: A Case Study with Block Low-Rank Factorization,” PAW-ATM Workshop at SC19, Denver, CO, ACM, November 2019.  (4.51 MB)
Abdelfattah, A., S. Tomov, and J. Dongarra, Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.  (675.5 KB)
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, FFT-ECP Fast Fourier Transform , Houston, TX, 2019 ECP Annual Meeting (Research Poster), January 2019.  (1.51 MB)
Tomov, S., A. Haidar, A. Ayala, H. Shaiek, and J. Dongarra, FFT-ECP Implementation Optimizations and Features Phase,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-12: University of Tennessee, October 2019.  (4.14 MB)
Herault, T., Y. Robert, G. Bosilca, and J. Dongarra, Generic Matrix Multiplication for Multi-GPU Accelerated Distributed-Memory Platforms over PaRSEC,” ScalA'19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.  (260.69 KB)
Shaiek, H., S. Tomov, A. Ayala, A. Haidar, and J. Dongarra, GPUDirect MPI Communications and Optimizations to Accelerate FFTs on Exascale Systems,” EuroMPI'19 Posters, Zurich, Switzerland, no. icl-ut-19-06: ICL, September 2019.  (2.25 MB)
Wong, K., S. Tomov, and J. Dongarra, Hands-on Research and Training in High-Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments,” ISC High Performance, Frankfurt, Germany, Springer International Publishing, June 2019.  (1016.52 KB)
Ayala, A., S. Tomov, X. Luo, H. Shaiek, A. Haidar, G. Bosilca, and J. Dongarra, Impacts of Multi-GPU MPI Collective Communications on Large FFT Computation,” Workshop on Exascale MPI (ExaMPI) at SC19, Denver, CO, November 2019.  (1.6 MB)
Luszczek, P., I. Yamazaki, and J. Dongarra, Increasing Accuracy of Iterative Refinement in Limited Floating-Point Arithmetic on Half-Precision Accelerators,” IEEE High Performance Extreme Computing Conference (HPEC 2019), Best Paper Finalist, Waltham, MA, IEEE, September 2019.  (470.21 KB)
Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, Least Squares Solvers for Distributed-Memory Machines with GPU Accelerators,” ACM International Conference on Supercomputing (ICS '19), Phoenix, Arizona, ACM, pp. 117–126, June 2019. DOI: 10.1145/3324989.3325719  (1.63 MB)
Kurzak, J., M. Gates, A. Charara, A. YarKhan, I. Yamazaki, and J. Dongarra, Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators,” Euro-Par 2019: Parallel Processing, vol. 11725: Springer, pp. 495–506, August 2019. DOI: 10.1007/978-3-030-29400-7_35
Ng, L., S. Chen, A. Gessinger, D. Nichols, S. Cheng, A. Meenasorna, K. Wong, S. Tomov, A. Haidar, E. D'Azevedo, et al., MagmaDNN 0.2 High-Performance Data Analytics for Manycore GPUs and CPUs : University of Tennessee, January 2019. DOI: 10.13140/RG.2.2.14906.64961  (7.84 MB)
Nichols, D., N-S. Tomov, F. Betancourt, S. Tomov, K. Wong, and J. Dongarra, MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing,” ISC High Performance, Frankfurt, Germany, Springer International Publishing, June 2019.  (1.37 MB) (8.72 MB)
Kurzak, J., Y. Tsai, M. Gates, A. Abdelfattah, and J. Dongarra, Massively Parallel Automated Software Tuning,” 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan, ACM Press, August 2019. DOI: 10.1145/3337821.3337908  (911.88 KB)
Bai, Z., J. Dongarra, D. Lu, and I. Yamazaki, Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,” International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.  (480.73 KB)

Pages