Publications

Export 831 results:
Filters: Author is Jack Dongarra  [Clear All Filters]
2017
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs,” Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, New York, NY, USA, ACM, pp. 1–10, February 2017. DOI: 10.1145/3026937.3026940  (552.62 KB)
Faverge, M., J. Langou, Y. Robert, and J. Dongarra, Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Orlando, FL, IEEE, May 2017. DOI: 10.1109/IPDPS.2017.46  (328.15 KB)
Anzt, H., J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, Bringing High Performance Computing to Big Data Algorithms,” Handbook of Big Data Technologies: Springer, 2017. DOI: 10.1007/978-3-319-49340-4
Abdelfattah, A., K. Arturov, C. Cecka, J. Dongarra, C. Freitag, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., C++ API for Batch BLAS,” SLATE Working Notes, no. 4, ICL-UT-17-12: University of Tennessee, December 2017.  (1.89 MB)
Gates, M., P. Luszczek, A. Abdelfattah, J. Kurzak, J. Dongarra, K. Arturov, C. Cecka, and C. Freitag, C++ API for BLAS and LAPACK,” SLATE Working Notes, no. 2, ICL-UT-17-03: Innovative Computing Laboratory, University of Tennessee, June 2017.  (1.12 MB)
Fayad, D., J. Kurzak, P. Luszczek, P. Wu, and J. Dongarra, The Case for Directive Programming for Accelerator Autotuner Optimization,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-07: University of Tennessee, October 2017.  (341.52 KB)
Yamazaki, I., M. Hoemmen, P. Luszczek, and J. Dongarra, Comparing performance of s-step and pipelined GMRES on distributed-memory multicore CPUs , Pittsburgh, Pennsylvania, SIAM Annual Meeting, July 2017.  (748 KB)
Kurzak, J., P. Luszczek, I. Yamazaki, Y. Robert, and J. Dongarra, Design and Implementation of the PULSAR Programming System for Large Scale Computing,” Supercomputing Frontiers and Innovations, vol. 4, issue 1, 2017. DOI: 10.14529/jsfi170101
Dongarra, J., S. Hammarling, N. Higham, S. Relton, P. Valero-Lara, and M. Zounon, The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems,” International Conference on Computational Science (ICCS 2017), Zürich, Switzerland, Elsevier, June 2017. DOI: DOI:10.1016/j.procs.2017.05.138
Kurzak, J., P. Wu, M. Gates, I. Yamazaki, P. Luszczek, G. Ragghianti, and J. Dongarra, Designing SLATE: Software for Linear Algebra Targeting Exascale,” SLATE Working Notes, no. 3, ICL-UT-17-06: Innovative Computing Laboratory, University of Tennessee, October 2017.  (2.8 MB)
Hoque, R., T. Herault, G. Bosilca, and J. Dongarra, Dynamic Task Discovery in PaRSEC- A data-flow task-based Runtime,” ScalA17, Denver, ACM, September 2017. DOI: 10.1145/3148226.3148233  (1.15 MB)
M. Lopez, G., V. Larrea, W. Joubert, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, Evaluation of Directive-based Performance Portable Programming Models,” International Journal of High Performance Computing and Networking (IJHPCN), vol. (In Press).
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures,” Procedia Computer Science, vol. 108, pp. 606–615, June 2017. DOI: 10.1016/j.procs.2017.05.250
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Fast Cholesky Factorization on GPUs for Batch and Native Modes in MAGMA,” Journal of Computational Science, vol. 20, pp. 85–93, May 2017. DOI: 10.1016/j.jocs.2016.12.009
Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Flexible Batched Sparse Matrix Vector Product on GPUs , Denver, Colorado, ScalA'17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, November 2017.  (16.8 MB)
Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Flexible Batched Sparse Matrix-Vector Product on GPUs,” Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '17), Denver, Colorado, ACM Press, November 2017. DOI: 10.1145/3148226.3148230
Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Flexible Batched Sparse Matrix-Vector Product on GPUs,” Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '17), Denver, Colorado, ACM Press, November 2017. DOI: 10.1145/3148226.3148230
Kabir, K., A. Haidar, S. Tomov, A. Bouteiller, and J. Dongarra, A Framework for Out of Memory SVD Algorithms,” ISC High Performance 2017, pp. 158–178, June 2017. DOI: 10.1007/978-3-319-58667-0_9  (393.22 KB)
Haidar, A., A. Abdelfattah, S. Tomov, and J. Dongarra, High-performance Cholesky Factorization for GPU-only Execution,” Proceedings of the General Purpose GPUs (GPGPU-10), Austin, TX, ACM, February 2017.  (872.18 KB)
Yamazaki, I., M. Hoemmen, P. Luszczek, and J. Dongarra, Improving Performance of GMRES by Reducing Communication and Pipelining Global Collectives,” Proceedings of The 18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017), Best Paper Award, Orlando, FL, June 2017.  (453.66 KB)
Haidar, A., P. Wu, S. Tomov, and J. Dongarra, Investigating Half Precision Arithmetic to Accelerate Dense Linear System Solvers,” ScalA17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, ACM.  (766.35 KB)
Yamazaki, I., and J. Dongarra, LAWN 294: Aasen's Symmetric Inde nite Linear Solvers in LAPACK,” LAPACK Working Note, no. LAWN 294, ICL-UT-17-13: University of Tennessee, December 2017.  (854.1 KB)
Bell, G., D. Bailey, A. H. Karp, J. Dongarra, and K. Walsh, A Look Back on 30 Years of the Gordon Bell Prize,” International Journal of High Performance Computing and Networking, vol. 31, issue 6, pp. 469–484, 2017.
Ng, L., K. Wong, A. Haidar, S. Tomov, and J. Dongarra, MagmaDNN – High-Performance Data Analytics for Manycore GPUs and CPUs , Knoxville, TN, 2017 Summer Research Experiences for Undergraduate (REU), Presentation, December 2017.  (5.06 MB)
Anzt, H., E. Boman, J. Dongarra, G. Flegar, M. Gates, M. Heroux, M. Hoemmen, J. Kurzak, P. Luszczek, S. Rajamanickam, et al., MAGMA-sparse Interface Design Whitepaper,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.  (1.28 MB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Novel HPC Techniques to Batch Execution of Many Variable Size BLAS Computations on GPUs,” International Conference on Supercomputing (ICS '17), Chicago, Illinois, ACM, June 2017. DOI: 10.1145/3079079.3079103
Dongarra, J., S. Hammarling, N. Higham, S. Relton, and M. Zounon, Optimized Batched Linear Algebra for Modern Architectures,” Euro-Par 2017, Santiago de Compostela, Spain, Springer, August 2017. DOI: 10.1007/978-3-319-64203-1_37
Dong, T., A. Haidar, S. Tomov, and J. Dongarra, Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices,” International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.  (364.95 KB)
Haidar, A., K. Kabir, D. Fayad, S. Tomov, and J. Dongarra, Out of Memory SVD Solver for Big Data,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Waltham, MA, IEEE, September 2017.  (1.33 MB)
Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., PLASMA 17 Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.  (7.57 MB)
Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., PLASMA 17.1 Functionality Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.  (1.8 MB)
Dongarra, J., A. Haidar, O. Hernandez, S. Tomov, and M. Grentla Venkata, POMPEI: Programming with OpenMP4 for Exascale Investigations,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.  (1.1 MB)
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017.  (908.84 KB)
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, Power-Aware HPC on Intel Xeon Phi KNL Processors , Frankfurt, Germany, ISC High Performance (ISC17), Intel Booth Presentation, June 2017.  (5.87 MB)
Anzt, H., M. Gates, J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, Preconditioned Krylov Solvers on GPUs,” Parallel Computing, June 2017. DOI: 10.1016/j.parco.2017.05.006
Dongarra, J., Report on the TianHe-2A System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-04: University of Tennessee, September 2017.  (7.15 MB)
Abdelfattah, A., H. Anzt, A. Bouteiller, A. Danalis, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., Roadmap for the Development of a Linear Algebra Library for Exascale Computing: SLATE: Software for Linear Algebra Targeting Exascale,” SLATE Working Notes, no. 1, ICL-UT-17-02: Innovative Computing Laboratory, University of Tennessee, June 2017.  (2.8 MB)
Yamazaki, I., S. Tomov, and J. Dongarra, Sampling Algorithms to Update Truncated SVD,” IEEE International Conference on Big Data, December 2017.
Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, Small Tensor Operations on Advanced Architectures for High-Order Applications,” University of Tennessee Computer Science Technical Report, no. UT-EECS-17-749: Innovative Computing Laboratory, University of Tennessee, April 2017.  (1.09 MB)
Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, Solving Dense Symmetric Indefinite Systems using GPUs,” Concurrency and Computation: Practice and Experience, vol. 29, issue 9, March 2017. DOI: 10.1002/cpe.4055  (1.94 MB)
Yamazaki, I., S. Nooshabadi, S. Tomov, and J. Dongarra, Structure-aware Linear Solver for Realtime Convex Optimization for Embedded Systems,” IEEE Embedded Systems Letters, vol. 9, issue 3, pp. 61–64, May 2017. DOI: 10.1109/LES.2017.2700401  (339.11 KB)
Anzt, H., J. Dongarra, G. Flegar, E. S. Quintana-Ortí, and A. E. Thomas, Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning,” International Conference on Computational Science (ICCS 2017), vol. 108, Zurich, Switzerland, Procedia Computer Science, pp. 1783-1792, June 2017.
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning,” 46th International Conference on Parallel Processing (ICPP), Bristol, United Kingdom, IEEE, August 2017. DOI: 10.1109/ICPP.2017.18
2018
Dongarra, J., V. Getov, and K. Walsh, The 30th Anniversary of the Supercomputing Conference: Bringing the Future Closer—Supercomputing History and the Immortality of Now,” Computer, vol. 51, issue 10, pp. 74–85, November 2018. DOI: 10.1109/MC.2018.3971352
Jagode, H., A. Danalis, and J. Dongarra, Accelerating NWChem Coupled Cluster through dataflow-based Execution,” The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 540--551, July 2018. DOI: 10.1177/1094342016672543  (1.68 MB)
Dong, T., A. Haidar, S. Tomov, and J. Dongarra, Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018. DOI: 10.1016/j.jocs.2018.01.007  (2.18 MB)
Gates, M., S. Tomov, and J. Dongarra, Accelerating the SVD Two Stage Bidiagonal Reduction and Divide and Conquer Using GPUs,” Parallel Computing, vol. 74, pp. 3–18, May 2018. DOI: 10.1016/j.parco.2017.10.004
Luo, X., W. Wu, G. Bosilca, T. Patinyasakdikul, L. Wang, and J. Dongarra, ADAPT: An Event-Based Adaptive Collective Communication Framework,” The 27th International Symposium on High-Performance Parallel and Distributed Computing (HPDC '18), Tempe, Arizona, ACM Press, June 2018. DOI: 10.1145/3208040.3208054  (493.65 KB)
Anzt, H., J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Ortí, Adaptive Precision in Block‐Jacobi Preconditioning for Iterative Sparse Linear System Solvers,” Concurrency Computation: Practice and Experience, March 2018. DOI: 10.1002/cpe.4460
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-09: Innovative Computing Laboratory, University of Tennessee, September 2018.  (3.74 MB)

Pages