Publications

2019
Anzt, H., J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Ortí, “Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers,” Concurrency and Computation: Practice and Experience, vol. 31, no. 6, pp. e4460, 2019.
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, “Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, “Checkpointing Strategies for Shared High-Performance Computing Platforms,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 28–52, 2019.
Benoit, A., A. Cavelan, F. M. Ciorba, V. Le Fèvre, and Y. Robert, “Combining checkpointing and replication for reliable execution of linear workflows with fail-stop and silent errors,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 2–27, 2019.
Le Fèvre, V., T. Herault, Y. Robert, A. Bouteiller, A. Hori, G. Bosilca, and J. Dongarra, “Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,” Parallel Computing, vol. 85, pp. 1–12, July 2019.
Kaya, O., and Y. Robert, “Computing dense tensor decompositions with optimal dimension trees,” Algorithmica, to appear, 2019.
Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, “Co-scheduling HPC workloads on cache-partitioned CMP platforms,” Int. Journal of High Performance Computing Applications, to appear, 2019.
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, “Design and Implementation for FFT-ECP on Distributed Accelerated Systems,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-05: University of Tennessee, April 2019.
Bosilca, G., A. Bouteiller, T. Herault, V. Le Fèvre, Y. Robert, and J. Dongarra, “Distributed Termination Detection for HPC Task-Based Environments,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-14: University of Tennessee, June 2018, 2019.
M. Lopez, G., W. Joubert, V. Larrea, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, “Evaluation of Directive-Based Performance Portable Programming Models,” International Journal of High Performance Computing and Networking (to appear), 2019.
Abdelfattah, A., S. Tomov, and J. Dongarra, “Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, “FFT-ECP Fast Fourier Transform,” Houston, TX, 2019 ECP Annual Meeting (Research Poster), January 2019.
Han, L., V. Le Fèvre, L-C. Canon, Y. Robert, and F. Vivien, “A Generic Approach to Scheduling and Checkpointing Workflows,” Int. Journal of High Performance Computing Applications, to appear, 2019.
Wong, K., S. Tomov, and J. Dongarra, “Hands-on Research and Training in High-Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments,” ISC High Performance, Frankfurt, Germany, Springer International Publishing, June 2019.
Beck, M., T. Moore, and P. Luszczek, “Interoperable Convergence of Storage, Networking, and Computation,” Future of Information and Communication Conference (FICC), San Francisco, Science and Information (SAI), March 2019.
Beck, M., T. Moore, P. Luszczek, and A. Danalis, “Interoperable Convergence of Storage, Networking, and Computation,” FICC 2019, San Francisco, CA, Springer, March 14-15, 2019.
Losada, N., G. Bosilca, A. Bouteiller, P. González, and M. J. Martín, “Local Rollback for Resilient MPI Applications with Application-Level Checkpointing and Message Logging,” Future Generation Computer Systems, vol. 91, pp. 450–464, February 2019.
Nichols, D., K. Wong, S. Tomov, L. Ng, S. Chen, and A. Gessinger, “MagmaDNN: Accelerated Deep Learning Using MAGMA,” Practice and Experience in Advanced Research Computing (PEARC ’19), Chicago, IL, ACM, July 2019.
Nichols, D., N-S. Tomov, F. Betancourt, S. Tomov, K. Wong, and J. Dongarra, “MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing,” ISC High Performance, Frankfurt, Germany, Springer International Publishing, June 2019.
Bai, Z., J. Dongarra, D. Lu, and I. Yamazaki, “Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,” International Parallel and Distributed Processing Symposium (IPDPS), May 2019.
Betancourt, F., K. Wong, E. Asemota, Q. Marshall, D. Nichols, and S. Tomov, “OpenDIEL: A Parallel Workflow Engine and Data Analytics Framework,” Practice and Experience in Advanced Research Computing (PEARC ’19), Chicago, IL, ACM, July 2019.
Yamazaki, I., E. Chow, A. Bouteiller, and J. Dongarra, “Performance of Asynchronous Optimized Schwarz with One-sided Communication,” Parallel Computing, vol. 86, pp. 66–81, August 2019.
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., “PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software (to appear), 2019.
Aupy, G., A. Gainaru, V. Honoré, P. Raghavan, Y. Robert, and H. Sun, “Reservation strategies for stochastic jobs,” IPDPS'2019, the 33rd IEEE International Parallel and Distributed Processing Symposium: IEEE Computer Society Press, 2019.
Gao, Y., L-C. Canon, Y. Robert, and F. Vivien, “Scheduling independent stochastic tasks on heterogeneous cloud platforms,” Cluster 2019: IEEE Computer Society Press, 2019.
Canon, L-C., A. Kong Win Chang, Y. Robert, and F. Vivien, “Scheduling independent stochastic tasks under deadline and budget constraints,” Int. Journal of High Performance Computing Applications, to appear, 2019.
Charara, A., M. Gates, J. Kurzak, and J. Dongarra, “SLATE Developers' Guide,” SLATE Working Notes, no. 11, ICL-UT-19-02: Innovative Computing Laboratory, University of Tennessee, January 2019.
Charara, A., J. Dongarra, M. Gates, J. Kurzak, and A. YarKhan, “SLATE Mixed Precision Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-03: University of Tennessee, April 2019.
Gates, M., A. Charara, J. Kurzak, and J. Dongarra, “SLATE Users' Guide,” SLATE Working Notes, no. 10, ICL-UT-19-01: Innovative Computing Laboratory, University of Tennessee, January 2019.
Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, “SLATE Working Note 12: Implementing Matrix Inversions,” SLATE Working Notes, no. 12, ICL-UT-19-04: Innovative Computing Laboratory, University of Tennessee, June 2019.
Hori, A., Y. Tsujita, A. Shimada, K. Yoshinaga, N. Mitaro, G. Fukazawa, M. Sato, G. Bosilca, A. Bouteiller, and T. Herault, “System Software for Many-Core and Multi-core Architecture,” Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project, Singapore, Springer Singapore, pp. 59–75, 2019.
Anzt, H., G. Flegar, T. Grützmacher, and E. S. Quintana-Ortí, “Toward a Modular Precision Ecosystem for High-Performance Computing,” The International Journal of High Performance Computing Applications, September 2019.
Anzt, H., Y-C. Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Ortí, Y. M. Tsai, and W. Wang, “Towards Continuous Benchmarking,” Proceedings of the Platform for Advanced Scientific Computing Conference (PASC '19), Zurich, Switzerland, ACM Press, New York, New York, USA, 2019.
2018
Dongarra, J., V. Getov, and K. Walsh, “The 30th Anniversary of the Supercomputing Conference: Bringing the Future Closer—Supercomputing History and the Immortality of Now,” Computer, vol. 51, issue 10, pp. 74–85, November 2018.
Cheng, X., A. Soma, E. D'Azevedo, K. Wong, and S. Tomov, “Accelerating 2D FFT: Exploit GPU Tensor Cores through Mixed-Precision,” Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), ACM Student Research Poster, November 2018.
Tomov, S., M. Gates, and A. Haidar, “Accelerating Linear Algebra with MAGMA,” Knoxville, TN, ECP Annual Meeting 2018, Tutorial, February 2018.
Jagode, H., A. Danalis, and J. Dongarra, “Accelerating NWChem Coupled Cluster through dataflow-based Execution,” The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 540–551, July 2018.
Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018.
Gates, M., S. Tomov, and J. Dongarra, “Accelerating the SVD Two Stage Bidiagonal Reduction and Divide and Conquer Using GPUs,” Parallel Computing, vol. 74, pp. 3–18, May 2018.
Luo, X., W. Wu, G. Bosilca, T. Patinyasakdikul, L. Wang, and J. Dongarra, “ADAPT: An Event-Based Adaptive Collective Communication Framework,” The 27th International Symposium on High-Performance Parallel and Distributed Computing (HPDC '18), Tempe, Arizona, ACM Press, June 2018.
Anzt, H., J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Ortí, “Adaptive Precision in Block-Jacobi Preconditioning for Iterative Sparse Linear System Solvers,” Concurrency and Computation: Practice and Experience, March 2018.
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, “Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-09: Innovative Computing Laboratory, University of Tennessee, September 2018.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018.
Yamazaki, I., A. Abdelfattah, A. Ida, S. Ohshima, S. Tomov, R. Yokota, and J. Dongarra, “Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU Clusters,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, IEEE, May 2018.
Balaprakash, P., J. Dongarra, T. Gamblin, M. Hall, J. Hollingsworth, B. Norris, and R. Vuduc, “Autotuning in High-Performance Computing Applications,” Proceedings of the IEEE, vol. 106, issue 11, pp. 2068–2083, November 2018.
Dongarra, J., M. Gates, J. Kurzak, P. Luszczek, and Y. Tsai, “Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators,” Proceedings of the IEEE, vol. 106, issue 11, pp. 2040–2055, November 2018.
Luszczek, P., J. Kurzak, I. Yamazaki, D. Keffer, V. Maroulas, and J. Dongarra, “Autotuning Techniques for Performance-Portable Point Set Registration in 3D,” Supercomputing Frontiers and Innovations, vol. 5, no. 4, December 2018.
Dongarra, J., I. Duff, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Hogg, P. Valero Lara, P. Luszczek, M. Zounon, et al., “Batched BLAS (Basic Linear Algebra Subprograms) 2018 Specification,” July 2018.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Batched One-Sided Factorizations of Tiny Matrices Using GPUs: Challenges and Countermeasures,” Journal of Computational Science, vol. 26, pp. 226–236, May 2018.
Marques, O., J. Demmel, and P. B. Vasconcelos, “Bidiagonal SVD Computation via an Associated Tridiagonal Eigenproblem,” LAPACK Working Note, no. LAWN 295, ICL-UT-18-02: University of Tennessee, April 2018.
