Publications

Export 948 results:
2018
Kurzak, J., M. Gates, I. Yamazaki, A. Charara, A. YarKhan, J. Finney, G. Ragghianti, P. Luszczek, and J. Dongarra, Linear Systems Performance Report,” SLATE Working Notes, no. 8, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.  (1.64 MB)
Benoit, A., A. Cavelan, Y. Robert, and H. Sun, Multi-Level Checkpointing and Silent Error Detection for Linear Workflows,” Journal of Computational Science, vol. 28, pp. 398–415, September 2018.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Best Paper Award, Vancouver, BC, Canada, IEEE, May 2018.  (899.3 KB)
Anzt, H., M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra, Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs,” The International Journal of High Performance Computing Applications, vol. 32, no. 2, pp. 220–230, March 2018.  (2.08 MB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization,” IEEE High Performance Extreme Computing Conference (HPEC’18), Waltham, MA, IEEE, September 2018.  (729.87 KB)
Kurzak, J., M. Gates, A. YarKhan, I. Yamazaki, P. Wu, P. Luszczek, J. Finney, and J. Dongarra, Parallel BLAS Performance Report,” SLATE Working Notes, no. 5, ICL-UT-18-01: University of Tennessee, April 2018.  (4.39 MB)
Kurzak, J., M. Gates, A. YarKhan, I. Yamazaki, P. Luszczek, J. Finney, and J. Dongarra, Parallel Norms Performance Report,” SLATE Working Notes, no. 6, ICL-UT-18-06: Innovative Computing Laboratory, University of Tennessee, June 2018.  (1.13 MB)
Anzt, H., E. Chow, and J. Dongarra, ParILUT - A New Parallel Threshold ILU,” SIAM Journal on Scientific Computing, vol. 40, issue 4: SIAM, pp. C503–C519, July 2018.  (19.26 MB)
Benoit, A., S. Perarnau, L. Pottier, and Y. Robert, A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures,” The 47th International Conference on Parallel Processing (ICPP 2018), Eugene, OR, IEEE Computer Society Press, August 2018.  (868.44 KB)
Castain, R., J. Hursey, A. Bouteiller, and D. Solt, PMIx: Process Management for Exascale Environments,” Parallel Computing, vol. 79, pp. 9–29, January 2018.
Hoemmen, M., and I. Yamazaki, Production Implementations of Pipelined & Communication-Avoiding Iterative Linear Solvers , Tokyo, Japan, SIAM Conference on Parallel Processing for Scientific Computing, March 2018.  (2.34 MB)
Aupy, G., and Y. Robert, Scheduling for Fault-Tolerance: An Introduction,” Topics in Parallel and Distributed Computing: Springer International Publishing, pp. 143–170, 2018.
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale,” SIAM Review, vol. 60, issue 4, pp. 808–865, November 2018.
Jagode, H., A. Danalis, H. Anzt, I. Yamazaki, M. Hoemmen, E. Boman, S. Tomov, and J. Dongarra, Software-Defined Events (SDEs) in MAGMA-Sparse,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-12: University of Tennessee, December 2018.  (481.69 KB)
Anzt, H., I. Yamazaki, M. Hoemmen, E. Boman, and J. Dongarra, Solver Interface & Performance on Cori,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-05: University of Tennessee, June 2018.  (188.05 KB)
Zaitsev, D., S. Tomov, and J. Dongarra, Solving Linear Diophantine Systems on Parallel Architectures,” IEEE Transactions on Parallel and Distributed Systems, October 2018.
Bernholdt, D. E., S. Boehm, G. Bosilca, M. Grentla Venkata, R. E. Grant, T. Naughton, H. P. Pritchard, M. Schulz, and G. R. Vallee, A Survey of MPI Usage in the US Exascale Computing Project,” Concurrency Computation: Practice and Experience, September 2018.  (359.54 KB)
Yamazaki, I., J. Kurzak, P. Wu, M. Zounon, and J. Dongarra, Symmetric Indefinite Linear Solver using OpenMP Task on Multicore Architectures,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 8, pp. 1879–1892, August 2018.  (2.88 MB)
Bosilca, G., D. Genet, R. Harrison, T. Herault, M. Mahdi Javanmard, C. Peng, and E. Valeev, Tensor Contraction on Distributed Hybrid Architectures using a Task-Based Runtime System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-13: University of Tennessee, December 2018.  (326.11 KB)
Haidar, A., S. Tomov, A. Abdelfattah, M. Zounon, and J. Dongarra, Using GPU FP16 Tensor Cores Arithmetic to Accelerate Mixed-Precision Iterative Refinement Solvers and Reduce Energy Consumption,” ISC High Performance (ISC'18), Best Poster, Frankfurt, Germany, June 2018.  (3.01 MB)
Chow, E., H. Anzt, J. Scott, and J. Dongarra, Using Jacobi Iterations and Blocking for Solving Sparse Triangular Systems in Incomplete Factorization Preconditioning,” Journal of Parallel and Distributed Computing, vol. 119, pp. 219–230, November 2018.  (273.53 KB)
Anzt, H., J. Dongarra, G. Flegar, and T. Gruetzmacher, Variable-Size Batched Condition Number Calculation on GPUs,” SBAC-PAD, Lyon, France, September 2018.  (509.3 KB)
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Variable-Size Batched Gauss–Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors,” Parallel Computing, January 2018.  (1.9 MB)
2019
Anzt, H., J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Orti, Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers,” Concurrency and Computation: Practice and Experience, vol. 31, no. 6, pp. e4460, 2019.  (341.54 KB)
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, Checkpointing Strategies for Shared High-Performance Computing Platforms,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 28–52, 2019.
Benoit, A., A. Cavelan, F. M. Ciorba, V. Le Fèvre, and Y. Robert, Combining checkpointing and replication for reliable execution of linear workflows with fail-stop and silent errors,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 2-27, 2019.  (754.6 KB)
Le Fèvre, V., T. Herault, Y. Robert, A. Bouteiller, A. Hori, G. Bosilca, and J. Dongarra, Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,” Parallel Computing, vol. 85, pp. 1–12, July 2019.  (865.18 KB)
Kaya, O., and Y. Robert, Computing dense tensor decompositions with optimal dimension trees,” Algorithmica, to appear, 2019.  (638.4 KB)
Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, Co-scheduling HPC workloads on cache-partitioned CMP platforms,” Int. Journal of High Performance Computing Applications, To appear, 2019.  (930.28 KB)
Bosilca, G., A. Bouteiller, T. Herault, V. Le Fèvre, Y. Robert, and J. Dongarra, Distributed Termination Detection for HPC Task-Based Environments,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-14: University of Tennessee, June 2018, 2019.
M. Lopez, G., W. Joubert, V. Larrea, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, Evaluation of Directive-Based Performance Portable Programming Models,” International Journal of High Performance Computing and Networking (to appear), 2019.
Han, L., V. Le Fèvre, L-C. Canon, Y. Robert, and F. Vivien, A Generic Approach to Scheduling and Checkpointing Workflows,” Int. Journal of High Performance Computing Applications, To appear, 2019.  (555.01 KB)
Beck, M., T. Moore, and P. Luszczek, Interoperable Convergence of Storage, Networking, and Computation,” Future of Information and Communication Conference (FICC), San Francisco, Science and Information (SAI), March 2019.  (1.8 MB)
Beck, M., T. Moore, P. Luszczek, and A. Danalis, Interoperable Convergence of Storage, Networking, and Computation,” FICC 2019, San Francisco, CA, Springer, March 14-15, 2019.  (2.64 MB)
Losada, N., G. Bosilca, A. Bouteiller, P. González, and M. J. Martín, Local Rollback for Resilient MPI Applications with Application-Level Checkpointing and Message Logging,” Future Generation Computer Systems, vol. 91, pp. 450-464, February 2019.  (1.16 MB)
Bai, Z., J. Dongarra, D. Lu, and I. Yamazaki, Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,” International Parallel and Distributed Processing Symposium (IPDPS), May 2019.
Yamazaki, I., E. Chow, A. Bouteiller, and J. Dongarra, Performance of Asynchronous Optimized Schwarz with One-sided Communication,” Parallel Computing, vol. 86, pp. 66–81, August 2019.
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software (to appear), 2019.  (7.5 MB)
Aupy, G., A. Gainaru, V. Honoré, P. Raghavan, Y. Robert, and H. Sun, Reservation strategies for stochastic jobs,” IPDPS'2019, the 33st IEEE International Parallel and Distributed Processing Symposium: IEEE Computer Society Press, 2019.  (808.93 KB)
Canon, L-C., A. Kong Win Chang, Y. Robert, and F. Vivien, Scheduling independent stochastic tasks under deadline and budget constraints,” Int. Journal of High Performance Computing Applications, vol. To appear, 2019.  (427.92 KB)
Charara, A., M. Gates, J. Kurzak, and J. Dongarra, SLATE Developers' Guide,” SLATE Working Notes, no. 11, ICL-UT-19-02: Innovative Computing Laboratory, University of Tennessee, January 2019.
Charara, A., J. Dongarra, M. Gates, J. Kurzak, and A. YarKhan, SLATE Mixed Precision Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-03: University of Tennessee, April 2019.  (1.04 MB)
Gates, M., A. Charara, J. Kurzak, and J. Dongarra, SLATE Users' Guide,” SLATE Working Notes, no. 10, ICL-UT-19-01: Innovative Computing Laboratory, University of Tennessee, January 2019.
Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, SLATE Working Note 12: Implementing Matrix Inversions,” SLATE Working Notes, no. 12, ICL-UT-19-04: Innovative Computing Laboratory, University of Tennessee, June 2019.  (1.61 MB)
Hori, A., Y. Tsujita, A. Shimada, K. Yoshinaga, N. Mitaro, G. Fukazawa, M. Sato, G. Bosilca, A. Bouteiller, and T. Herault, System Software for Many-Core and Multi-core Architecture,” Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project, Singapore, Springer Singapore, pp. 59–75, 2019.
Anzt, H., G. Flegar, T. Grützmacher, and E. S. Quintana-Ortí, Toward a modular precision ecosystem for high-performance computing,” The International Journal of High Performance Computing Applications, pp. 109434201984654, Sep-May 2019.  (1.93 MB)
Anzt, H., Y-C. Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Orti, Y. M. Tsai, W. Wang, and , Towards Continuous Benchmarking,” the Platform for Advanced Scientific Computing ConferenceProceedings of the Platform for Advanced Scientific Computing Conference on - PASC '19, Zurich, SwitzerlandNew York, New York, USA, ACM Press, 2019.  (1.51 MB)

Pages