Publications

Submitted
Haidar, A., H. Jagode, P. Vaccaro, S. Tomov, and J. Dongarra, “Investigating Power Capping toward Energy-Efficient Scientific Applications”, Concurrency and Computation: Practice and Experience (CCPE): Special Issue on Power-Aware Computing 2017, Submitted.
2017
Jagode, H., A. Danalis, and J. Dongarra, “Accelerating NWChem Coupled Cluster through Dataflow-Based Execution”, The International Journal of High Performance Computing Applications, pp. 1–13, January 2017.
Aupy, G., Y. Robert, and F. Vivien, “Assuming failure independence: are we right to be wrong?”, 3rd International Workshop on Fault Tolerant Systems (FTS), Hawaii, US, IEEE, September 2017.
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, “Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs”, Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, New York, NY, USA, ACM, pp. 1–10, February 2017.
Faverge, M., J. Langou, Y. Robert, and J. Dongarra, “Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation”, IEEE International Parallel and Distributed Processing Symposium (IPDPS), Orlando, FL, IEEE, May 2017.
Gates, M., P. Luszczek, J. Kurzak, J. Dongarra, K. Arturov, C. Cecka, and C. Freitag, “C++ API for BLAS and LAPACK”, SLATE Working Notes, no. 2, ICL-UT-17-03: Innovative Computing Laboratory, University of Tennessee, June 2017.
Han, L., L-C. Canon, H. Casanova, Y. Robert, and F. Vivien, “Checkpointing Workflows for Fail-Stop Errors”, IEEE Cluster, Hawaii, US, IEEE, September 2017.
Yamazaki, I., M. Hoemmen, P. Luszczek, and J. Dongarra, “Comparing performance of s-step and pipelined GMRES on distributed-memory multicore CPUs”, Pittsburgh, Pennsylvania, SIAM Annual Meeting, July 2017.
Aupy, G., A. Benoit, L. Pottier, P. Raghavan, Y. Robert, and M. Shantharam, “Co-Scheduling Algorithms for Cache-Partitioned Systems”, 19th Workshop on Advances in Parallel and Distributed Computational Models, Orlando, FL, IEEE Computer Society Press, May 2017.
Jagode, H., “Dataflow Programming Paradigms for Computational Chemistry Methods”, Innovative Computing Laboratory Technical Report, no. ICL-UT-17-01, Knoxville, TN, University of Tennessee, May 2017.
Kurzak, J., P. Wu, M. Gates, I. Yamazaki, P. Luszczek, G. Ragghianti, and J. Dongarra, “Designing SLATE: Software for Linear Algebra Targeting Exascale”, SLATE Working Notes, no. 3, ICL-UT-17-06: Innovative Computing Laboratory, University of Tennessee, October 2017.
Zhao, Y., L. Wan, W. Wu, G. Bosilca, R. Vuduc, J. Ye, W. Tang, and Z. Xu, “Efficient Communications in Training Large Scale Neural Networks”, ACM MultiMedia Workshop 2017, Mountain View, CA, ACM, October 2017.
M. Lopez, G., V. Larrea, W. Joubert, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, “Evaluation of Directive-based Performance Portable Programming Models”, International Journal of High Performance Computing and Networking (IJHPCN), in press, 2017.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures”, Procedia Computer Science, vol. 108, pp. 606–615, June 2017.
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “A Failure Detector for HPC Platforms”, The International Journal of High Performance Computing Applications, July 2017.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Fast Cholesky Factorization on GPUs for Batch and Native Modes in MAGMA”, Journal of Computational Science, vol. 20, pp. 85–93, May 2017.
Kabir, K., A. Haidar, S. Tomov, A. Bouteiller, and J. Dongarra, “A Framework for Out of Memory SVD Algorithms”, ISC High Performance 2017, pp. 158–178, June 2017.
Benoit, A., F. Cappello, A. Cavelan, Y. Robert, and H. Sun, “Identifying the Right Replication Level to Detect and Correct Silent Errors at Scale”, 2017 Workshop on Fault-Tolerance for HPC at Extreme Scale, Washington, DC, ACM, June 2017.
Yamazaki, I., M. Hoemmen, P. Luszczek, and J. Dongarra, “Improving performance of GMRES by reducing communication and pipelining global collectives”, Proceedings of The 18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017), Orlando, FL, June 2017.
Anzt, H., E. Boman, J. Dongarra, G. Flegar, M. Gates, M. Heroux, M. Hoemmen, J. Kurzak, P. Luszczek, S. Rajamanickam, et al., “MAGMA-sparse Interface Design Whitepaper”, Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.
Benoit, A., A. Cavelan, Y. Robert, and H. Sun, “Multi-Level Checkpointing and Silent Error Detection for Linear Workflows”, Journal of Computational Science, April 2017.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Novel HPC Techniques to Batch Execution of Many Variable Size BLAS Computations on GPUs”, International Conference on Supercomputing (ICS '17), Chicago, Illinois, ACM, June 2017.
Benoit, A., A. Cavelan, V. Le Fèvre, and Y. Robert, “Optimal Checkpointing Period with replicated execution on heterogeneous platforms”, 2017 Workshop on Fault-Tolerance for HPC at Extreme Scale, Washington, DC, IEEE Computer Society Press, June 2017.
Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices”, International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, “Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi”, 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017.
Anzt, H., M. Gates, J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Preconditioned Krylov solvers on GPUs”, Parallel Computing, May 2017.
Dongarra, J., “Report on the TianHe-2A System”, Innovative Computing Laboratory Technical Report, no. ICL-UT-17-04: University of Tennessee, September 2017.
Fang, A., A. Cavelan, Y. Robert, and A. Chien, “Resilience for Stencil Computations with Latent Errors”, International Conference on Parallel Processing (ICPP), Bristol, UK, IEEE Computer Society Press, August 2017.
Benoit, A., L. Pottier, and Y. Robert, “Resilient Co-Scheduling of Malleable Applications”, International Journal of High Performance Computing Applications (IJHPCA), May 2017.
Abdelfattah, A., H. Anzt, A. Bouteiller, A. Danalis, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., “Roadmap for the Development of a Linear Algebra Library for Exascale Computing: SLATE: Software for Linear Algebra Targeting Exascale”, SLATE Working Notes, no. 1, ICL-UT-17-02: Innovative Computing Laboratory, University of Tennessee, June 2017.
Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, “Small Tensor Operations on Advanced Architectures for High-order Applications”, University of Tennessee Computer Science Technical Report, no. UT-EECS-17-749: Innovative Computing Laboratory, University of Tennessee, April 2017.
Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Solving Dense Symmetric Indefinite Systems using GPUs”, Concurrency and Computation: Practice and Experience, vol. 29, issue 9, March 2017.
Yamazaki, I., S. Nooshabadi, S. Tomov, and J. Dongarra, “Structure-aware Linear Solver for Realtime Convex Optimization for Embedded Systems”, IEEE Embedded Systems Letters, vol. PP, issue 99, May 2017.
Benoit, A., A. Cavelan, V. Le Fèvre, Y. Robert, and H. Sun, “Towards Optimal Multi-Level Checkpointing”, IEEE Transactions on Computers, vol. 66, issue 7, pp. 1212–1226, July 2017.
Eberius, D., T. Patinyasakdikul, and G. Bosilca, “Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information”, EuroMPI, Chicago, IL, ACM, September 2017.
Anzt, H., J. Dongarra, G. Flegar, E. S. Quintana-Ortí, and A. E. Thomas, “Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning”, International Conference on Computational Science (ICCS 2017), vol. 108, Zurich, Switzerland, Procedia Computer Science, pp. 1783–1792, June 2017.
2016
Anzt, H., M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, and Y. Wang, “Accelerating the Conjugate Gradient Algorithm with GPU in CFD Simulations”, VECPAR, 2016.
Benoit, A., A. Cavelan, Y. Robert, and H. Sun, “Assessing General-purpose Algorithms to Cope with Fail-stop and Silent Errors”, ACM Transactions on Parallel Computing, August 2016.
Herrmann, J., G. Bosilca, T. Herault, L. Marchal, Y. Robert, and J. Dongarra, “Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results”, Parallel Computing, vol. 52, pp. 22–41, February 2016.
Anzt, H., E. Chow, T. Huckle, and J. Dongarra, “Batched Generation of Incomplete Sparse Approximate Inverses on GPUs”, Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 49–56, November 2016.
Anzt, H., E. Chow, and J. Dongarra, “On block-asynchronous execution on GPUs”, LAPACK Working Note, no. 291, November 2016.
Bosilca, G., T. Herault, and J. Dongarra, “Context Identifier Allocation in Open MPI”, University of Tennessee Computer Science Technical Report, no. ICL-UT-16-01: Innovative Computing Laboratory, University of Tennessee, January 2016.
Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures”, Lecture Notes in Computer Science, vol. 9573: Springer International Publishing, pp. 86–95, September 2015, 2016.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures”, The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, Chicago, IL, IEEE, May 2016.
Anzt, H., E. Chow, D. Szyld, and J. Dongarra, “Domain Overlap for Iterative Sparse Triangular Solves on GPUs”, Software for Exascale Computing - SPPEXA, vol. 113: Springer International Publishing, pp. 527–545, September 2016.
Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study”, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 683–691, May 2016.
Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study”, The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, IL, IEEE, May 2016.
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “Failure Detection and Propagation in HPC Systems”, Proceedings of The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Salt Lake City, Utah, IEEE Press, pp. 27:1–27:11, November 2016.
Anzt, H., , and E. S. Quintana-Ortí, “Fine-grained Bit-Flip Protection for Relaxation Methods”, Journal of Computational Science, November 2016.
Wu, W., G. Bosilca, R. vandeVaart, S. Jeaugey, and J. Dongarra, “GPU-Aware Non-contiguous Data Movement In Open MPI”, 25th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Kyoto, Japan, ACM, June 2016.
