Publications

Export 785 results:
Filters: Author is Jack Dongarra  [Clear All Filters]
Conference Paper
Wu, W., G. Bosilca, R. vandeVaart, S. Jeaugey, and J. Dongarra, GPU-Aware Non-contiguous Data Movement In Open MPI,” 25th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Kyoto, Japan, ACM, June 2016.  (482.32 KB)
Haidar, A., S. Tomov, J. Dongarra, and N. J. Higham, Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers,” The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), Dallas, TX, IEEE, November 2018.
Jia, Y., P. Luszczek, and J. Dongarra, Hessenberg Reduction with Transient Error Resilience on GPU-Based Hybrid Architectures,” 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Chicago, IL, IEEE, May 2016.  (535.72 KB)
Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, Heterogeneous Acceleration for Linear Algebra in Mulit-Coprocessor Environments,” VECPAR 2014, Eugene, OR, June 2014.  (276.52 KB)
Newburn, C. J., G. Bansal, M. Wood, L. Crivelli, J. Planas, A. Duran, P. Souza, L. Borges, P. Luszczek, S. Tomov, et al., Heterogeneous Streaming,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2016, Chicago, IL, IEEE, May 2016.  (2.73 MB)
Wu, W., A. Bouteiller, G. Bosilca, M. Faverge, and J. Dongarra, Hierarchical DAG scheduling for Hybrid Distributed Systems,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.  (1.11 MB)
Haidar, A., A. Abdelfattah, S. Tomov, and J. Dongarra, High-performance Cholesky Factorization for GPU-only Execution,” Proceedings of the General Purpose GPUs (GPGPU-10), Austin, TX, ACM, February 2017.  (872.18 KB)
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, J. Falcou, and J. Dongarra, High-performance Matrix-matrix Multiplications of Very Small Matrices,” 22nd International European Conference on Parallel and Distributed Computing (Euro-Par'16), Grenoble, France, Springer International Publishing, August 2016.
Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., High-Performance Tensor Contractions for GPUs,” International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.  (2.36 MB)
Lukarski, D., H. Anzt, S. Tomov, and J. Dongarra, Hybrid Multi-Elimination ILU Preconditioners on GPUs,” International Heterogeneity in Computing Workshop (HCW), IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (1.67 MB)
Haidar, A., P. Luszczek, J. Kurzak, and J. Dongarra, An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,” Supercomputing 2013, Denver, CO, November 2013.
Yamazaki, I., H. Anzt, S. Tomov, M. Hoemmen, and J. Dongarra, Improving the performance of CA-GMRES on multicores with multiple GPUs,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (333.82 KB)
Haidar, A., P. Wu, S. Tomov, and J. Dongarra, Investigating Half Precision Arithmetic to Accelerate Dense Linear System Solvers,” ScalA17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, ACM, 11/2017.  (766.35 KB)
Anzt, H., E. Chow, and J. Dongarra, Iterative Sparse Triangular Solves for Preconditioning,” EuroPar 2015, Vienna, Austria, Springer Berlin, August 2015.  (322.36 KB)
Anzt, H., and J. Dongarra, A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs,” SBAC-PAD, 2018.  (237.68 KB)
Dong, T., A. Haidar, P. Luszczek, J. Harris, S. Tomov, and J. Dongarra, LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU,” 16th IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.  (684.73 KB)
Haidar, A., S. Tomov, K. Arturov, M. Guney, S. Story, and J. Dongarra, LU, QR, and Cholesky Factorizations: Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi,” IEEE High Performance Extreme Computing Conference (HPEC'16), Waltham, MA, IEEE, September 2016.  (943.23 KB)
Haidar, A., S. Tomov, P. Luszczek, and J. Dongarra, MAGMA Embedded: Towards a Dense Linear Algebra Library for Energy Efficient Extreme Computing,” 2015 IEEE High Performance Extreme Computing Conference (HPEC ’15), (Best Paper Award), Waltham, MA, IEEE, September 2015.  (678.86 KB)
Bai, Z., J. Dongarra, D. Lu, and I. Yamazaki, Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,” International Parallel and Distributed Processing Symposium (IPDPS), May 2019.
Marin, G., J. Dongarra, and D. Terpstra, MIAMI: A Framework for Application Performance Diagnosis ,” IPASS-2014, Monterey, CA, IEEE, March 2014.  (1010.75 KB)
Yamazaki, I., S. Tomov, J. Kurzak, J. Dongarra, and J. Barlow, Mixed-precision Block Gram Schmidt Orthogonalization,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Austin, TX, ACM, November 2015.  (235.69 KB)
Yamazaki, I., J. Barlow, S. Tomov, J. Kurzak, and J. Dongarra, Mixed-precision orthogonalization process Performance on multicore CPUs with GPUs,” 2015 SIAM Conference on Applied Linear Algebra, Atlanta, GA, SIAM, October 2015.  (301.01 KB)
Yamazaki, I., S. Tomov, T. Dong, and J. Dongarra, Mixed-precision orthogonalization scheme and adaptive step size for CA-GMRES on GPUs,” VECPAR 2014 (Best Paper), Eugene, OR, June 2014.  (438.54 KB)
Bouteiller, A., F. Cappello, J. Dongarra, A. Guermouche, T. Herault, and Y. Robert, Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization,” Euro-Par 2013, Aachen, Germany, Springer, August 2013.  (431.84 KB)
Haidar, A., P. Luszczek, and J. Dongarra, New Algorithm for Computing Eigenvectors of the Symmetric Eigenvalue Problem,” Workshop on Parallel and Distributed Scientific and Engineering Computing, IPDPS 2014 (Best Paper), Phoenix, AZ, IEEE, May 2014.  (2.33 MB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Novel HPC Techniques to Batch Execution of Many Variable Size BLAS Computations on GPUs,” International Conference on Supercomputing (ICS '17), Chicago, Illinois, ACM, June 2017.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Best Paper Award, Vancouver, BC, Canada, IEEE, May 2018.  (899.3 KB)
Haidar, A., T. Dong, P. Luszczek, S. Tomov, and J. Dongarra, Optimization for Performance and Energy for Batched Matrix Computations on GPUs,” 8th Workshop on General Purpose Processing Using GPUs (GPGPU 8), San Francisco, CA, ACM, February 2015.  (699.5 KB)
Dongarra, J., S. Hammarling, N. Higham, S. Relton, and M. Zounon, Optimized Batched Linear Algebra for Modern Architectures,” Euro-Par 2017, Santiago de Compostela, Spain, Springer, August 2017.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization,” IEEE High Performance Extreme Computing Conference (HPEC’18), Waltham, MA, IEEE, September 2018.  (729.87 KB)
Tomov, S., P. Luszczek, I. Yamazaki, J. Dongarra, H. Anzt, and W. Sawyer, Optimizing Krylov Subspace Solvers on Graphics Processing Units,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (536.32 KB)
Dong, T., A. Haidar, S. Tomov, and J. Dongarra, Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices,” International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.  (364.95 KB)
Haidar, A., K. Kabir, D. Fayad, S. Tomov, and J. Dongarra, Out of Memory SVD Solver for Big Data,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Waltham, MA, IEEE, September 2017.  (1.33 MB)
Jia, Y., G. Bosilca, P. Luszczek, and J. Dongarra, Parallel Reduction to Hessenberg Form with Algorithm-Based Fault Tolerance,” International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE-SC 2013, Denver, CO, November 2013.  (147.09 KB)
Danalis, A., H. Jagode, G. Bosilca, and J. Dongarra, PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution,” 2015 IEEE International Conference on Cluster Computing, Chicago, IL, IEEE, September 2015.  (1.77 MB)
Haidar, A., B. Brock, S. Tomov, M. Guidry, J. Jay Billings, D. Shyles, and J. Dongarra, Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations,” 2016 IEEE High Performance Extreme Computing Conference (HPEC ‘16), Waltham, MA, IEEE, September 2016.  (480.29 KB)
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures,” The Spring Simulation Multi-Conference 2015 (SpringSim'15), Best Paper Award, Alexandria, VA, April 2015.  (608.44 KB)
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, Performance Analysis and Optimization of Two-Sided Factorization Algorithms for Heterogeneous Platform,” International Conference on Computational Science (ICCS 2015), Reykjavík, Iceland, June 2015.  (1.12 MB)
Haidar, A., C. Cao, I. Yamazaki, J. Dongarra, M. Gates, P. Luszczek, and S. Tomov, Performance and Portability with OpenCL for Throughput-Oriented HPC Workloads Across Accelerators, Coprocessors, and Multicore Processors,” 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '14), New Orleans, LA, IEEE, November 2014.  (407.5 KB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Performance, Design, and Autotuning of Batched GEMM for GPUs,” The International Supercomputing Conference (ISC High Performance 2016), Frankfurt, Germany, June 2016.  (1.27 MB)
Mary, T., I. Yamazaki, J. Kurzak, P. Luszczek, S. Tomov, and J. Dongarra, Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs,” International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.  (626.21 KB)
Bouteiller, A., G. Bosilca, and J. Dongarra, Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery,” 22nd European MPI Users' Group Meeting, Bordeaux, France, ACM, September 2015.  (543.32 KB)
Dongarra, J., M. Gates, A. Haidar, Y. Jia, K. Kabir, P. Luszczek, and S. Tomov, Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,” PPAM 2013, Warsaw, Poland, September 2013.  (284.97 KB)
McCraw, H., J. Ralph, A. Danalis, and J. Dongarra, Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-04, Madrid, Spain, IEEE, September 2014.  (3.45 MB)
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017.  (908.84 KB)
Herault, T., A. Bouteiller, G. Bosilca, M. Gamell, K. Teranishi, M. Parashar, and J. Dongarra, Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.  (550.96 KB)
Danalis, A., G. Bosilca, A. Bouteiller, T. Herault, and J. Dongarra, PTG: An Abstraction for Unhindered Parallelism,” International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), New Orleans, LA, IEEE Press, November 2014.  (480.05 KB)
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, Randomized Algorithms to Update Partial Singular Value Decomposition on a Hybrid CPU/GPU Cluster,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
Anzt, H., E. Chow, D. Szyld, and J. Dongarra, Random-Order Alternating Schwarz for Sparse Triangular Solves,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.  (1.53 MB)

Pages