Publications

Export 837 results:
Filters: Author is Jack Dongarra  [Clear All Filters]
2015
Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096 - 5113, Oct 12, 2015. DOI: 10.1002/cpe.3516  (1.99 MB)
Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, Experiences in autotuning matrix multiplication for energy minimization on GPUs,” Concurrency in Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, December 2015. DOI: 10.1002/cpe.3516  (1.98 MB)
Dongarra, J., T. Herault, and Y. Robert, Fault Tolerance Techniques for High-performance Computing,” University of Tennessee Computer Science Technical Report (also LAWN 289), no. UT-EECS-15-734: University of Tennessee, May 2015.
Haidar, A., A. YarKhan, C. Cao, P. Luszczek, S. Tomov, and J. Dongarra, Flexible Linear Algebra Development and Scheduling with Cholesky Factorization,” 17th IEEE International Conference on High Performance Computing and Communications, Newark, NJ, August 2015.  (494.31 KB)
Haidar, A., T. Dong, S. Tomov, P. Luszczek, and J. Dongarra, Framework for Batched and GPU-resident Factorization Algorithms to Block Householder Transformations,” ISC High Performance, Frankfurt, Germany, Springer, July 2015.  (778.26 KB)
Anzt, H., E. Ponce, G. D. Peterson, and J. Dongarra, GPU-accelerated Co-design of Induced Dimension Reduction: Algorithmic Fusion and Kernel Overlap,” 2nd International Workshop on Hardware-Software Co-Design for High Performance Computing, Austin, TX, ACM, November 2015.  (1.46 MB)
Wu, W., A. Bouteiller, G. Bosilca, M. Faverge, and J. Dongarra, Hierarchical DAG scheduling for Hybrid Distributed Systems,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.  (1.11 MB)
Dongarra, J., N. J. Higham, M. R. Dennis, P. Glendinning, P. A. Martin, F. Santosa, and J. Tanner, High-Performance Computing,” The Princeton Companion to Applied Mathematics, Princeton, New Jersey, Princeton University Press, pp. 839-842, 2015.
Dongarra, J., M. A. Heroux, and P. Luszczek, High-Performance Conjugate-Gradient Benchmark: A New Metric for Ranking High-Performance Computing Systems,” The International Journal of High Performance Computing Applications, 2015. DOI: 10.1177/1094342015593158  (336.19 KB)
Haidar, A., J. Dongarra, K. Kabir, M. Gates, P. Luszczek, S. Tomov, and Y. Jia, HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,” Scientific Programming, vol. 23, issue 1, January 2015. DOI: 10.3233/SPR-140404  (553.94 KB)
Dongarra, J., M. A. Heroux, and P. Luszczek, HPCG Benchmark: a New Metric for Ranking High Performance Computing Systems,” University of Tennessee Computer Science Technical Report , no. ut-eecs-15-736: University of Tennessee, January 2015.
Kurzak, J., H. Anzt, M. Gates, and J. Dongarra, Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs,” IEEE Transactions on Parallel and Distributed Systems, no. 1045-9219, November 2015.
Anzt, H., E. Chow, and J. Dongarra, Iterative Sparse Triangular Solves for Preconditioning,” EuroPar 2015, Vienna, Austria, Springer Berlin, August 2015. DOI: 10.1007/978-3-662-48096-0_50  (322.36 KB)
Haidar, A., S. Tomov, P. Luszczek, and J. Dongarra, MAGMA Embedded: Towards a Dense Linear Algebra Library for Energy Efficient Extreme Computing,” 2015 IEEE High Performance Extreme Computing Conference (HPEC ’15), (Best Paper Award), Waltham, MA, IEEE, September 2015.  (678.86 KB)
Anzt, H., J. Dongarra, M. Gates, A. Haidar, K. Kabir, P. Luszczek, S. Tomov, and I. Yamazaki, MAGMA MIC: Optimizing Linear Algebra for Intel Xeon Phi , Frankfurt, Germany, ISC High Performance (ISC15), Intel Booth Presentation, June 2015.  (2.03 MB)
Yamazaki, I., S. Tomov, J. Kurzak, J. Dongarra, and J. Barlow, Mixed-precision Block Gram Schmidt Orthogonalization,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Austin, TX, ACM, November 2015.  (235.69 KB)
Yamazaki, I., S. Tomov, and J. Dongarra, Mixed-Precision Cholesky QR Factorization and its Case Studies on Multicore CPU with Multiple GPUs,” SIAM Journal on Scientific Computing, vol. 37, no. 3, pp. C203-C330, May 2015. DOI: DOI:10.1137/14M0973773  (374.8 KB)
Yamazaki, I., J. Barlow, S. Tomov, J. Kurzak, and J. Dongarra, Mixed-precision orthogonalization process Performance on multicore CPUs with GPUs,” 2015 SIAM Conference on Applied Linear Algebra, Atlanta, GA, SIAM, October 2015.  (301.01 KB)
Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, Mixing LU-QR Factorization Algorithms to Design High-Performance Dense Linear Algebra Solvers,” Journal of Parallel and Distributed Computing, vol. 85, pp. 32-46, November 2015. DOI: doi:10.1016/j.jpdc.2015.06.007  (5.06 MB)
Haidar, A., T. Dong, P. Luszczek, S. Tomov, and J. Dongarra, Optimization for Performance and Energy for Batched Matrix Computations on GPUs,” 8th Workshop on General Purpose Processing Using GPUs (GPGPU 8), San Francisco, CA, ACM, February 2015. DOI: 10.1145/2716282.2716288  (699.5 KB)
Abalenkovs, M., A. Abdelfattah, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, I. Yamazaki, and A. YarKhan, Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems,” Supercomputing Frontiers and Innovations, vol. 2, no. 4, October 2015. DOI: 10.14529/jsfi1504  (3.68 MB)
Danalis, A., H. Jagode, G. Bosilca, and J. Dongarra, PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution,” 2015 IEEE International Conference on Cluster Computing, Chicago, IL, IEEE, September 2015.  (1.77 MB)
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures,” The Spring Simulation Multi-Conference 2015 (SpringSim'15), Best Paper Award, Alexandria, VA, April 2015.  (608.44 KB)
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, Performance Analysis and Optimization of Two-Sided Factorization Algorithms for Heterogeneous Platform,” International Conference on Computational Science (ICCS 2015), Reykjavík, Iceland, June 2015.  (1.12 MB)
Mary, T., I. Yamazaki, J. Kurzak, P. Luszczek, S. Tomov, and J. Dongarra, Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
Bouteiller, A., G. Bosilca, and J. Dongarra, Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery,” 22nd European MPI Users' Group Meeting, Bordeaux, France, ACM, September 2015. DOI: 10.1145/2802658.2802668  (543.32 KB)
Herault, T., A. Bouteiller, G. Bosilca, M. Gamell, K. Teranishi, M. Parashar, and J. Dongarra, Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems: Formal Proof,” Innovative Computing Laboratory Technical Report, no. ICL-UT-15-01, April 2015.  (570.97 KB)
Herault, T., A. Bouteiller, G. Bosilca, M. Gamell, K. Teranishi, M. Parashar, and J. Dongarra, Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.  (550.96 KB)
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, Randomized Algorithms to Update Partial Singular Value Decomposition on a Hybrid CPU/GPU Cluster,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
Anzt, H., E. Chow, D. Szyld, and J. Dongarra, Random-Order Alternating Schwarz for Sparse Triangular Solves,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.  (1.53 MB)
Song, F., and J. Dongarra, A Scalable Approach to Solving Dense Linear Algebra Problems on Hybrid CPU-GPU Systems,” Concurrency and Computation: Practice and Experience, vol. 27, issue 14, pp. 3702-3723, September 2015. DOI: 10.1002/cpe.3403  (8.16 MB)
Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination,” Concurrency and Computation: Practice and Experience, vol. 27, issue 5, pp. 1292-1309, April 2015. DOI: 10.1002/cpe.3306  (783.45 KB)
Strohmaier, E., H. Meuer, J. Dongarra, and H. D. Simon, The TOP500 List and Progress in High-Performance Computing,” IEEE Computer, vol. 48, issue 11, pp. 42-49, November 2015. DOI: doi:10.1109/MC.2015.338
Baboulin, M., V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, Towards a High-Performance Tensor Algebra Package for Accelerators , Gatlinburg, TN, moky Mountains Computational Sciences and Engineering Conference (SMC15), September 2015.  (1.76 MB)
Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, Towards Batched Linear Solvers on Accelerated Hardware Platforms,” 8th Workshop on General Purpose Processing Using GPUs (GPGPU 8) co-located with PPOPP 2015, San Francisco, CA, ACM, February 2015.  (403.74 KB)
Anzt, H., J. Dongarra, and E. S. Quintana-Ortí, Tuning Stationary Iterative Solvers for Fault Resilience,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA15), Austin, TX, ACM, November 2015.  (1.28 MB)
Haugen, B., S. Richmond, J. Kurzak, C. A. Steed, and J. Dongarra, Visualizing Execution Traces with Task Dependencies,” 2nd Workshop on Visual Performance Analysis (VPA '15), Austin, TX, ACM, November 2015.  (927.5 KB)
Haidar, A., Y. Jia, P. Luszczek, S. Tomov, A. YarKhan, and J. Dongarra, Weighted Dynamic Scheduling with Many Parallelism Grains for Offloading of Numerical Workloads to Multiple Varied Accelerators,” Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA'15), vol. No. 5, Austin, TX, ACM, November 2015.  (347.6 KB)
2014
Gates, M., A. Haidar, and J. Dongarra, Accelerating Eigenvector Computation in the Nonsymmetric Eigenvalue Problem,” VECPAR 2014, Eugene, OR, June 2014.  (199.44 KB)
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, Accelerating Numerical Dense Linear Algebra Calculations with GPUs,” Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014. DOI: 10.1007/978-3-319-06548-9_1  (1.06 MB)
Anzt, H., S. Tomov, and J. Dongarra, Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-731: University of Tennessee, October 2014.  (1.83 MB)
Yamazaki, I., T. Mary, J. Kurzak, S. Tomov, and J. Dongarra, Access-averse Framework for Computing Low-rank Matrix Approximations,” First International Workshop on High Performance Big Graph Data Management, Analysis, and Mining, Washington, DC, October 2014.
Dongarra, J., M. Faverge, H. Ltaeif, and P. Luszczek, Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,” Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014. DOI: 10.1002/cpe.3110  (1.96 MB)
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, Assessing the Impact of ABFT and Checkpoint Composite Strategies,” 16th Workshop on Advances in Parallel and Distributed Computational Models, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (1.02 MB)
Cao, C., J. Dongarra, P. Du, M. Gates, P. Luszczek, and S. Tomov, clMAGMA: High Performance Dense Linear Algebra with OpenCL ,” International Workshop on OpenCL, Bristol University, England, May 2014.  (460.91 KB)
Ballard, G., D. Becker, J. Demmel, J. Dongarra, A. Druinsky, I. Peled, O. Schwartz, S. Toledo, and I. Yamazaki, Communication-Avoiding Symmetric-Indefinite Factorization,” SIAM Journal on Matrix Analysis and Application, vol. 35, issue 4, pp. 1364-1406, July 2014.  (593.18 KB)
Baboulin, M., J. Dongarra, and R. Lacroix, Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems,” International Interdisciplinary Conference on Applied Mathematics, Modeling and Computational Science (AMMCS), Waterloo, Ontario, CA, August 2014.  (130.18 KB)
Yamazaki, I., S. Tomov, and J. Dongarra, Deflation Strategies to Improve the Convergence of Communication-Avoiding GMRES,” 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, New Orleans, LA, November 2014.  (465.52 KB)
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime,” Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (398.16 KB)
Cao, C., T. Herault, G. Bosilca, and J. Dongarra, Design for a Soft Error Resilient Dynamic Task-based Runtime,” ICL Technical Report, no. ICL-UT-14-04: University of Tennessee, November 2014.  (2.61 MB)

Pages