Publications

Search

Show only items where

Author

Type

Term

Year

Keyword

Export 1275 results:

2013

Haidar, A., S. Tomov, J. Dongarra, R. Solcà, and T. C. Schulthess, “Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations,” International Supercomputing Conference (ISC), Lecture Notes in Computer Science, vol. 7905, Leipzig, Germany, Springer Berlin Heidelberg, pp. 67-80, June 2013.

(2.14 MB)

Gustavson, F. G., J. Wasniewski, J. Dongarra, J. Herrero, and J. Langou, “Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms,” ACM Transactions on Mathematical Software (TOMS), vol. 39, issue 2, February 2013.

(439.46 KB)

Kurzak, J., P. Luszczek, and J. Dongarra, “LU Factorization with Partial Pivoting for a Multicore System with Accelerators,” IEEE Transactions on Parallel and Distributed Computing, vol. 24, issue 8, pp. 1613-1621, August 2013.

(1.08 MB)

Bouteiller, A., F. Cappello, J. Dongarra, A. Guermouche, T. Herault, and Y. Robert, “Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization,” University of Tennessee Computer Science Technical Report, no. ICL-UT-13-01, February 2013.

(497.64 KB)

Bouteiller, A., F. Cappello, J. Dongarra, A. Guermouche, T. Herault, and Y. Robert, “Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization,” Euro-Par 2013, Aachen, Germany, Springer, August 2013.

(431.84 KB)

Kurzak, J., P. Luszczek, A. YarKhan, M. Faverge, J. Langou, H. Bouwmeester, and J. Dongarra, “Multithreading in the PLASMA Library,” Multi and Many-Core Processing: Architecture, Programming, Algorithms, & Applications: Taylor & Francis, 00 2013.

(536.28 KB)

Weaver, V., D. Terpstra, and S. Moore, “Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations,” 2013 IEEE International Symposium on Performance Analysis of Systems and Software, Austin, TX, IEEE, April 2013.

(307.24 KB)

Aupy, G., A. Benoit, T. Herault, Y. Robert, and J. Dongarra, “Optimal Checkpointing Period: Time vs. Energy,” University of Tennessee Computer Science Technical Report (also LAWN 281), no. ut-eecs-13-718: University of Tennessee, October 2013.

(440.13 KB)

Weaver, V., D. Terpstra, H. McCraw, M. Johnson, K. Kasichayanula, J. Ralph, J. Nelson, P. Mucci, T. Mohan, and S. Moore, PAPI 5: Measuring Power, Energy, and the Cloud , Austin, TX, 2013 IEEE International Symposium on Performance Analysis of Systems and Software, April 2013.

(78.39 KB)

Jia, Y., G. Bosilca, P. Luszczek, and J. Dongarra, “Parallel Reduction to Hessenberg Form with Algorithm-Based Fault Tolerance,” International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE-SC 2013, Denver, CO, November 2013.

(147.09 KB)

Wang, Y., M. Baboulin, J. Falcou, Y. Fraigneau, and O. Le Maître, “A Parallel Solver for Incompressible Fluid Flows,” International Conference on Computational Science (ICCS 2013), Barcelona, Spain, Elsevier B.V., June 2013.

(588.79 KB)

Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, T. Herault, and J. Dongarra, “PaRSEC: Exploiting Heterogeneity to Enhance Scalability,” IEEE Computing in Science and Engineering, vol. 15, issue 6, pp. 36-45, November 2013.

(2.16 MB)

Dongarra, J., “Performance of Various Computers Using Standard Linear Equations Software,” University of Tennessee Computer Science Technical Report, no. cs-89-85, February 2013.

(539.24 KB)

Dongarra, J., M. Gates, A. Haidar, Y. Jia, K. Kabir, P. Luszczek, and S. Tomov, “Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,” PPAM 2013, Warsaw, Poland, September 2013.

(284.97 KB)

Bland, W., A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Post-failure recovery of MPI communication capability: Design and rationale,” International Journal of High Performance Computing Applications, vol. 27, issue 3, pp. 244 - 254, January 2013.

(285.77 KB)

Dongarra, J., T. Herault, and Y. Robert, “Revisiting the Double Checkpointing Algorithm,” 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium, Boston, MA, May 2013.

(591.1 KB)

Dongarra, J., T. Herault, and Y. Robert, “Revisiting the Double Checkpointing Algorithm,” University of Tennessee Computer Science Technical Report (LAWN 274), no. ut-cs-13-705, January 2013.

(682.22 KB)

Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, J. Kurzak, P. Luszczek, S. Tomov, and J. Dongarra, “Scalable Dense Linear Algebra on Heterogeneous Hardware,” HPC: Transition Towards Exascale Processing, in the series Advances in Parallel Computing, 2013.

(760.32 KB)

Du, P., P. Luszczek, S. Tomov, and J. Dongarra, “Soft Error Resilient QR Factorization for Hybrid System with GPGPU,” Journal of Computational Science, vol. 4, issue 6, pp. 457–464, November 2013.

(995.45 KB)

Mattson, T., D. Bader, J. Berry, A. Buluc, J. Dongarra, C. Faloutsos, J. Feo, J. Gilbert, J. Gonzalez, B. Hendrickson, et al., “Standards for Graph Algorithm Primitives,” 17th IEEE High Performance Extreme Computing Conference (HPEC '13), Waltham, MA, IEEE, September 2013.

(108.86 KB)

Heroux, M. A., and J. Dongarra, “Toward a New Metric for Ranking High Performance Computing Systems,” SAND2013 - 4744, June 2013.

(225.32 KB)

Haidar, A., M. Gates, S. Tomov, and J. Dongarra, “Toward a scalable multi-GPU eigensolver via compute-intensive kernels and efficient communication,” Proceedings of the 27th ACM International Conference on Supercomputing (ICS '13), Eugene, Oregon, USA, ACM Press, June 2013.

(1.27 MB)

Jia, Y., P. Luszczek, and J. Dongarra, “Transient Error Resilient Hessenberg Reduction on GPU-based Hybrid Architectures,” UT-CS-13-712: University of Tennessee Computer Science Technical Report, June 2013.

(206.42 KB)

Yamazaki, I., T. Dong, R. Solcà, S. Tomov, J. Dongarra, and T. C. Schulthess, “Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems,” Concurrency and Computation: Practice and Experience, October 2013.

(1.71 MB)

Yamazaki, I., T. Dong, S. Tomov, and J. Dongarra, “Tridiagonalization of a Symmetric Dense Matrix on a GPU Cluster,” The Third International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), May 2013.

Bosilca, G., A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra, A. Guermouche, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, “Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,” Concurrency and Computation: Practice and Experience, November 2013.

(894.61 KB)

Kurzak, J., P. Luszczek, M. Gates, I. Yamazaki, and J. Dongarra, “Virtual Systolic Array for QR Decomposition,” 15th Workshop on Advances in Parallel and Distributed Computational Models, IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), Boston, MA, IEEE, May 2013.

(749.84 KB)

2014

Gates, M., A. Haidar, and J. Dongarra, “Accelerating Eigenvector Computation in the Nonsymmetric Eigenvalue Problem,” VECPAR 2014, Eugene, OR, June 2014.

(199.44 KB)

Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Accelerating Numerical Dense Linear Algebra Calculations with GPUs,” Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014.

(1.06 MB)

Anzt, H., S. Tomov, and J. Dongarra, “Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-731: University of Tennessee, October 2014.

(1.83 MB)

Yamazaki, I., T. Mary, J. Kurzak, S. Tomov, and J. Dongarra, “Access-averse Framework for Computing Low-rank Matrix Approximations,” First International Workshop on High Performance Big Graph Data Management, Analysis, and Mining, Washington, DC, October 2014.

Dongarra, J., M. Faverge, H. Ltaeif, and P. Luszczek, “Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,” Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014.

(1.96 MB)

Nelson, J., “Analyzing PAPI Performance on Virtual Machines,” VMWare Technical Journal, vol. Winter 2013, January 2014.

Genet, D., A. Guermouche, and G. Bosilca, “Assembly Operations for Multicore Architectures using Task-Based Runtime Systems,” Euro-Par 2014, Porto, Portugal, Springer International Publishing, August 2014.

(481.52 KB)

Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, “Assessing the Impact of ABFT and Checkpoint Composite Strategies,” 16th Workshop on Advances in Parallel and Distributed Computational Models, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

(1.02 MB)

Cao, C., J. Dongarra, P. Du, M. Gates, P. Luszczek, and S. Tomov, “clMAGMA: High Performance Dense Linear Algebra with OpenCL ,” International Workshop on OpenCL, Bristol University, England, May 2014.

(460.91 KB)

Ballard, G., D. Becker, J. Demmel, J. Dongarra, A. Druinsky, I. Peled, O. Schwartz, S. Toledo, and I. Yamazaki, “Communication-Avoiding Symmetric-Indefinite Factorization,” SIAM Journal on Matrix Analysis and Application, vol. 35, issue 4, pp. 1364-1406, July 2014.

(593.18 KB)

Baboulin, M., J. Dongarra, and R. Lacroix, “Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems,” International Interdisciplinary Conference on Applied Mathematics, Modeling and Computational Science (AMMCS), Waterloo, Ontario, CA, August 2014.

(130.18 KB)

Yamazaki, I., S. Tomov, and J. Dongarra, “Deflation Strategies to Improve the Convergence of Communication-Avoiding GMRES,” 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, New Orleans, LA, November 2014.

(465.52 KB)

Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, “Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime,” Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

(398.16 KB)

Cao, C., T. Herault, G. Bosilca, and J. Dongarra, “Design for a Soft Error Resilient Dynamic Task-based Runtime,” ICL Technical Report, no. ICL-UT-14-04: University of Tennessee, November 2014.

(2.61 MB)

Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, “Designing LU-QR Hybrid Solvers for Performance and Stability,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

(4.2 MB)

Yamazaki, I., S. Rajamanickam, E. G. Boman, M. Hoemmen, M. A. Heroux, and S. Tomov, “Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 14), New Orleans, LA, IEEE, November 2014.

Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, May 2014.

(490.08 KB)

Benoit, A., Y. Robert, and S. K. Raina, “Efficient checkpoint/verification patterns for silent error detection,” Innovative Computing Laboratory Technical Report, no. ICL-UT-14-03: University of Tennessee, May 2014.

(397.75 KB)

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,” Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.

(1.42 MB)

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “A Fast Batched Cholesky Factorization on a GPU,” International Conference on Parallel Processing (ICPP-2014), Minneapolis, MN, September 2014.

(1.37 MB)

Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, “Heterogeneous Acceleration for Linear Algebra in Mulit-Coprocessor Environments,” VECPAR 2014, Eugene, OR, June 2014.

(276.52 KB)

Lukarski, D., H. Anzt, S. Tomov, and J. Dongarra, “Hybrid Multi-Elimination ILU Preconditioners on GPUs,” International Heterogeneity in Computing Workshop (HCW), IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

(1.67 MB)

Anzt, H., S. Tomov, and J. Dongarra, “Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-727: University of Tennessee, April 2014.

(578.11 KB)

Pages