Publications

Filters: Author is Jack Dongarra
2012
Ma, T., G. Bosilca, A. Bouteiller, and J. Dongarra, “HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters,” IPDPS 2012 (Best Paper), Shanghai, China, May 2012.
Dongarra, J., and A. J. van der Steen, “High Performance Computing Systems: Status and Outlook,” Acta Numerica, vol. 21, Cambridge, UK, Cambridge University Press, pp. 379-474, May 2012.
Du, P., P. Luszczek, and J. Dongarra, “High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors,” ICCS 2012, Omaha, NE, June 2012.
Dongarra, J., and P. Luszczek, “HPC Challenge: Design, History, and Implementation Highlights,” On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing (to appear): Chapman & Hall/CRC Press, 2012.
Kurzak, J., R. Nath, P. Du, and J. Dongarra, “An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs,” Applied Parallel and Scientific Computing, vol. 7133, pp. 248-257, 2012.
Luszczek, P., J. Kurzak, and J. Dongarra, “Looking Back at Dense Linear Algebra Software,” Perspectives on Parallel and Distributed Processing: Looking Back and What's Ahead (to appear), 2012.
Tomov, S., J. Dongarra, A. Haidar, I. Yamazaki, T. Dong, T. Schulthess, and R. Solcà, MAGMA: A Breakthrough in Solvers for Eigenvalue Problems, San Jose, CA, GPU Technology Conference (GTC12), Presentation, May 2012.
Dongarra, J., T. Dong, M. Gates, A. Haidar, S. Tomov, and I. Yamazaki, MAGMA: A New Generation of Linear Algebra Library for GPU and Multicore Architectures, Salt Lake City, UT, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), Presentation, November 2012.
Dongarra, J., M. Gates, Y. Jia, K. Kabir, P. Luszczek, and S. Tomov, MAGMA MIC: Linear Algebra Library for Intel Xeon Phi Coprocessors, Salt Lake City, UT, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), November 2012.
Agullo, E., G. Bosilca, C. Castagnède, J. Dongarra, H. Ltaief, and S. Tomov, “Matrices Over Runtime Systems at Exascale,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.
Solcà, R., A. Haidar, S. Tomov, J. Dongarra, and T. C. Schulthess, “A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.
Yamazaki, I., S. Tomov, and J. Dongarra, “One-Sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators,” The International Conference on Computational Science (ICCS), June 2012.
Abdelfattah, A., J. Dongarra, D. Keyes, and H. Ltaief, “Optimizing Memory-Bound Numerical Kernels on GPU Hardware Accelerators,” VECPAR 2012, Kobe, Japan, July 2012.
“Parallel Processing and Applied Mathematics, 9th International Conference, PPAM 2011,” Lecture Notes in Computer Science, vol. 7203, Torun, Poland, 2012.
Baboulin, M., D. Becker, and J. Dongarra, “A Parallel Tiled Solver for Symmetric Indefinite Systems On Multicore Architectures,” IPDPS 2012, Shanghai, China, May 2012.
Donfack, S., S. Tomov, and J. Dongarra, “Performance evaluation of LU factorization through hardware counter measurements,” University of Tennessee Computer Science Technical Report, no. UT-CS-12-700, October 2012.
Bosilca, G., J. Dongarra, and H. Ltaief, “Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems,” Third International Conference on Energy-Aware High Performance Computing, Hamburg, Germany, September 2012.
Kurzak, J., P. Luszczek, S. Tomov, and J. Dongarra, “Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler Architecture,” LAWN 267, 2012.
Kurzak, J., P. Luszczek, M. Faverge, and J. Dongarra, “Programming the LU Factorization for a Multicore System with Accelerators,” Proceedings of VECPAR’12, Kobe, Japan, April 2012.
Bland, W., G. Bosilca, A. Bouteiller, T. Herault, and J. Dongarra, “A Proposal for User-Level Failure Mitigation in the MPI-3 Standard,” University of Tennessee Electrical Engineering and Computer Science Technical Report, no. UT-CS-12-693: University of Tennessee, February 2012.
Du, P., S. Tomov, and J. Dongarra, “Providing GPU Capability to LU and QR within the ScaLAPACK Framework,” University of Tennessee Computer Science Technical Report (also LAWN 272), no. UT-CS-12-699, September 2012.
“Recent Advances in the Message Passing Interface: 19th European MPI Users' Group Meeting, EuroMPI 2012,” Lecture Notes in Computer Science, vol. 7490, Vienna, Austria, 2012.
Becker, D., M. Baboulin, and J. Dongarra, “Reducing the Amount of Pivoting in Symmetric Indefinite Systems,” Parallel Processing and Applied Mathematics, Lecture Notes in Computer Science (PPAM 2011), vol. 7203: Springer-Verlag Berlin Heidelberg, pp. 133-142, 2012.
Song, F., and J. Dongarra, “A Scalable Framework for Heterogeneous GPU-Based Clusters,” The 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2012), Pittsburgh, PA, USA, ACM, June 2012.
Haidar, A., H. Ltaief, and J. Dongarra, “Toward High Performance Divide and Conquer Eigensolver for Dense Symmetric Matrices,” SIAM Journal on Scientific Computing (Accepted), July 2012.
Bosilca, G., A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra, A. Guermouche, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, “Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,” University of Tennessee Computer Science Technical Report (also LAWN 269), no. UT-CS-12-697, June 2012.
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems,” Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (Best Paper), Rhodes Island, Greece, August 2012.
Anzt, H., J. Dongarra, and V. Heuveline, “Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems,” SIAM Journal on Computing (submitted), March 2012.
2011
Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, “Accelerating Linear System Solutions Using Randomization Techniques,” INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 2011.
Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization,” University of Tennessee Computer Science Technical Report (also as a LAWN), no. ICL-UT-11-08, September 2011.
Du, P., A. Bouteiller, G. Bosilca, T. Herault, and J. Dongarra, “Algorithm-based Fault Tolerance for Dense Matrix Factorizations,” University of Tennessee Computer Science Technical Report, no. UT-CS-11-676, Knoxville, TN, August 2011.
Haidar, A., H. Ltaief, A. YarKhan, and J. Dongarra, “Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-666, (also LAWN 243), 2011.
Kurzak, J., S. Tomov, and J. Dongarra, “Autotuning GEMMs for Fermi,” University of Tennessee Computer Science Technical Report, UT-CS-11-671, (also LAWN 245), April 2011.
Danalis, A., P. Luszczek, G. Marin, J. Vetter, and J. Dongarra, “BlackjackBench: Hardware Characterization with Portable Micro-Benchmarks and Automatic Statistical Analysis of Results,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems, no. UT-CS-11-689, December 2011.
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “A Block-Asynchronous Relaxation Method for Graphics Processing Units,” University of Tennessee Computer Science Technical Report, no. UT-CS-11-687 / LAWN 258, November 2011.
Luszczek, P., J. Kurzak, and J. Dongarra, “Changes in Dense Linear Algebra Kernels - Decades Long Perspective,” in Solving the Schrödinger Equation: Has everything been tried? (to appear): Imperial College Press, 2011.
Horton, M., S. Tomov, and J. Dongarra, “A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures,” Symposium for Application Accelerators in High Performance Computing (SAAHPC'11), Knoxville, TN, July 2011.
Bouteiller, A., T. Herault, G. Bosilca, and J. Dongarra, “Correlated Set Coordination in Fault Tolerant Message Logging Protocols,” Proceedings of 17th International Conference, Euro-Par 2011, Part II, vol. 6853, Bordeaux, France, Springer, pp. 51-64, August 2011.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, and J. Dongarra, “DAGuE: A Generic Distributed DAG Engine for High Performance Computing,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1151-1158, 2011.
Song, F., S. Tomov, and J. Dongarra, “Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-668, (also LAWN 250), June 2011.
Luszczek, P., E. Meek, S. Moore, D. Terpstra, V. M. Weaver, and J. Dongarra, “Evaluation of the HPC Challenge Benchmarks in Virtualized Environments,” 6th Workshop on Virtualization in High-Performance Cloud Computing, Bordeaux, France, August 2011.
Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “Exploiting Fine-Grain Parallelism in Recursive LU Factorization,” Proceedings of PARCO'11, no. ICL-UT-11-04, Gent, Belgium, April 2011.
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemarinier, H. Ltaief, et al., “Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1432-1441, May 2011.
Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” University of Tennessee Computer Science Technical Report UT-CS-11-690 (also LAWN 260), December 2011.
Dongarra, J., M. Faverge, T. Herault, J. Langou, and Y. Robert, “Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,” University of Tennessee Computer Science Technical Report (also LAWN 257), no. UT-CS-11-684, October 2011.
Ltaief, H., P. Luszczek, and J. Dongarra, “High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-673, (also LAWN 247), May 2011.
Du, P., P. Luszczek, and J. Dongarra, “High Performance Dense Linear System Solver with Soft Error Resilience,” IEEE Cluster 2011, Austin, TX, September 2011.
Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “High Performance Matrix Inversion Based on LU Factorization for Multicore Architectures,” Proceedings of MTAGS11, Seattle, WA, November 2011.
