Papers

2014
Gates, M., A. Haidar, and J. Dongarra, "Accelerating Eigenvector Computation in the Nonsymmetric Eigenvalue Problem", VECPAR 2014, Eugene, OR, 06/2014.  (199.44 KB)
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, "Accelerating Numerical Dense Linear Algebra Calculations with GPUs", Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014.  (1.06 MB)
Nelson, J., "Analyzing PAPI Performance on Virtual Machines", VMWare Technical Journal, vol. Winter 2013, 01/2014.
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, "Assessing the Impact of ABFT and Checkpoint Composite Strategies", 16th Workshop on Advances in Parallel and Distributed Computational Models, IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (1.02 MB)
Ballard, G., D. Becker, J. Demmel, J. Dongarra, A. Druinsky, I. Peled, O. Schwartz, S. Toledo, and I. Yamazaki, "Communication-Avoiding Symmetric-Indefinite Factorization", SIAM Journal on Matrix Analysis and Applications (to appear), 07/2014.  (593.18 KB)
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, "Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime", Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (398.16 KB)
Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, "Designing LU-QR Hybrid Solvers for Performance and Stability", IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (4.2 MB)
Benoit, A., Y. Robert, and S. K. Raina, "Efficient checkpoint/verification patterns for silent error detection", Innovative Computing Laboratory Technical Report, no. ICL-UT-14-03: University of Tennessee, 05/2014.  (397.75 KB)
Dong, T., A. Haidar, S. Tomov, and J. Dongarra, "A Fast Batched Cholesky Factorization on a GPU", International Conference on Parallel Processing (ICPP-2014), Minneapolis, MN, 09/2014.  (1.37 MB)
Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, "Heterogenous Acceleration for Linear Algebra in Multi-Coprocessor Environments", VECPAR 2014, Eugene, OR, 06/2014.  (276.52 KB)
Lukarski, D., H. Anzt, S. Tomov, and J. Dongarra, "Hybrid Multi-Elimination ILU Preconditioners on GPUs", International Heterogeneity in Computing Workshop (HCW), IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (1.67 MB)
Yamazaki, I., H. Anzt, S. Tomov, M. Hoemmen, and J. Dongarra, "Improving the performance of CA-GMRES on multicores with multiple GPUs", IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (333.82 KB)
Luszczek, P., J. Kurzak, and J. Dongarra, "Looking Back at Dense Linear Algebra Software", Journal of Parallel and Distributed Computing, vol. 74, issue 7, pp. 2548–2560, 07/2014.  (1.79 MB)
Marin, G., J. Dongarra, and D. Terpstra, "MIAMI: A Framework for Application Performance Diagnosis", IPASS-2014, Monterey, CA, IEEE, 03/2014.  (1010.75 KB)
Yamazaki, I., S. Tomov, T. Dong, and J. Dongarra, "Mixed-precision orthogonalization scheme and its case studies with CA-GMRES on a GPU", VECPAR 2014, Eugene, OR, 06/2014.  (438.54 KB)
Haidar, A., P. Luszczek, and J. Dongarra, "New Algorithm for Computing Eigenvectors of the Symmetric Eigenvalue Problem", Workshop on Parallel and Distributed Scientific and Engineering Computing, IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (925.66 KB)
Haidar, A., R. Solcà, M. Gates, S. Tomov, T. C. Schulthess, and J. Dongarra, "A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks", International Journal of High Performance Computing Applications, vol. 28, issue 2, pp. 196-209, 05/2014.  (1.74 MB)
Marin, G., "Performance Analysis of the MPAS-Ocean Code using HPCToolkit and MIAMI", ICL Technical Report, no. ICL-UT-14-01: University of Tennessee, 02/2014.  (894.39 KB)
Dongarra, J., T. Herault, and Y. Robert, "Performance and Reliability Trade-offs for the Double Checkpointing Algorithm", International Journal of Networking and Computing, vol. 4, no. 1, pp. 32-41, 2014.  (859.04 KB)
McCraw, H., J. Ralph, A. Danalis, and J. Dongarra, "Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models", Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications, IEEE Cluster 2014, no. ICL-UT-14-04, Madrid, Spain, IEEE, 09/2014.  (34.67 KB)
Anzt, H., D. Lukarski, S. Tomov, and J. Dongarra, "Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures", VECPAR 2014, Eugene, OR, 06/2014.  (430.56 KB)
Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, "A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU", IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (1.01 MB)
Lacoste, X., M. Faverge, P. Ramet, S. Thibault, and G. Bosilca, "Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes", 23rd International Heterogeneity in Computing Workshop, IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (807.33 KB)
Haidar, A., C. Cao, J. Dongarra, P. Luszczek, and S. Tomov, "Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment", IPDPS 2014, Phoenix, AZ, IEEE, 05/2014.  (1.51 MB)
McCraw, H., A. Danalis, G. Bosilca, J. Dongarra, K. Kowalski, and T. Windus, "Utilizing Dataflow-based Execution for Coupled Cluster Methods", IEEE Cluster 2014, no. ICL-UT-14-02, Madrid, Spain, IEEE, 09/2014.  (34.67 KB)
2013
Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, "Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting", Concurrency and Computation: Practice and Experience, 09/2013.  (1.96 MB)
Nelson, J., "Analyzing PAPI Performance on Virtual Machines", ICL Technical Report, no. ICL-UT-13-02, 08/2013.  (437.37 KB)
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, "Assessing the impact of ABFT and Checkpoint composite strategies", University of Tennessee Computer Science Technical Report, no. ICL-UT-13-03, 2013.  (968.47 KB)
Danalis, A., P. Luszczek, G. Marin, J. Vetter, and J. Dongarra, "BlackjackBench: Portable Hardware Characterization with Automated Results Analysis", The Computer Journal, 03/2013.  (408.45 KB)
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, "A Block-Asynchronous Relaxation Method for Graphics Processing Units", Journal of Parallel and Distributed Computing, vol. 73, issue 12, pp. 1613–1626, 12/2013.  (1.08 MB)
Cao, C., J. Dongarra, P. Du, M. Gates, P. Luszczek, and S. Tomov, "clMAGMA: High Performance Dense Linear Algebra with OpenCL", University of Tennessee Technical Report (Lawn 275), no. UT-CS-13-706: University of Tennessee, 03/2013.  (526.6 KB)
Aupy, G., A. Benoit, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, "On the Combination of Silent Error Detection and Checkpointing", UT-CS-13-710: University of Tennessee Computer Science Technical Report, 06/2013.  (1.29 MB)
Bouteiller, A., T. Herault, G. Bosilca, and J. Dongarra, "Correlated Set Coordination in Fault Tolerant Message Logging Protocols", Concurrency and Computation: Practice and Experience, vol. 25, issue 4, pp. 572-585, 03/2013.  (636.68 KB)
Jia, Y., P. Luszczek, G. Bosilca, and J. Dongarra, "CPU-GPU Hybrid Bidiagonal Reduction With Soft Error Resilience", ScalA '13 Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Montpellier, France, 11/2013.  (238.58 KB)
Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, "Designing LU-QR hybrid solvers for performance and stability", University of Tennessee Computer Science Technical Report (also LAWN 282), no. ut-eecs-13-719: University of Tennessee, 10/2013.  (4.11 MB)
Marin, G., C. McCurdy, and J. Vetter, "Diagnosis and Optimization of Application Prefetching Performance", Proceedings of the 27th ACM International Conference on Supercomputing (ICS '13), Eugene, Oregon, USA, ACM Press, 06/2013.  (827.31 KB)
Donfack, S., S. Tomov, and J. Dongarra, "Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs", University of Tennessee Computer Science Technical Report, no. ut-cs-13-713, 07/2013.  (659.77 KB)
Turchenko, V., G. Bosilca, A. Bouteiller, and J. Dongarra, "Efficient Parallelization of Batch Pattern Training Algorithm on Many-core and Cluster Architectures", 7th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems, Berlin, Germany, 09/2013.  (102.51 KB)
Li, Y., A. YarKhan, J. Dongarra, K. Seymour, and A. Hurault, "Enabling Workflows in GridSolve: Request Sequencing and Service Trading", Journal of Supercomputing, vol. 64, issue 3, pp. 1133-1152, 06/2013.  (821.29 KB)
Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, "Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI", Concurrency and Computation: Practice and Experience, 07/2013.  (3.89 MB)
Dongarra, J., M. Faverge, T. Herault, M. Jacquelin, J. Langou, and Y. Robert, "Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems", Parallel Computing, vol. 39, issue 4-5, pp. 212-232, 05/2013.  (1.43 MB)
Ltaief, H., P. Luszczek, and J. Dongarra, "High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures", ACM Transactions on Mathematical Software (TOMS), vol. 39, issue 3, no. 16, 2013.  (665.7 KB)
Dongarra, J., and P. Luszczek, "HPC Challenge: Design, History, and Implementation Highlights", Contemporary High Performance Computing: From Petascale Toward Exascale, Boca Raton, FL, Taylor and Francis, 2013.  (790.01 KB)
Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, "Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters", University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, 07/2013.  (866.68 KB)
Yamazaki, I., D. Becker, J. Dongarra, A. Druinsky, I. Peled, S. Toledo, G. Ballard, J. Demmel, and O. Schwartz, "Implementing a Blocked Aasen’s Algorithm with a Dynamic Scheduler on Multicore Architectures", IPDPS 2013 (submitted), Boston, MA, 2013.  (1.22 MB)
Aupy, G., M. Faverge, Y. Robert, J. Kurzak, P. Luszczek, and J. Dongarra, "Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC", Lawn 277, no. UT-CS-13-709, 05/2013.  (298.63 KB)
Haidar, A., P. Luszczek, J. Kurzak, and J. Dongarra, "An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware", Supercomputing 2013, Denver, CO, 11/2013.
Haidar, A., P. Luszczek, J. Kurzak, and J. Dongarra, "An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware", University of Tennessee Computer Science Technical Report (also LAWN 283), no. ut-eecs-13-720: University of Tennessee, 10/2013.  (1.23 MB)
Vetter, J., R. Glassbrook, K. Schwan, S. Yalamanchili, M. Horton, A. Gavrilovska, M. Slawinska, J. Dongarra, J. Meredith, P. Roth, et al., "Keeneland: Computational Science Using Heterogeneous GPU Computing", Contemporary High Performance Computing: From Petascale Toward Exascale, Boca Raton, FL, Taylor and Francis, 2013.  (2.7 MB)
Ma, T., G. Bosilca, A. Bouteiller, and J. Dongarra, "Kernel-assisted and topology-aware MPI collective communications on multi-core/many-core platforms", Journal of Parallel and Distributed Computing, vol. 73, issue 7, pp. 1000-1010, 07/2013.  (1.4 MB)