Publications

1999

Eijkhout, V., “On the Existence Problem of Incomplete Factorisation Methods,” University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-435, December 1999.

(222.2 KB)

Fischer, M., and J. Dongarra, “Experiences with Windows 95/NT as a Cluster Computing Platform for Parallel Computing,” Parallel and Distributed Computing Practices, Special Issue: Cluster Computing, vol. 2, no. 2: Nova Science Publishers, USA, pp. 119-128, February 1999.

(164.04 KB)

2001

Beck, M., T. Moore, L. Abrahamsson, C. Achouiantz, and P. Johansson, “Enabling Full Service Surrogates Using the Portable Channel Representation,” Tenth International World Wide Web Conference Proceedings (to appear), Hong Kong, May 2001.

(267.23 KB)

London, K., J. Dongarra, S. Moore, P. Mucci, K. Seymour, and T. Spencer, “End-user Tools for Application Performance Analysis, Using Hardware Counters,” International Conference on Parallel and Distributed Computing Systems, Dallas, TX, August 2001.

(306.54 KB)

2002

YarKhan, A., and J. Dongarra, “Experiments with Scheduling Using Simulated Annealing in a Grid Environment,” Grid Computing - GRID 2002, Third International Workshop, vol. 2536, Baltimore, MD, Springer, pp. 232-242, November 2002.

(66.91 KB)

2003

Hiroyasu, T., M. Miki, S. Ogura, K. Aoi, T. Yoshida, Y. Okamoto, and J. Dongarra, “Energy Minimization of Protein Tertiary Structure by Parallel Simulated Annealing using Genetic Crossover,” Special Issue on Biological Applications of Genetic and Evolutionary Computation (submitted), March 2003.

(438.68 KB)

Gabriel, E., G. Fagg, and J. Dongarra, “Evaluating The Performance Of MPI-2 Dynamic Communicators And One-Sided Communication,” Lecture Notes in Computer Science, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI User's Group Meeting, vol. 2840, Venice, Italy, Springer-Verlag, Berlin, pp. 88-97, September 2003.

(254.08 KB)

Dongarra, J., K. London, S. Moore, P. Mucci, D. Terpstra, H. You, and M. Zhou, “Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters,” PADTAD Workshop, IPDPS 2003, Nice, France, IEEE, April 2003.

(432.57 KB)

2004

Wolf, F., “EARL - API Documentation,” ICL Technical Report, no. ICL-UT-04-03, October 2004.

(111.36 KB)

Wolf, F., B. Mohr, J. Dongarra, and S. Moore, “Efficient Pattern Search in Large Traces through Successive Refinement,” Proceedings of Euro-Par 2004, Pisa, Italy, Springer-Verlag, August 2004.

(177.46 KB)

Fagg, G., E. Gabriel, G. Bosilca, T. Angskun, Z. Chen, J. Pjesivac–Grbovic, K. London, and J. Dongarra, “Extending the MPI Specification for Process Fault Tolerance on High Performance Computing Systems,” Proceedings of ISC2004 (to appear), Heidelberg, Germany, June 2004.

(548.38 KB)

2005

You, H., K. Seymour, and J. Dongarra, “An Effective Empirical Search Method for Automatic Software Tuning,” ICL Technical Report, no. ICL-UT-05-02, January 2005.

(74.66 KB)

Hermanns, M-A., B. Mohr, and F. Wolf, “Event-based Measurement and Analysis of One-sided Communication,” In Proceedings of the European Conference on Parallel Computing (Euro-Par), Lisbon, Portugal, Springer, August 2005.

(403.44 KB)

2006

Song, F., J. Dongarra, and S. Moore, “Experiments with Strassen's Algorithm: From Sequential to Parallel,” 18th IASTED International Conference on Parallel and Distributed Computing and Systems PDCS 2006 (submitted), Dallas, Texas, January 2006.

(514.33 KB)

Langou, J., J. Langou, P. Luszczek, J. Kurzak, A. Buttari, and J. Dongarra, “Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy,” University of Tennessee Computer Science Tech Report, no. UT-CS-06-574, LAPACK Working Note #175, April 2006.

(221.39 KB)

2007

You, H., K. Seymour, J. Dongarra, and S. Moore, “Empirical Tuning of a Multiresolution Analysis Kernel using a Specialized Code Generator,” ICL Technical Report, no. ICL-UT-07-02, January 2007.

(123.34 KB)

Graham, R. L., R. Brightwell, B. Barrett, G. Bosilca, and J. Pjesivac–Grbovic, “An Evaluation of Open MPI's Matching Transport Layer on the Cray XT,” EuroPVM/MPI 2007, September 2007.

(369.01 KB)

Buttari, A., J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, “Exploiting Mixed Precision Floating Point Hardware in Scientific Computations,” in High Performance Computing and Grids in Action (to appear), Amsterdam, IOS Press, 2007.

(122.01 KB)

2008

Baboulin, M., J. Demmel, J. Dongarra, S. Tomov, and V. Volkov, “Enhancing the Performance of Dense Linear Algebra Solvers on GPUs (in the MAGMA Project),” Austin, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC08), November 2008.

(5.28 MB)

Buttari, A., J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, “Exploiting Mixed Precision Floating Point Hardware in Scientific Computations,” in High Performance Computing and Grids in Action, Amsterdam, IOS Press, January 2008.

(92.95 KB)

Dongarra, J., S. Moore, G. D. Peterson, S. Tomov, J. Allred, V. Natoli, and D. Richie, “Exploring New Architectures in Accelerating CFD for Air Force Applications,” Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, January 2008.

(492.86 KB)

2009

Hadri, B., H. Ltaief, E. Agullo, and J. Dongarra, “Enhancing Parallelism of Tile QR Factorization for Multicore Architectures,” Submitted to Transactions on Parallel and Distributed Systems, December 2009.

(464.23 KB)

2010

Dongarra, J., and S. Moore, “Empirical Performance Tuning of Dense Linear Algebra Software,” in Performance Tuning of Scientific Applications (to appear), 2010.

Dongarra, J., M. Faverge, Y. Ishikawa, R. Namyst, F. Rue, and F. Trahay, “EZTrace: a generic framework for performance analysis,” ICL Technical Report, no. ICL-UT-11-01, December 2010.

2011

Song, F., S. Tomov, and J. Dongarra, “Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures,” University of Tennessee Computer Science Technical Report, no. UT-CS-11-668 (also LAWN 250), June 2011.

(5.93 MB)

Lively, C., X. Wu, V. Taylor, S. Moore, H-C. Chang, and K. Cameron, “Energy and performance characteristics of different parallel implementations of scientific applications on multicore systems,” International Journal of High Performance Computing Applications, vol. 25, no. 3, pp. 342-350, 2011.

(467.18 KB)

Luszczek, P., E. Meek, S. Moore, D. Terpstra, V. M. Weaver, and J. Dongarra, “Evaluation of the HPC Challenge Benchmarks in Virtualized Environments,” 6th Workshop on Virtualization in High-Performance Cloud Computing, Bordeaux, France, August 2011.

(114.73 KB)

Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “Exploiting Fine-Grain Parallelism in Recursive LU Factorization,” Proceedings of PARCO'11, no. ICL-UT-11-04, Gent, Belgium, April 2011.

2012

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An efficient distributed randomized solver with application to large dense linear systems,” ICL Technical Report, no. ICL-UT-12-02, July 2012.

(626.26 KB)

Song, F., S. Tomov, and J. Dongarra, “Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems,” 26th ACM International Conference on Supercomputing (ICS 2012), San Servolo Island, Venice, Italy, ACM, June 2012.

(5.88 MB)

Bland, W., “Enabling Application Resilience With and Without the MPI Standard,” 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Ottawa, Canada, May 2012.

(262.93 KB)

Dongarra, J., H. Ltaief, P. Luszczek, and V. M. Weaver, “Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture,” The 2nd International Conference on Cloud and Green Computing (submitted), Xiangtan, Hunan, China, November 2012.

(329.5 KB)

Ltaief, H., P. Luszczek, and J. Dongarra, “Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures using Tree Reduction,” Lecture Notes in Computer Science, vol. 7203, pp. 661-670, September 2012.

(185.77 KB)

Bland, W., A. Bouteiller, T. Herault, J. Hursey, G. Bosilca, and J. Dongarra, “An Evaluation of User-Level Failure Mitigation Support in MPI,” Proceedings of Recent Advances in Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, Springer, September 2012.

Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Extending the Scope of the Checkpoint-on-Failure Protocol for Forward Recovery in Standard MPI,” University of Tennessee Computer Science Technical Report, no. UT-CS-12-702, 2012.

(422.76 KB)

2013

Turchenko, V., G. Bosilca, A. Bouteiller, and J. Dongarra, “Efficient Parallelization of Batch Pattern Training Algorithm on Many-core and Cluster Architectures,” 7th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems, Berlin, Germany, September 2013.

(102.51 KB)

Li, Y., A. YarKhan, J. Dongarra, K. Seymour, and A. Hurault, “Enabling Workflows in GridSolve: Request Sequencing and Service Trading,” Journal of Supercomputing, vol. 64, issue 3, pp. 1133-1152, June 2013.

(821.29 KB)

Bland, W., A. Bouteiller, T. Herault, J. Hursey, G. Bosilca, and J. Dongarra, “An evaluation of User-Level Failure Mitigation support in MPI,” Computing, vol. 95, issue 12, pp. 1171-1184, December 2013.

(311.23 KB)

Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI,” Concurrency and Computation: Practice and Experience, July 2013.

(3.89 MB)

2014

Benoit, A., Y. Robert, and S. K. Raina, “Efficient checkpoint/verification patterns for silent error detection,” Innovative Computing Laboratory Technical Report, no. ICL-UT-14-03: University of Tennessee, May 2014.

(397.75 KB)

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,” Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.

(1.42 MB)

2015

Benoit, A., S. K. Raina, and Y. Robert, “Efficient Checkpoint/Verification Patterns,” International Journal of High Performance Computing Applications, July 2015.

(392.76 KB)

Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, “Efficient Eigensolver Algorithms on Accelerator Based Architectures,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.

(6.98 MB)

Solcà, R., A. Kozhevnikov, A. Haidar, S. Tomov, T. C. Schulthess, and J. Dongarra, “Efficient Implementation Of Quantum Materials Simulations On Distributed CPU-GPU Systems,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.

(1.09 MB)

Anzt, H., S. Tomov, and J. Dongarra, “Energy Efficiency and Performance Frontiers for Sparse Computations on GPU Supercomputers,” Sixth International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM '15), San Francisco, CA, ACM, February 2015.

(2.29 MB)

Reed, D., and J. Dongarra, “Exascale Computing and Big Data,” Communications of the ACM, vol. 58, no. 7: ACM, pp. 56-68, July 2015.

(7.3 MB)

Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in autotuning matrix multiplication for energy minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, December 2015.

(1.98 MB)

Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, October 2015.

(1.99 MB)

2016

Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 683-691, May 2016.

Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, IL, IEEE, May 2016.

(285.28 KB)
