Publications

Export 997 results:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
A
Dongarra, J., G. Bosilca, R. Delmas, and J. Langou, Algorithmic Based Fault Tolerance Applied to High Performance Computing,” Journal of Parallel and Distributed Computing, vol. 69, pp. 410-416, 00-2009.  (313.55 KB)
Bosilca, G., R. Delmas, J. Dongarra, and J. Langou, Algorithmic Based Fault Tolerance Applied to High Performance Computing,” University of Tennessee Computer Science Technical Report, UT-CS-08-620 (also LAPACK Working Note 205), January 2008.  (313.55 KB)
Boulet, P., J. Dongarra, F. Rastello, Y. Robert, and F. Vivien, Algorithmic Issues on Heterogeneous Computing Platforms,” Parallel Processing Letters, vol. 9, no. 2, pp. 197-213, January 1999.  (301.17 KB)
Petitet, A., and J. Dongarra, Algorithmic Redistribution Methods for Block Cyclic Decompositions,” IEEE Transactions on Parallel and Distributed Computing, vol. 10, no. 12, pp. 201-220, October 2002.  (524.82 KB)
Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, On Algorithmic Variants of Parallel Gaussian Elimination: Comparison of Implementations in Terms of Performance and Numerical Properties,” University of Tennessee Computer Science Technical Report, no. UT-CS-13-715, July 2013, 2012.  (358.98 KB)
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019.
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-09: Innovative Computing Laboratory, University of Tennessee, September 2018.  (3.74 MB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018.
Andersson, U., and P. Mucci, Analysis and Optimization of Yee_Bench using Hardware Performance Counters,” Proceedings of Parallel Computing 2005 (ParCo) (to appear), Malaga, Spain, January 2005.  (72.27 KB)
Haidar, A., H. Ltaeif, A. YarKhan, and J. Dongarra, Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-666, (also Lawn 243), 00-2011.  (1.65 MB)
Haidar, A., H. Ltaeif, A. YarKhan, and J. Dongarra, Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,” Submitted to Concurrency and Computations: Practice and Experience, November 2010.  (1.65 MB)
Luszczek, P., and J. Dongarra, Analysis of Various Scalar, Vector, and Parallel Implementations of RandomAccess,” Innovative Computing Laboratory (ICL) Technical Report, no. ICL-UT-10-03, June 2010.  (226.9 KB)
Song, F., S. Moore, and J. Dongarra, Analytical Modeling and Optimization for Affinity Based Thread Scheduling on Multicore Systems,” IEEE Cluster 2009, New Orleans, August 2009.  (395.53 KB)
Song, F., S. Moore, and J. Dongarra, Analytical Modeling for Affinity-Based Thread Scheduling on Multicore Platforms,” University of Tennessee Computer Science Technical Report, UT-CS-08-626, January 2008.  (650.75 KB)
Nelson, J., Analyzing PAPI Performance on Virtual Machines,” VMWare Technical Journal, vol. Winter 2013, January 2014.
Nelson, J., Analyzing PAPI Performance on Virtual Machines,” ICL Technical Report, no. ICL-UT-13-02, August 2013.  (437.37 KB)
Yamazaki, I., A. Abdelfattah, A. Ida, S. Ohshima, S. Tomov, R. Yokota, and J. Dongarra, Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU Clusters,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, IEEE, May 2018.  (1.37 MB)
Luszczek, P., and J. Dongarra, Anatomy of a Globally Recursive Embedded LINPACK Benchmark,” 2012 IEEE High Performance Extreme Computing Conference, Waltham, MA, pp. 1-6, September 2012.  (204.74 KB)
Bhowmick, S., V. Eijkhout, Y. Freund, E. Fuentes, and D. Keyes, Application of Machine Learning to the Selection of Sparse Linear Solvers,” International Journal of High Performance Computing Applications (submitted), 00-2006.  (392.96 KB)
Eidson, T., J. Dongarra, and V. Eijkhout, Applying Aspect-Oriented Programming Concepts to a Component-based Programming Model,” IPDPS 2003, Workshop on NSF-Next Generation Software, Nice, France, March 2003.  (66.99 KB)
Ribizel, T., and H. Anzt, Approximate and Exact Selection on GPUs,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, 2019.  (440.71 KB)
Anzt, H., and G. Flegar, Are we Doing the Right Thing? — A Critical Analysis of the Academic HPC Community,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, 2019.  (622.32 KB)
Seo, S., A. Amer, P. Balaji, C. Bordage, G. Bosilca, A. Brooks, P. Carns, A. Castello, D. Genet, T. Herault, et al., Argobots: A Lightweight Low-Level Threading and Tasking Framework,” IEEE Transactions on Parallel and Distributed Systems, October 2017.
Genet, D., A. Guermouche, and G. Bosilca, Assembly Operations for Multicore Architectures using Task-Based Runtime Systems,” Euro-Par 2014, Porto, Portugal, Springer International Publishing, August 2014.  (481.52 KB)
Benoit, A., A. Cavelan, Y. Robert, and H. Sun, Assessing General-purpose Algorithms to Cope with Fail-stop and Silent Errors,” ACM Transactions on Parallel Computing, August 2016.  (573.71 KB)
Herrmann, J., G. Bosilca, T. Herault, L. Marchal, Y. Robert, and J. Dongarra, Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results,” Parallel Computing, vol. 52, pp. 22-41, February 2016.  (2.06 MB)
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, Assessing the Impact of ABFT and Checkpoint Composite Strategies,” 16th Workshop on Advances in Parallel and Distributed Computational Models, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (1.02 MB)
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, Assessing the impact of ABFT and Checkpoint composite strategies,” University of Tennessee Computer Science Technical Report, no. ICL-UT-13-03, 2013.  (968.47 KB)
Aupy, G., Y. Robert, and F. Vivien, Assuming failure independence: are we right to be wrong?,” The 3rd International Workshop on Fault Tolerant Systems (FTS), Honolulu, Hawaii, IEEE, September 2017.  (597.11 KB)
Emad, N., S. A. S. Fazeli, and J. Dongarra, An Asynchronous Algorithm on NetSolve Global Computing System,” PRiSM - Laboratoire de recherche en informatique, Université de Versailles St-Quentin Technical Report, March 2004.  (377.33 KB)
Chow, E., H. Anzt, and J. Dongarra, Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs,” International Supercomputing Conference (ISC 2015), Frankfurt, Germany, July 2015.
Berry, M., and J. Dongarra, Atlanta Organizers Put Mathematics to Work For the Math Sciences Community,” SIAM News, vol. 32, no. 6, January 1999.  (45.98 KB)
Seymour, K., H. You, and J. Dongarra, ATLAS on the BlueGene/L – Preliminary Results,” ICL Technical Report, no. ICL-UT-06-10, January 2006.  (46.19 KB)
Whaley, C., A. Petitet, and J. Dongarra, Automated Empirical Optimization of Software and the ATLAS Project,” Parallel Computing, vol. 27, no. 1-2, pp. 3-25, January 2001.  (370.71 KB)
Whaley, C., A. Petitet, and J. Dongarra, Automated Empirical Optimizations of Software and the ATLAS Project (LAPACK Working Note 147),” University of Tennessee Computer Science Department Technical Report,, no. UT-CS-00-448, September 2000.  (373.69 KB)
You, H., K. Seymour, J. Dongarra, and S. Moore, Automated Empirical Tuning of a Multiresolution Analysis Kernel,” ICL Technical Report, no. ICL-UT-07-01, pp. 10, January 2007.  (120.7 KB)
Wolf, F., B. Mohr, J. Dongarra, and S. Moore, Automatic analysis of inefficiency patterns in parallel applications,” Concurrency and Computation: Practice and Experience, Special issue "Automatic Performance Analysis" (submitted), 00-2005.  (233.31 KB)
Wolf, F., B. Mohr, J. Dongarra, and S. Moore, Automatic Analysis of Inefficiency Patterns in Parallel Applications,” Concurrency and Computation: Practice and Experience, vol. 19, no. 11, pp. 1481-1496, August 2007.  (233.31 KB)
Yi, Q., K. Kennedy, H. You, K. Seymour, and J. Dongarra, Automatic Blocking of QR and LU Factorizations for Locality,” 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004), Washington, DC, June 2004.  (212.77 KB)
Eijkhout, V., Automatic Determination of Matrix-Blocks,” Lapack Working Note 151, University of Tennessee Computer Science Technical Report, no. UT-CS-01-458, January 2001.  (1.15 MB)
Bhatia, N., F. Song, F. Wolf, J. Dongarra, B. Mohr, and S. Moore, Automatic Experimental Analysis of Communication Patterns in Virtual Topologies,” In Proceedings of the International Conference on Parallel Processing, Oslo, Norway, IEEE Computer Society, June 2005.  (227.13 KB)
Cuenca, J., D. Giminez, J. González, J. Dongarra, and K. Roche, Automatic Optimisation of Parallel Linear Algebra Routines in Systems with Variable Load,” EuroPar 2002, Paderborn, Germany, August 2002.  (92.59 KB)
Wolf, F., and B. Mohr, Automatic performance analysis of hybrid MPI/OpenMP applications,” Journal of Systems Architecture, Special Issue 'Evolutions in parallel distributed and network-based processing', vol. 49(10-11): Elsevier, pp. 421-439, November 2003.
Seymour, K., and J. Dongarra, Automatic Translation of Fortran to JVM Bytecode,” Joint ACM Java Grande - ISCOPE 2001 Conference (submitted), Stanford University, California, June 2001.  (185.8 KB)
Seymour, K., and J. Dongarra, Automatic Translation of Fortran to JVM Bytecode,” Concurrency and Computation: Practice and Experience, vol. 15, no. 3-5, pp. 202-207, 00-2003.  (185.8 KB)
Vadhiyar, S., G. Fagg, and J. Dongarra, Automatically Tuned Collective Communications,” Proceedings of SuperComputing 2000 (SC'2000), Dallas, TX, November 2000.  (232.69 KB)
Whaley, C., and J. Dongarra, Automatically Tuned Linear Algebra Software,” 1998 ACM/IEEE conference on Supercomputing (SC '98), Orlando, FL, IEEE Computer Society, November 1998.
Mucci, P., J. Dongarra, R. Kufrin, S. Moore, F. Song, and F. Wolf, Automating the Large-Scale Collection and Analysis of Performance,” In Proceedings of the 5th LCI International Conference on Linux Clusters: The HPC Revolution, Austin, Texas, May 2004.  (511.6 KB)
You, H., B. Rekapalli, Q. Liu, and S. Moore, Autotuned Parallel I/O for Highly Scalable Biosequence Analysis,” TeraGrid'11, Salt Lake City, Utah, July 2011.  (275.34 KB)
Gates, M., J. Kurzak, P. Luszczek, Y. Pei, and J. Dongarra, Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices,” Parallel and Distributed Processing Symposium Workshops (IPDPSW), Orlando, FL, IEEE, June 2017.

Pages