Publications

T
Hoefler, T., Y-S. Dai, and J. Dongarra, Towards Efficient MapReduce Using MPI,” Lecture Notes in Computer Science, Recent Advances in Parallel Virtual Machine and Message Passing Interface - 16th European PVM/MPI Users' Group Meeting, vol. 5759, Espoo, Finland, Springer Berlin / Heidelberg, pp. 240-249, 2009.
Benoit, A., A. Cavelan, V. Le Fèvre, Y. Robert, and H. Sun, Towards Optimal Multi-Level Checkpointing,” IEEE Transactions on Computers, vol. 66, issue 7, pp. 1212–1226, July 2017.  (1.39 MB)
Wolf, F., A. Malony, S. Shende, and A. Morris, Trace-Based Parallel Performance Overhead Compensation,” In Proc. of the International Conference on High Performance Computing and Communications (HPCC), Sorrento (Naples), Italy, September 2005.  (306.88 KB)
Jagode, H., A. Knuepfer, J. Dongarra, M. Jurenz, M. S. Mueller, and W. E. Nagel, Trace-based Performance Analysis for the Petascale Simulation Code FLASH,” International Journal of High Performance Computing Applications (to appear), 2010.  (887.54 KB)
Jagode, H., A. Knuepfer, J. Dongarra, M. Jurenz, M. S. Mueller, and W. E. Nagel, Trace-based Performance Analysis for the Petascale Simulation Code FLASH,” Innovative Computing Laboratory Technical Report, no. ICL-UT-09-01, April 2009.  (887.54 KB)
Jia, Y., P. Luszczek, and J. Dongarra, Transient Error Resilient Hessenberg Reduction on GPU-based Hybrid Architectures,” UT-CS-13-712: University of Tennessee Computer Science Technical Report, June 2013.  (206.42 KB)
Seymour, K., A. YarKhan, and J. Dongarra, Transparent Cross-Platform Access to Software Services using GridSolve and GridRPC,” in Cloud Computing and Software Services: Theory and Techniques (to appear): CRC Press, 2009.
Dongarra, J., Trends in High Performance Computing,” The Computer Journal, vol. 47, no. 4: The British Computer Society, pp. 399-403, 2004.  (455.96 KB)
Dongarra, J., A Tribute to Gene Golub,” Computing in Science and Engineering: IEEE, pp. 5, January 2008.
Yamazaki, I., T. Dong, R. Solcà, S. Tomov, J. Dongarra, and T. C. Schulthess, Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems,” Concurrency and Computation: Practice and Experience, October 2013.  (1.71 MB)
Hiroyasu, T., M. Miki, H. Shimosaka, M. Sano, Y. Tanimura, Y. Mimura, S. Yoshimura, and J. Dongarra, Truss Structural Optimization Using NetSolve System,” Meeting of the Japan Society of Mechanical Engineers, Kyoto University, Kyoto, Japan, October 2002.  (450.65 KB)
Du, P., M. Parsons, E. Fuentes, S-L. Shaw, and J. Dongarra, Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures,” FOSS4G 2010, Barcelona, Spain, September 2010.  (1.57 MB)
Anzt, H., J. Dongarra, and E. S. Quintana-Ortí, Tuning Stationary Iterative Solvers for Fault Resilience,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA15), Austin, TX, ACM, November 2015.  (1.28 MB)
Dongarra, J., G. H. Golub, E. Grosse, C. Moler, and K. Moore, Twenty-Plus Years of Netlib and NA-Net,” University of Tennessee Computer Science Department Technical Report, UT-CS-04-526, 2006.  (62.79 KB)
Luszczek, P., H. Ltaief, and J. Dongarra, Two-stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
U
Shamis, P., M. G. Venkata, M. G. Lopez, M. B. Baker, O. Hernandez, Y. Itigin, M. Dubman, G. Shainer, R. L. Graham, L. Liss, et al., UCX: An Open Source Framework for HPC Network APIs and Beyond,” 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, Santa Clara, CA, USA, IEEE, pp. 40-43, August 2015.
Haidar, A., C. Cao, J. Dongarra, P. Luszczek, and S. Tomov, Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (1.51 MB)
Bosilca, G., A. Bouteiller, T. Herault, P. Lemarinier, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, A Unified HPC Environment for Hybrid Manycore/GPU Distributed Systems,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
Bosilca, G., A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra, A. Guermouche, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,” University of Tennessee Computer Science Technical Report (also LAWN 269), no. UT-CS-12-697, June 2012.  (2.76 MB)
Bosilca, G., A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra, A. Guermouche, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,” Concurrency and Computation: Practice and Experience, November 2013.  (894.61 KB)
Aliaga, J. I., H. Anzt, M. Castillo, J. C. Fernández, G. León, J. Pérez, and E. S. Quintana-Ortí, Unveiling the Performance-energy Trade-off in Iterative Linear System Solvers for Multithreaded Processors,” Concurrency and Computation: Practice and Experience, vol. 27, issue 4, pp. 885-904, September 2014.  (1.83 MB)
Blackford, S., J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, L. Kaufman, A. Lumsdaine, A. Petitet, et al., An Updated Set of Basic Linear Algebra Subprograms (BLAS),” ACM Transactions on Mathematical Software, vol. 28, no. 2, pp. 135-151, December 2002.  (228.33 KB)
Anzt, H., E. Chow, J. Saak, and J. Dongarra, Updating Incomplete Factorization Preconditioners for Model Order Reduction,” Numerical Algorithms, vol. 73, issue 3, pp. 611–630, February 2016.  (565.34 KB)
Wolf, F., B. Wylie, E. Abraham, W. Frings, K. Fürlinger, M. Geimer, M-A. Hermanns, B. Mohr, S. Moore, and M. Pfeifer, Usage of the Scalasca Toolset for Scalable Performance Analysis of Large-scale Parallel Applications,” Proceedings of the 2nd International Workshop on Tools for High Performance Computing, Stuttgart, Germany, Springer, pp. 157-167, January 2008.  (229.2 KB)
Voemel, C., S. Tomov, L-W. Wang, O. Marques, and J. Dongarra, The Use of Bulk States to Accelerate the Band Edge State Calculation of a Semiconductor Quantum Dot,” Journal of Computational Physics, vol. 223, pp. 774-782, 2007.  (452.6 KB)
Voemel, C., S. Tomov, L-W. Wang, O. Marques, and J. Dongarra, The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot,” Journal of Computational Physics (submitted), January 2006.  (337.08 KB)
Bland, W., User Level Failure Mitigation in MPI,” Euro-Par 2012: Parallel Processing Workshops, vol. 7640, Rhodes Island, Greece, Springer Berlin Heidelberg, pp. 499-504, August 2012.  (136.15 KB)
Moore, S., and J. Ralph, User-defined Events for Hardware Performance Monitoring,” ICCS 2011 Workshop: Tools for Program Development and Analysis in Computational Science, Singapore, www.sciencedirect.com, June 2011.  (361.76 KB)
Agrawal, S., D. Arnold, S. Blackford, J. Dongarra, M. Miller, K. Sagi, Z. Shi, K. Seymour, and S. Vadhiyar, Users' Guide to NetSolve v1.4.1,” ICL Technical Report, no. ICL-UT-02-05, June 2002.  (328.01 KB)
Baboulin, M., and S. Gratton, Using dual techniques to derive componentwise and mixed condition numbers for a linear functional of a linear least squares solution,” University of Tennessee Computer Science Technical Report, UT-CS-08-622 (also LAPACK Working Note 207), January 2008.  (159.65 KB)
Haidar, A., S. Tomov, A. Abdelfattah, M. Zounon, and J. Dongarra, Using GPU FP16 Tensor Cores Arithmetic to Accelerate Mixed-Precision Iterative Refinement Solvers and Reduce Energy Consumption,” ISC High Performance (ISC'18), Best Poster, Frankfurt, Germany, June 2018.  (3.01 MB)
Fürlinger, K., J. Dongarra, and M. Gerndt, On Using Incremental Profiling for the Performance Analysis of Shared Memory Parallel Applications,” Proceedings of the 13th International Euro-Par Conference on Parallel Processing (Euro-Par '07), Rennes, France, Springer LNCS, January 2007.
Chow, E., H. Anzt, J. Scott, and J. Dongarra, Using Jacobi Iterations and Blocking for Solving Sparse Triangular Systems in Incomplete Factorization Preconditioning,” Journal of Parallel and Distributed Computing, vol. 119, pp. 219–230, November 2018.  (273.53 KB)
Tomov, S., M. Faverge, P. Luszczek, and J. Dongarra, Using MAGMA with PGI Fortran,” PGI Insider, November 2010.  (176.67 KB)
Buttari, A., J. Dongarra, J. Kurzak, P. Luszczek, and S. Tomov, Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy,” ACM Transactions on Mathematical Software, vol. 34, no. 4, pp. 17-22, 2008.  (364.48 KB)
Giraud, L., A. Haidar, and S. Pralet, Using multiple levels of parallelism to enhance the performance of domain decomposition solvers,” Parallel Computing, vol. 36, no. 5-6: Elsevier journals, pp. 285-296, 2010.  (418.57 KB)
Dongarra, J., K. London, S. Moore, P. Mucci, and D. Terpstra, Using PAPI for Hardware Performance Monitoring on Linux Systems,” Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001.  (422.35 KB)
Eberius, D., T. Patinyasakdikul, and G. Bosilca, Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information,” EuroMPI, Chicago, IL, ACM, September 2017.  (745.58 KB)
McCraw, H., A. Danalis, G. Bosilca, J. Dongarra, K. Kowalski, and T. Windus, Utilizing Dataflow-based Execution for Coupled Cluster Methods,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-02, Madrid, Spain, IEEE, September 2014.  (260.23 KB)
V
Anzt, H., J. Dongarra, G. Flegar, and T. Gruetzmacher, Variable-Size Batched Condition Number Calculation on GPUs,” SBAC-PAD, Lyon, France, September 2018.  (509.3 KB)
Anzt, H., J. Dongarra, G. Flegar, E. S. Quintana-Ortí, and A. E. Thomas, Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning,” International Conference on Computational Science (ICCS 2017), vol. 108, Zurich, Switzerland, Procedia Computer Science, pp. 1783-1792, June 2017.
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Variable-Size Batched Gauss–Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors,” Parallel Computing, January 2018.  (1.9 MB)
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning,” 46th International Conference on Parallel Processing (ICPP), Bristol, United Kingdom, IEEE, August 2017.
Ramakrishnan, L., D. Nurmi, A. Mandal, C. Koelbel, D. Gannon, M. Huang, Y-S. Kee, G. Obertelli, K. Thyagaraja, R. Wolski, et al., VGrADS: Enabling e-Science Workflows on Grids and Clouds with Fault Tolerance,” SC’09 The International Conference for High Performance Computing, Networking, Storage and Analysis (to appear), Portland, OR, 2009.  (648.82 KB)
Casanova, H., T. Bartol, F. Berman, A. Birnbaum, J. Dongarra, M. Ellisman, M. Faerman, E. Gockay, M. Miller, G. Obertelli, et al., The Virtual Instrument: Support for Grid-enabled Scientific Simulations,” International Journal of High Performance Computing Applications, vol. 18, no. 1, pp. 3-17, January 2004.  (282.16 KB)
Casanova, H., T. Bartol, F. Berman, A. Birnbaum, J. Dongarra, M. Ellisman, M. Faerman, E. Gockay, M. Miller, G. Obertelli, et al., The Virtual Instrument: Support for Grid-enabled Scientific Simulations,” Journal of Parallel and Distributed Computing (submitted), October 2002.  (282.16 KB)
Kurzak, J., P. Luszczek, M. Gates, I. Yamazaki, and J. Dongarra, Virtual Systolic Array for QR Decomposition,” 15th Workshop on Advances in Parallel and Distributed Computational Models, IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), Boston, MA, IEEE, May 2013.  (749.84 KB)
Lee, DW., and J. Dongarra, VisPerf: Monitoring Tool for Grid Computing,” Lecture Notes in Computer Science, vol. 2659: Springer Verlag, Heidelberg, pp. 233-243, 2003.  (835.09 KB)
Haugen, B., S. Richmond, J. Kurzak, C. A. Steed, and J. Dongarra, Visualizing Execution Traces with Task Dependencies,” 2nd Workshop on Visual Performance Analysis (VPA '15), Austin, TX, ACM, November 2015.  (927.5 KB)
Fürlinger, K., and S. Moore, Visualizing the Program Execution Control Flow of OpenMP Applications,” Proc. 4th International Workshop on OpenMP (IWOMP 2008), West Lafayette, Indiana, Lecture Notes in Computer Science 5004, pp. 181-190, January 2008.  (194.25 KB)
