Publications


Tomov, S., J. Langou, A. Canning, L-W. Wang, and J. Dongarra, “Conjugate-Gradient Eigenvalue Solvers in Computing Electronic Properties of Nanostructure Architectures,” International Journal of Computational Science and Engineering (to appear), January 2005.

Tomov, S., W. Lu, J. Bernholc, S. Moore, and J. Dongarra, “Performance Evaluation for Petascale Quantum Simulation Tools,” Proceedings of the Cray Users' Group Meeting, Atlanta, GA, May 2010.

Tomov, S., and J. Dongarra, The Future of Computing: Software Libraries, Savannah, GA, DOD CREATE Developers' Review, Keynote Presentation, February 2012.

Tomov, S., R. Nath, and J. Dongarra, “Accelerating the Reduction to Upper Hessenberg, Tridiagonal, and Bidiagonal Forms through Hybrid GPU-Based Computing,” Parallel Computing, vol. 36, no. 12, pp. 645-654, 2010.

Tomov, S., K. Wong, J. Dongarra, R. Archibald, E. Chow, E. D'Azevedo, M. Eisenbach, R. Febbo, F. Lopez, D. Nichols, et al., Integrating Deep Learning in Domain Science at Exascale (MagmaDNN), virtual, DOD HPCMP seminar, December 2020.

“Recent Advances in the Message Passing Interface: 19th European MPI Users' Group Meeting, EuroMPI 2012,” Lecture Notes in Computer Science, vol. 7490, Vienna, Austria, 2012.

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Three-precision algebraic multigrid on GPUs,” Future Generation Computer Systems, July 2023.

Tsai, Y. M., P. Luszczek, and J. Dongarra, “Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices,” ICL Technical Report, no. ICL-UT-21-05, August 2021.

Tsai, Y. M., T. Cojean, and H. Anzt, “Sparse Linear Algebra on AMD and NVIDIA GPUs—The Race is On,” ISC High Performance: Springer, June 2020.

Tsai, Y-H. M., T. Cojean, and H. Anzt, “Providing performance portable numerics for Intel GPUs,” Concurrency and Computation: Practice and Experience, vol. 17, October 2022.

Tsai, Y., P. Luszczek, and J. Dongarra, Using Quantized Integer in LU Factorization with Partial Pivoting (Poster), Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.

Tseng, S-M., B. Nicolae, G. Bosilca, E. Jeannot, A. Chandramowlishwaran, and F. Cappello, “Towards Portable Online Prediction of Network Utilization Using MPI-Level Monitoring,” 2019 European Conference on Parallel Processing (Euro-Par 2019), Göttingen, Germany, Springer, August 2019.

Turchenko, V., L. Grandinetti, G. Bosilca, and J. Dongarra, “Improvement of parallelization efficiency of batch pattern BP training algorithm using Open MPI,” Proceedings of International Conference on Computational Science, ICCS 2010 (to appear), Amsterdam, The Netherlands, Elsevier, June 2010.

Turchenko, V., G. Bosilca, A. Bouteiller, and J. Dongarra, “Efficient Parallelization of Batch Pattern Training Algorithm on Many-core and Cluster Architectures,” 7th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems, Berlin, Germany, September 2013.

Vadhiyar, S., “A Performance Oriented Migration Framework for the Grid,” Proceedings of the 3rd International Symposium on Cluster Computing and the Grid, Tokyo, Japan, pp. 130-137, May 2003.

Vadhiyar, S., and J. Dongarra, “GrADSolve - A Grid-based RPC System for Remote Invocation of Parallel Software,” Journal of Parallel and Distributed Computing (submitted), March 2003.

Vadhiyar, S., and J. Dongarra, “Self Adaptivity in Grid Computing,” Concurrency and Computation: Practice and Experience, Special Issue: Grid Performance, vol. 17, no. 2-4, pp. 235-257, 2005.

Vadhiyar, S., G. Fagg, and J. Dongarra, “Automatically Tuned Collective Communications,” Proceedings of SuperComputing 2000 (SC'2000), Dallas, TX, November 2000.

Vadhiyar, S., G. Fagg, and J. Dongarra, “Towards an Accurate Model for Collective Communications,” ICL Technical Report, no. ICL-UT-05-03, January 2005.

Vadhiyar, S., and J. Dongarra, “SRS - A Framework for Developing Malleable and Migratable Parallel Software,” Parallel Processing Letters, vol. 13, no. 2, pp. 291-312, June 2003.

Vadhiyar, S., J. Dongarra, and A. YarKhan, “GrADSolve - RPC for High Performance Computing on the Grid,” Lecture Notes in Computer Science, Proceedings of the 9th International Euro-Par Conference, vol. 2790, Klagenfurt, Austria, Springer-Verlag, Berlin, pp. 394-403, January 2003.

Vadhiyar, S., G. Fagg, and J. Dongarra, “Performance Modeling for Self Adapting Collective Communications for MPI,” LACSI Symposium 2001, Santa Fe, NM, October 2001.

Vadhiyar, S., and J. Dongarra, “Self Adaptability in Grid Computing,” Concurrency: Practice and Experience (submitted), March 2003.

Vadhiyar, S., G. Fagg, and J. Dongarra, “Towards an Accurate Model for Collective Communications,” International Journal of High Performance Applications, Special Issue: Automatic Performance Tuning, vol. 18, no. 1, pp. 159-167, January 2004.

Vadhiyar, S., and J. Dongarra, “A Metascheduler For The Grid,” Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC 2002), Edinburgh, Scotland, IEEE Computer Society, pp. 343-351, July 2002.

Valeev, E. F., R. J. Harrison, A. A. Holmes, C. C. Peterson, and D. A. Penchoff, “Direct Determination of Optimal Real-Space Orbitals for Correlated Electronic Structure of Molecules,” Journal of Chemical Theory and Computation, vol. 19, issue 20, pp. 7230-7241, October 2023.

Valero-Lara, P., J. Dongarra, A. Haidar, S. D. Relton, S. Tomov, and M. Zounon, A Standard for Batched BLAS Routines, Paris, France, 17th SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP16), April 2016.

Vetter, J., R. Glassbrook, K. Schwan, S. Yalamanchili, M. Horton, A. Gavrilovska, M. Slawinska, J. Dongarra, J. Meredith, P. Roth, et al., “Keeneland: Computational Science Using Heterogeneous GPU Computing,” Contemporary High Performance Computing: From Petascale Toward Exascale, Boca Raton, FL, Taylor and Francis, 2013.

Vetter, J., R. Glassbrook, J. Dongarra, K. Schwan, B. Loftis, S. McNally, J. Meredith, J. Rogers, P. Roth, K. Spafford, et al., “Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community,” IEEE Computing in Science & Engineering, vol. 13, issue 5, pp. 90-95, August 2011.

Voemel, C., S. Tomov, and J. Dongarra, “Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems,” SIAM Journal on Scientific Computing, vol. 34(2), pp. C70-C82, April 2012.

Voemel, C., S. Tomov, and J. Dongarra, “Divide & Conquer on Hybrid GPU-Accelerated Multicore Systems,” SIAM Journal on Scientific Computing (submitted), August 2010.

Voemel, C., S. Tomov, O. Marques, A. Canning, L-W. Wang, and J. Dongarra, “State-of-the-Art Eigensolvers for Electronic Structure Calculations of Large Scale Nano-Systems,” Journal of Computational Physics, vol. 227, no. 15, pp. 7113-7124, January 2008.

Voemel, C., S. Tomov, L-W. Wang, O. Marques, and J. Dongarra, “The Use of Bulk States to Accelerate the Band Edge State Calculation of a Semiconductor Quantum Dot,” Journal of Computational Physics, vol. 223, pp. 774-782, 2007.

Voemel, C., S. Tomov, L-W. Wang, O. Marques, and J. Dongarra, “The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot,” Journal of Computational Physics (submitted), January 2006.

Wang, L., W. Wu, J. Zhang, H. Liu, G. Bosilca, M. Herlihy, and R. Fonseca, “FFT-Based Gradient Sparsification for the Distributed Training of Deep Neural Networks,” 9th International Symposium on High-Performance Parallel and Distributed Computing (HPDC 20), Stockholm, Sweden, ACM, June 2020.

Wang, Y., M. Baboulin, J. Falcou, Y. Fraigneau, and O. Le Maître, “A Parallel Solver for Incompressible Fluid Flows,” International Conference on Computational Science (ICCS 2013), Barcelona, Spain, Elsevier B.V., June 2013.

Weaver, V. M., and J. Dongarra, “Can Hardware Performance Counters Produce Expected, Deterministic Results?,” 3rd Workshop on Functionality of Hardware Performance Monitoring, Atlanta, GA, December 2010.

Weaver, V. M., M. Johnson, K. Kasichayanula, J. Ralph, P. Luszczek, D. Terpstra, and S. Moore, “Measuring Energy and Power with PAPI,” International Workshop on Power-Aware Systems and Architectures, Pittsburgh, PA, September 2012.

Weaver, V., D. Terpstra, H. McCraw, M. Johnson, K. Kasichayanula, J. Ralph, J. Nelson, P. Mucci, T. Mohan, and S. Moore, PAPI 5: Measuring Power, Energy, and the Cloud, Austin, TX, 2013 IEEE International Symposium on Performance Analysis of Systems and Software, April 2013.

Weaver, V., D. Terpstra, and S. Moore, “Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations,” 2013 IEEE International Symposium on Performance Analysis of Systems and Software, Austin, TX, IEEE, April 2013.

Whaley, C., A. Petitet, and J. Dongarra, “Automated Empirical Optimizations of Software and the ATLAS Project (LAPACK Working Note 147),” University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-448, September 2000.

Whaley, C., and J. Dongarra, “Automatically Tuned Linear Algebra Software,” 1998 ACM/IEEE conference on Supercomputing (SC '98), Orlando, FL, IEEE Computer Society, November 1998.

Whaley, C., A. Petitet, and J. Dongarra, “Automated Empirical Optimization of Software and the ATLAS Project,” Parallel Computing, vol. 27, no. 1-2, pp. 3-25, January 2001.

White, J. B., and J. Dongarra, “High-Performance High-Resolution Semi-Lagrangian Tracer Transport on a Sphere,” Journal of Computational Physics, vol. 230, issue 17, pp. 6778-6799, July 2011.

White, J. B., and J. Dongarra, “Overlapping Computation and Communication for Advection on a Hybrid Parallel Computer,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.

Whitlock, M., N. Morales, G. Bosilca, A. Bouteiller, B. Nicolae, K. Teranishi, E. Giem, and V. Sarkar, “Integrating process, control-flow, and data resiliency layers using a hybrid Fenix/Kokkos approach,” 2022 IEEE International Conference on Cluster Computing (CLUSTER 2022), Heidelberg, Germany, September 2022.

Winkler, F., “Redesigning PAPI's High-Level API,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-03: University of Tennessee, February 2020.

Wolf, F., F. Freitag, B. Mohr, S. Moore, and B. Wylie, “Large Event Traces in Parallel Performance Analysis,” 8th Workshop 'Parallel Systems and Algorithms' (PASA), Lecture Notes in Informatics, no. ICL-UT-06-08, Frankfurt/Main, Germany, Gesellschaft für Informatik, March 2006.

Wolf, F., B. Mohr, J. Dongarra, and S. Moore, “Efficient Pattern Search in Large Traces through Successive Refinement,” Proceedings of Euro-Par 2004, Pisa, Italy, Springer-Verlag, August 2004.

Wolf, F., and B. Mohr, “Automatic performance analysis of hybrid MPI/OpenMP applications,” Journal of Systems Architecture, Special Issue 'Evolutions in parallel distributed and network-based processing', vol. 49(10-11): Elsevier, pp. 421-439, November 2003.
