Publications

K
Kasichayanula, K., D. Terpstra, P. Luszczek, S. Tomov, S. Moore, and G. D. Peterson, “Power Aware Computing on GPUs,” SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012.
Kaya, O., and Y. Robert, “Computing Dense Tensor Decompositions with Optimal Dimension Trees,” Algorithmica, vol. 81, issue 5, pp. 2092–2121, May 2019. DOI: 10.1007/s00453-018-0525-3
Kelleher, Jr., M., “Development of the PICMSS NetSolve Service,” ICL Technical Report, no. ICL-UT-02-04, April 2002.
“Recent Advances in the Message Passing Interface, Lecture Notes in Computer Science (LNCS),” EuroMPI 2010 Proceedings, vol. 6305, Stuttgart, Germany, Springer, September 2010.
Keller, R., G. Bosilca, G. Fagg, M. Resch, and J. Dongarra, “Implementation and Usage of the PERUSE-Interface in Open MPI,” Euro PVM/MPI 2006, Bonn, Germany, September 2006.
Kennedy, K., J. Mellor-Crummey, K. Cooper, L. Torczon, F. Berman, A. Chien, D. Angulo, I. Foster, D. Gannon, L. Johnsson, et al., “Toward a Framework for Preparing and Executing Adaptive Grid Programs,” International Parallel and Distributed Processing Symposium: IPDPS 2002 Workshops, Fort Lauderdale, FL, pp. 0171, April 2002.
Kennedy, K., B. Broom, K. Cooper, J. Dongarra, R. Fowler, D. Gannon, L. Johnsson, J. Mellor-Crummey, and L. Torczon, “Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem-Solving Systems from Annotated Libraries,” Journal of Parallel and Distributed Computing, vol. 61, no. 12, pp. 1803-1826, December 2001.
Kurzak, J., M. Gates, I. Yamazaki, A. Charara, A. YarKhan, J. Finney, G. Ragghianti, P. Luszczek, and J. Dongarra, “Linear Systems Performance Report,” SLATE Working Notes, no. 8, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.
Kurzak, J., A. Buttari, and J. Dongarra, “Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization,” IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 9, pp. 1-11, January 2008.
Kurzak, J., A. Buttari, and J. Dongarra, “Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization,” UT Computer Science Technical Report (Also LAPACK Working Note 184), no. UT-CS-07-596, January 2007.
Kurzak, J., and J. Dongarra, “Implementation of Mixed Precision in Solving Systems of Linear Equations on the Cell Processor,” Concurrency and Computation: Practice and Experience, vol. 19, no. 10, pp. 1371-1385, July 2007.
Kurzak, J., Y. Tsai, M. Gates, A. Abdelfattah, and J. Dongarra, “Massively Parallel Automated Software Tuning,” 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan, ACM Press, August 2019.
Kurzak, J., P. Luszczek, M. Faverge, and J. Dongarra, “Programming the LU Factorization for a Multicore System with Accelerators,” Proceedings of VECPAR’12, Kobe, Japan, April 2012.
Kurzak, J., and J. Dongarra, “Implementation of the Mixed-Precision High Performance LINPACK Benchmark on the CELL Processor,” University of Tennessee Computer Science Tech Report, no. UT-CS-06-580, LAPACK Working Note #177, September 2006.
Kurzak, J., A. Buttari, P. Luszczek, and J. Dongarra, “The PlayStation 3 for High Performance Scientific Computing,” Computing in Science and Engineering, pp. 80-83, January 2008.
Kurzak, J., H. Ltaief, J. Dongarra, and R. M. Badia, “Scheduling Dense Linear Algebra Operations on Multicore Processors,” Concurrency and Computation: Practice and Experience, vol. 22, no. 1, pp. 15-44, January 2010.
Kurzak, J., S. Tomov, and J. Dongarra, “Autotuning GEMMs for Fermi,” University of Tennessee Computer Science Technical Report, UT-CS-11-671, (also LAWN 245), April 2011.
Kurzak, J., P. Luszczek, and J. Dongarra, “LU Factorization with Partial Pivoting for a Multicore System with Accelerators,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, issue 8, pp. 1613-1621, August 2013. DOI: http://doi.ieeecomputersociety.org/10.1109/TPDS.2012.242
Kurzak, J., P. Wu, M. Gates, I. Yamazaki, P. Luszczek, G. Ragghianti, and J. Dongarra, “Designing SLATE: Software for Linear Algebra Targeting Exascale,” SLATE Working Notes, no. 3, ICL-UT-17-06: Innovative Computing Laboratory, University of Tennessee, October 2017.
Kurzak, J., and J. Dongarra, “QR Factorization for the CELL Processor,” Scientific Programming, vol. 17, no. 1-2, pp. 31-42, 2010.
Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, “SLATE Working Note 12: Implementing Matrix Inversions,” SLATE Working Notes, no. 12, ICL-UT-19-04: Innovative Computing Laboratory, University of Tennessee, June 2019.
Kurzak, J., H. Ltaief, J. Dongarra, and R. M. Badia, “Scheduling Linear Algebra Operations on Multicore Processors,” Concurrency Practice and Experience (to appear), 2009.
Kurzak, J., and J. Dongarra, “Implementing Linear Algebra Routines on Multi-Core Processors with Pipelining and a Look Ahead,” University of Tennessee Computer Science Tech Report, UT-CS-06-581, LAPACK Working Note #178, January 2006.
Kurzak, J., P. Luszczek, M. Gates, I. Yamazaki, and J. Dongarra, “Virtual Systolic Array for QR Decomposition,” 15th Workshop on Advances in Parallel and Distributed Computational Models, IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), Boston, MA, IEEE, May 2013. DOI: 10.1109/IPDPS.2013.119
Kurzak, J., P. Luszczek, I. Yamazaki, Y. Robert, and J. Dongarra, “Design and Implementation of the PULSAR Programming System for Large Scale Computing,” Supercomputing Frontiers and Innovations, vol. 4, issue 1, 2017. DOI: 10.14529/jsfi170101
Kurzak, J., A. Buttari, P. Luszczek, and J. Dongarra, “The PlayStation 3 for High Performance Scientific Computing,” University of Tennessee Computer Science Technical Report, no. UT-CS-08-608, January 2008.
Kurzak, J., H. Ltaief, J. Dongarra, and R. M. Badia, “Scheduling Linear Algebra Operations on Multicore Processors,” University of Tennessee Computer Science Department Technical Report, UT-CS-09-636 (Also LAPACK Working Note 213), 2009.
Kurzak, J., M. Gates, A. YarKhan, I. Yamazaki, P. Luszczek, J. Finney, and J. Dongarra, “Parallel Norms Performance Report,” SLATE Working Notes, no. 6, ICL-UT-18-06: Innovative Computing Laboratory, University of Tennessee, June 2018.
Kurzak, J., and J. Dongarra, “QR Factorization for the CELL Processor,” University of Tennessee Computer Science Technical Report, UT-CS-08-616 (also LAPACK Working Note 201), May 2008.
Kurzak, J., H. Ltaief, J. Dongarra, and R. M. Badia, “Dependency-Driven Scheduling of Dense Matrix Factorizations on Shared-Memory Systems,” PPAM 2009, Poland, September 2009.
Kurzak, J., M. Gates, A. YarKhan, I. Yamazaki, P. Wu, P. Luszczek, J. Finney, and J. Dongarra, “Parallel BLAS Performance Report,” SLATE Working Notes, no. 5, ICL-UT-18-01: University of Tennessee, April 2018.
Kurzak, J., P. Luszczek, S. Tomov, and J. Dongarra, “Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler Architecture,” LAWN 267, 2012.
Kurzak, J., P. Luszczek, A. YarKhan, M. Faverge, J. Langou, H. Bouwmeester, and J. Dongarra, “Multithreading in the PLASMA Library,” Multi and Many-Core Processing: Architecture, Programming, Algorithms, & Applications: Taylor & Francis, 2013.
Kurzak, J., and J. Dongarra, “QR Factorization for the CELL Processor,” Scientific Programming (to appear), 2009.
Kurzak, J., H. Anzt, M. Gates, and J. Dongarra, “Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs,” IEEE Transactions on Parallel and Distributed Systems, no. 1045-9219, November 2015.
Kurzak, J., and J. Dongarra, “Fully Dynamic Scheduler for Numerical Computing on Multicore Processors,” University of Tennessee Computer Science Department Technical Report, UT-CS-09-643 (Also LAPACK Working Note 220), 2009.
Kurzak, J., R. Nath, P. Du, and J. Dongarra, “An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs,” Applied Parallel and Scientific Computing, vol. 7133, pp. 248-257, 2012.
L
Lacoste, X., M. Faverge, P. Ramet, S. Thibault, and G. Bosilca, “Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes,” 23rd International Heterogeneity in Computing Workshop, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
Langou, J., B. Hoffman, and B. King, “How LAPACK library enables Microsoft Visual Studio support with CMake and LAPACKE,” University of Tennessee Computer Science Technical Report (also LAWN 270), no. UT-CS-12-698, July 2012.
Langou, J., J. Langou, P. Luszczek, J. Kurzak, A. Buttari, and J. Dongarra, “Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy,” University of Tennessee Computer Science Tech Report, no. UT-CS-06-574, LAPACK Working Note #175, April 2006.
Langou, J., Z. Chen, G. Bosilca, and J. Dongarra, “Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,” SIAM SISC (to appear), May 2007.
Langou, J., and J. Dongarra, “The Problem with the Linpack Benchmark Matrix Generator,” International Journal of High Performance Computing Applications, vol. 23, no. 1, pp. 5-14, 2009.
,” 15th European PVM/MPI Users' Group Meeting, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, vol. 5205, Dublin Ireland, Springer Berlin, January 2008.
Le Fèvre, V., T. Herault, Y. Robert, A. Bouteiller, A. Hori, G. Bosilca, and J. Dongarra, “Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,” Parallel Computing, vol. 85, pp. 1–12, July 2019. DOI: 10.1016/j.parco.2019.02.002
Le Fèvre, V., G. Bosilca, A. Bouteiller, T. Herault, A. Hori, Y. Robert, and J. Dongarra, “Do moldable applications perform better on failure-prone HPC platforms?,” 11th Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids, Turin, Italy, Springer Verlag, August 2018.
Lee, DW., and J. Dongarra, “VisPerf: Monitoring Tool for Grid Computing,” Lecture Notes in Computer Science, vol. 2659: Springer Verlag, Heidelberg, pp. 233-243, 2003.
Lemarinier, P., G. Bosilca, C. Coti, T. Herault, and J. Dongarra, “Constructing Resilient Communication Infrastructure for Runtime Environments,” ParCo 2009, Lyon, France, September 2009.
Li, Y., J. Dongarra, and S. Tomov, “A Note on Auto-tuning GEMM for GPUs,” 9th International Conference on Computational Science (ICCS 2009), no. 5544-5545, Baton Rouge, LA, pp. 884-892, May 2009. DOI: 10.1007/978-3-642-01970-8_89
Li, Y., J. Dongarra, K. Seymour, and A. YarKhan, “Request Sequencing: Enabling Workflow for Efficient Problem Solving in GridSolve,” International Conference on Grid and Cooperative Computing (GCC 2008) (submitted), Shenzhen, China, October 2008.
Li, Y., and J. Dongarra, “Request Sequencing: Enabling Workflow for Efficient Parallel Problem Solving in GridSolve,” ICL Technical Report, no. ICL-UT-08-01, April 2008.
