Publications

164 results
Filter: Author is Piotr Luszczek
H
Du, P., P. Luszczek, and J. Dongarra, “High Performance Dense Linear System Solver with Soft Error Resilience,” IEEE Cluster 2011, Austin, TX, September 2011.
Du, P., P. Luszczek, and J. Dongarra, “High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors,” ICCS 2012, Omaha, NE, June 2012.
Dongarra, J., M. A. Heroux, and P. Luszczek, “High Performance Conjugate Gradient Benchmark: A new Metric for Ranking High Performance Computing Systems,” International Journal of High Performance Computing Applications, vol. 30, issue 1, pp. 3-10, February 2016. DOI: 10.1177/1094342015593158
Ltaief, H., P. Luszczek, and J. Dongarra, “High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-673 (also LAWN 247), May 2011.
Ltaief, H., P. Luszczek, and J. Dongarra, “High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures,” ACM Transactions on Mathematical Software (TOMS), vol. 39, issue 3, no. 16, 2013. DOI: 10.1145/2450153.2450154
Newburn, C. J., G. Bansal, M. Wood, L. Crivelli, J. Planas, A. Duran, P. Souza, L. Borges, P. Luszczek, S. Tomov, et al., “Heterogeneous Streaming,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2016, Chicago, IL, IEEE, May 2016.
Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, “Heterogeneous Acceleration for Linear Algebra in Multi-Coprocessor Environments,” VECPAR 2014, Eugene, OR, June 2014.
Jia, Y., P. Luszczek, and J. Dongarra, “Hessenberg Reduction with Transient Error Resilience on GPU-Based Hybrid Architectures,” 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Chicago, IL, IEEE, May 2016.
G
Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” University of Tennessee Computer Science Technical Report UT-CS-11-690 (also LAWN 260), December 2011.
Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” EuroPar 2012 (also LAWN 260), Rhodes Island, Greece, August 2012.
F
Du, P., R. Weber, P. Luszczek, S. Tomov, G. D. Peterson, and J. Dongarra, “From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming,” Parallel Computing, vol. 38, no. 8, pp. 391-407, August 2012.
Haidar, A., T. Dong, S. Tomov, P. Luszczek, and J. Dongarra, “Framework for Batched and GPU-resident Factorization Algorithms Applied to Block Householder Transformations,” ISC High Performance, Frankfurt, Germany, Springer, July 2015.
Haidar, A., A. YarKhan, C. Cao, P. Luszczek, S. Tomov, and J. Dongarra, “Flexible Linear Algebra Development and Scheduling with Cholesky Factorization,” 17th IEEE International Conference on High Performance Computing and Communications, Newark, NJ, August 2015.
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemarinier, H. Ltaief, et al., “Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1432-1441, May 2011.
E
Langou, J., J. Langou, P. Luszczek, J. Kurzak, A. Buttari, and J. Dongarra, “Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy,” University of Tennessee Computer Science Tech Report, no. UT-CS-06-574, LAPACK Working Note #175, April 2006.
Buttari, A., J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, “Exploiting Mixed Precision Floating Point Hardware in Scientific Computations,” in High Performance Computing and Grids in Action (to appear), Amsterdam, IOS Press, 2007.
Buttari, A., J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, “Exploiting Mixed Precision Floating Point Hardware in Scientific Computations,” in High Performance Computing and Grids in Action, Amsterdam, IOS Press, January 2008.
Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “Exploiting Fine-Grain Parallelism in Recursive LU Factorization,” Proceedings of PARCO'11, no. ICL-UT-11-04, Gent, Belgium, April 2011.
Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in autotuning matrix multiplication for energy minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, December 2015. DOI: 10.1002/cpe.3516
Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, October 12, 2015. DOI: 10.1002/cpe.3516
Luszczek, P., E. Meek, S. Moore, D. Terpstra, V. M. Weaver, and J. Dongarra, “Evaluation of the HPC Challenge Benchmarks in Virtualized Environments,” 6th Workshop on Virtualization in High-Performance Cloud Computing, Bordeaux, France, August 2011.
Ltaief, H., P. Luszczek, and J. Dongarra, “Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures using Tree Reduction,” Lecture Notes in Computer Science, vol. 7203, pp. 661-670, September 2012.
Dongarra, J., H. Ltaief, P. Luszczek, and V. M. Weaver, “Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture,” The 2nd International Conference on Cloud and Green Computing (submitted), Xiangtan, Hunan, China, November 2012.
Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, “Efficient Eigensolver Algorithms on Accelerator Based Architectures,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.
D
Zaitsev, D., and P. Luszczek, “Docker Container based PaaS Cloud Computing Comprehensive Benchmarks using LAPACK,” Computer Modeling and Intelligent Systems CMIS-2020, Zaporizhzhia, March 2020.
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemarinier, H. Ltaief, et al., “Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project,” Innovative Computing Laboratory Technical Report, no. ICL-UT-10-02, 2010.
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemarinier, H. Ltaief, et al., “Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA,” University of Tennessee Computer Science Technical Report, UT-CS-10-660, September 2010.
Kurzak, J., P. Wu, M. Gates, I. Yamazaki, P. Luszczek, G. Ragghianti, and J. Dongarra, “Designing SLATE: Software for Linear Algebra Targeting Exascale,” SLATE Working Notes, no. 3, ICL-UT-17-06: Innovative Computing Laboratory, University of Tennessee, October 2017.
Luszczek, P., and J. Dongarra, “Design of an Interactive Environment for Numerically Intensive Parallel Linear Algebra Calculations,” International Conference on Computational Science, Poland, Springer Verlag, June 2004. DOI: 10.1007/978-3-540-25944-2_35
Kurzak, J., P. Luszczek, I. Yamazaki, Y. Robert, and J. Dongarra, “Design and Implementation of the PULSAR Programming System for Large Scale Computing,” Supercomputing Frontiers and Innovations, vol. 4, issue 1, 2017. DOI: 10.14529/jsfi170101
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, “Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime,” Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Luszczek, and J. Dongarra, “Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach,” Scalable Computing and Communications: Theory and Practice: John Wiley & Sons, pp. 699-735, March 2013.
Dongarra, J., J. Kurzak, P. Luszczek, and S. Tomov, “Dense Linear Algebra on Accelerated Multicore Hardware,” High Performance Scientific Computing: Algorithms and Applications, London, UK, Springer-Verlag, 2012.
Dongarra, J., R. Graybill, W. Harrod, R. Lucas, E. Lusk, P. Luszczek, J. McMahon, A. Snavely, J. Vetter, K. Yelick, et al., “DARPA's HPCS Program: History, Models, Tools, Languages,” in Advances in Computers, vol. 72: Elsevier, January 2008.
C
Agarwal, P., R. A. Alexander, E. Apra, S. Balay, A. S. Bland, J. Colgan, E. D'Azevedo, J. Dongarra, T. Dunigan, M. Fahey, et al., “Cray X1 Evaluation Status Report,” Oak Ridge National Laboratory Report, vol. /-2004/13, January 2004.
Jia, Y., P. Luszczek, G. Bosilca, and J. Dongarra, “CPU-GPU Hybrid Bidiagonal Reduction With Soft Error Resilience,” ScalA '13 Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Montpellier, France, November 2013.
Haidar, A., H. Ltaief, P. Luszczek, and J. Dongarra, “A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction,” IPDPS 2012, Shanghai, China, May 2012.
Yamazaki, I., M. Hoemmen, P. Luszczek, and J. Dongarra, “Comparing performance of s-step and pipelined GMRES on distributed-memory multicore CPUs,” Pittsburgh, Pennsylvania, SIAM Annual Meeting, July 2017.
Pei, Y., Q. Cao, G. Bosilca, P. Luszczek, V. Eijkhout, and J. Dongarra, “Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime,” 21st IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2020), New Orleans, LA, IEEE, May 2020.
Antoniu, G., A. Costan, O. Marcu, M. S. Pérez, N. Stojanovic, R. M. Badia, M. Vázquez, S. Girona, M. Beck, T. Moore, et al., “A Collection of White Papers from the BDEC2 Workshop in Poznan, Poland,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-10: University of Tennessee, Knoxville, May 2019.
Cao, C., J. Dongarra, P. Du, M. Gates, P. Luszczek, and S. Tomov, “clMAGMA: High Performance Dense Linear Algebra with OpenCL,” International Workshop on OpenCL, Bristol University, England, May 2014.
Cao, C., J. Dongarra, P. Du, M. Gates, P. Luszczek, and S. Tomov, “clMAGMA: High Performance Dense Linear Algebra with OpenCL,” University of Tennessee Technical Report (LAWN 275), no. UT-CS-13-706: University of Tennessee, March 2013.
YarKhan, A., A. Haidar, C. Cao, P. Luszczek, S. Tomov, and J. Dongarra, “Cholesky Across Accelerators,” 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Elizabeth, NJ, IEEE, August 2015.
Luszczek, P., J. Kurzak, and J. Dongarra, “Changes in Dense Linear Algebra Kernels - Decades Long Perspective,” in Solving the Schrodinger Equation: Has everything been tried? (to appear): Imperial College Press, 2011.
Fayad, D., J. Kurzak, P. Luszczek, P. Wu, and J. Dongarra, “The Case for Directive Programming for Accelerator Autotuner Optimization,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-07: University of Tennessee, October 2017.
Gates, M., P. Luszczek, A. Abdelfattah, J. Kurzak, J. Dongarra, K. Arturov, C. Cecka, and C. Freitag, “C++ API for BLAS and LAPACK,” SLATE Working Notes, no. 2, ICL-UT-17-03: Innovative Computing Laboratory, University of Tennessee, June 2017.
Abdelfattah, A., K. Arturov, C. Cecka, J. Dongarra, C. Freitag, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., “C++ API for Batch BLAS,” SLATE Working Notes, no. 4, ICL-UT-17-12: University of Tennessee, December 2017.
