piotr luszczek :: publications
2015

Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs,
Théo Mary,
Ichitaro Yamazaki,
Jakub Kurzak,
Piotr Luszczek,
Stanimire Tomov,
Jack Dongarra,
Proceedings of SC15, Austin, TX, USA, November 15-20, 2015.
[paper, slides]
2014

Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment,
Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack Dongarra,
Proceedings of 28th IEEE International Parallel & Distributed Processing Symposium, May 19-23, 2014, Arizona Grand Resort, Phoenix, Arizona, USA
[PDF]
2013

An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,
Azzam Haidar, Piotr Luszczek, Jakub Kurzak,
Proceedings of SC13, November 17-21, 2013, Denver, CO, USA; DOI: 10.1145/2503210.2503292
University of Tennessee Technical Report
[PDF]

BlackjackBench: Portable Hardware Characterization with Automated Results' Analysis,
Anthony Danalis, Piotr Luszczek, Gabriel Marin, Jeffrey S. Vetter, Jack Dongarra, The Computer Journal, first published online
June 28, 2013. DOI: 10.1093/comjnl/bxt057
[PDF]
2012

Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore
Architecture,
Jack Dongarra, Hatem Ltaief, Piotr Luszczek, and Vince M. Weaver,
submitted to The 2nd International Conference on Cloud and
Green Computing (CGC 2012), November 1-3, 2012, Xiangtan, Hunan, China.
[PDF]

BlackjackBench: portable hardware characterization
Danalis, Anthony and Luszczek, Piotr and Marin, Gabriel and Vetter, Jeffrey S. and Dongarra, Jack,
SIGMETRICS Perform. Eval. Rev. 40(2), pp. 74-79, 2012. ISSN 0163-5999. DOI: 10.1145/2381056.2381074
[PDF]

Anatomy of a Globally Recursive Embedded LINPACK Benchmark
Piotr Luszczek and Jack Dongarra, accepted in 2012 IEEE
High Performance Extreme Computing Conference (HPEC 2012), Westin Hotel, Waltham, Massachusetts, September 10-12, 2012.
IEEE Catalog Number: CFP12HPECDR, ISBN: 9781467315746.
[PDF, slides]

From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming
Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory Peterson, Jack Dongarra
Parallel Computing, Vol. 38 No. 8, pages 391-407, August 2012.
DOI: 10.1016/j.parco.2011.10.002, ISSN: 0167-8191
[PDF, publisher page]

Programming the LU Factorization for a Multicore System with Accelerators
Jakub Kurzak, Piotr Luszczek, Mathieu Faverge, and Jack Dongarra, In Proceedings of
VECPAR 2012,
Kobe, Japan, July 17-20, 2012. Springer LNCS ????
[PDF]

Recent Advances in Dense Matrix Computations for Two-Sided Reduction Algorithms,
Hatem Ltaief, Azzam Haidar, Piotr Luszczek, and Jack Dongarra,
Presentation at 7th International Workshop on Parallel Matrix Algorithms and Applications (PMAA 2012),
Birkbeck University of London, UK, 28-30 June 2012.
[PDF]

High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors,
Peng Du, Piotr Luszczek, and Jack Dongarra,
International Conference on Computational Science, ICCS 2012, Omaha NE.
[PDF]
2011

Exploiting Fine-Grain Parallelism in Recursive LU Factorization
Jack Dongarra, Mathieu Faverge, Hatem Ltaief, and Piotr Luszczek
In Proceedings of ParCo 2011, 30 August - 2 September 2011, Ghent, Belgium
[extended abstract PDF]

Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency
Hatem Ltaief, Piotr Luszczek, Jack Dongarra
In Proceedings International Conference on Energy-Aware High Performance Computing, September 07-09, 2011,
Hamburg, Germany (University of Tennessee Technical Report
ut-cs-11-674,
LAWN 251)
[PDF]

High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures
Hatem Ltaief, Piotr Luszczek, and Jack Dongarra
UTK CS Technical Report ut-cs-11-673. Submitted to ACM TOMS.
[PDF]

Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures
Piotr Luszczek, Hatem Ltaief, and Jack Dongarra
Proceedings of IPDPS 2011, Anchorage, Alaska, May 16-20, 2011.
[PDF]
2010

Mixed-Tool Performance Analysis on Hybrid Multicore Architectures
Peng Du, Piotr Luszczek, Stanimire Tomov, Jack Dongarra
ICPPW '10: Proceedings of the 2010 39th International Conference on Parallel Processing Workshops
Publisher: IEEE Computer Society,
San Diego, California,
September 13-16, 2010
[PDF]

From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming
Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory Peterson, Jack Dongarra
Technical Report UT-CS-10-656, University of Tennessee, Computer Science Department, LAPACK Working Note 228
[PDF]

Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modelling
Jack Dongarra and Piotr Luszczek, Proceedings of SC10, New Orleans, Louisiana, USA, November 13-19, 2010; Also: Technical Report UT-CS-10-661, University of Tennessee, Computer Science Department
[PDF, PDF poster]

Analysis of Various Scalar, Vector, and Parallel Implementations of RandomAccess
Piotr Luszczek and Jack Dongarra, Innovative Computing Laboratory (ICL) Technical Report, ICL-UT-10-03, June, 2010.
[PDF]

Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project
George Bosilca, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Azzam Haidar, Thomas Herault, Jakub Kurzak,
Julien Langou, Pierre Lemarinier, Hatem Ltaief, Piotr Luszczek, Asim YarKhan, Jack Dongarra,
ICL Technical Report, ICL-UT-10-02, April 2010.
[PDF]
2009
 Improving Performance of Sparse Numerical Linear Algebra Computations: Algorithmic optimization techniques for sparse direct
and sparse iterative numerical solvers of large linear equations

Piotr Luszczek

LAP Lambert Academic Publishing,
ISBN-10: 3838334698,
ISBN-13: 978-3838334691,
(January 11, 2010)

[available at Amazon]
 Parallel Programming in MATLAB

Piotr Luszczek

The International Journal of High Performance Computing Applications,
Volume 23 Issue 3, pages 277-283, July 2009

[PDF]
2007
 High Performance Development for High End Computing with Python
Language Wrapper (PLW)

Piotr Luszczek and Jack Dongarra

The International Journal of High Performance Computing Applications,
Volume 21, No. 2, Summer 2007
 The Impact of Multicore on Math Software

Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Julien Langou,
Piotr Luszczek, Stan Tomov
Proceedings of PARA'06, Umeå, Sweden, June 18-21, 2006.
 Using Mixed Precision for Sparse Matrix Computations to Enhance
the Performance while Achieving 64-bit Accuracy

Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Piotr Luszczek, Stan Tomov
Submitted to ACM TOMS, 2007.
2006
 Design and Implementation of the HPCC Benchmark Suite

Piotr Luszczek, Jack Dongarra, and Jeremy Kepner
 CT Watch Quarterly, 2(4A), November 2006

[BibTeX]
 Exploiting the Performance of 32 bit Floating Point Arithmetic
in Obtaining 64 bit Accuracy
 Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo
Buttari and Jack Dongarra
 UTK CS Tech Report, CS-06-574, LAPACK Working Note #175, April 2006.

Accepted to SC06.

[PDF]
 High Performance Development for High End Computing with Python
Language Wrapper (PLW)

Piotr Luszczek and Jack Dongarra

Accepted to IJHPCA.

[PDF]
 Self adapting numerical software (SANS) effort

Jack Dongarra, George Bosilca, Zizhong Chen, Victor Eijkhout, Graham E. Fagg,
Erika Fuentes, Julien Langou, Piotr Luszczek, Jelena Pjesivac-Grbovic, Keith Seymour, Haihang You, Sathish S. Vadhiyar
 IBM Journal of Research and Development 50, pp. 223-238, March, 2006.
 [PDF
 BibTeX ]
 Overview of the HPCC Challenge Benchmark Suite
 Piotr Luszczek and Jack Dongarra
 SPEC Benchmark Workshop 2006, January 23, 2006
 Thompson Conference Center, University of Texas, Austin
 [HTML]
2005
 Introduction to the HPC Challenge Benchmark Suite
 Piotr Luszczek, Jack J. Dongarra, David Koester, Rolf Rabenseifner,
Bob Lucas, Jeremy Kepner, John McCalpin, David Bailey,
and Daisuke Takahashi

[PDF]
 HPC Challenge v1.x Benchmark Suite
 David Koester and Piotr Luszczek
 Technical Program Tutorial, November 13, 2006
 Washington State Convention and Trade Center, Seattle, WA
 [HTML]
 Introduction to the HPC Challenge Benchmark Suite

Jack Dongarra, Piotr Luszczek

ICL Technical Report, ICL-UT-05-01

CS Dept. Tech Report UT-CS-05-544, 2005

[PDF]
2004
 LAPACK for Clusters Project: An Example of Self Adapting Numerical
Software
 Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth
Roche
 Hawaii International Conference on System Sciences
HICSS-37,
Hilton Waikoloa Village, Big Island, Hawaii, January 5-8, 2004.
 [ PDF ]
 Cray X1 Evaluation Status Report
 Agarwal, P., Alexander, R., Apra, E., Balay, S., Bland, A.,
Colgan, J., D'Azevedo, E., Dongarra, J., Dunigan, T., Fahey, M., Fahey, R.,
Geist, A., Gordon, M., Harrison, R., Kaushik, D., Krishnakumar, M.,
Luszczek, P., Mezzacapa, T., Nichols, J., Nieplocha, J., Oliker, L.,
Packwood, T., Pindzola, M., Schulthess, T., Vetter, J., White, J.,
Windus, T., Worley, P., Zacharia, T.
 Oak Ridge National Laboratory Report, ORNL/TM-2004/13, January 2004.

[PDF]
 Design of Interactive Environment for Numerically Intensive
Parallel Linear Algebra Calculations
 Piotr Luszczek and Jack Dongarra
 Proceedings of the 4th International Conference on Computational Science,
Krakow, Poland, June 6-9, 2004
 Lecture Notes in Computer Science 3039, Springer-Verlag
Berlin-Heidelberg, 2004, pp. 270-277, ISBN 3-540-22129-8
 [PDF, BibTeX]
2003
 Self-Adapting Software for Numerical Linear Algebra and LAPACK for
Clusters
 Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche
 Parallel Computing Journal, November-December 2003, ISSN 0167-8191
 LAPACK Working Note: 160
 University of Tennessee CS Technical Report Number: ut-cs-03-499
 [ PDF, ScienceDirect, BibTeX ]
 The LINPACK Benchmark: Past, Present, and Future
 Jack J. Dongarra and Piotr Luszczek and Antoine Petitet
 Concurrency and Computation: Practice and Experience 15, pp. 1-18, 2003
 [ PDF, BibTeX ]
 A Runtime Support for Large-Scale Irregular Computing on Clusters
and Grids
 Brezany, P., Bubak, M., Luszczek, P., Malawski, M., Zajac, K.
 Annual Review of Scalable Computing, in: Kwong, Y. C. (Eds.), Series on
Scalable Computing, vol. 5, Singapore University Press and World Scientific,
2003, pp. 30-64
 Self Adaptive Software for Numerical Linear Algebra Library
Routines on Clusters
 Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche
 presented at Workshop on Parallel Linear Algebra,
WoPLA'03
 A Framework for Check-Pointed Fault-Tolerant Out-of-Core Linear
Algebra
 Ed D'Azevedo and Piotr Luszczek
 presented at SIAM Conference on Computational Science and Engineering
(CSE03)
February 10-13, 2003
Hyatt Regency Islandia Hotel and Marina, San Diego, CA
 [ PDF ]
Performance Improvements of Common Sparse Numerical Linear Algebra
Computations
Piotr Luszczek
Ph.D. Thesis, May 2003, University of Tennessee Knoxville
[ HTML ]
2002-1999
 Marian Bubak, Dawid Kurzyniec, Piotr Luszczek, Vaidy S. Sunderam,
"Creating Java to Native Code Interfaces with Janet",
Scientific Programming 9(1): 39-50 (2001)
 H.Y. Lin and P. Luszczek, "Tuning LINPACK N*N for PA-RISC Platforms",
Presented at High Performance Computing on Hewlett-Packard Systems
Conference, Bremen, Germany, October 7-9, 2001.
(PDF, gzipped
PostScript,
BibTeX, more)
 M. Bubak, D. Kurzyniec, and P. Luszczek, "Convenient use of legacy software
in Java with Janet package", Future Generation Computer Systems 17(8),
pp. 987-997, June 2001.
(PDF,
gzipped PostScript,
BibTeX)
Abstract
As Java becomes an appropriate environment for high performance computing, the
interest arises in combining it with existing code written in other
languages. Portable Java interfaces to native code may be developed using the
Java Native Interface (JNI). However, as a low-level API it is rather
inconvenient to be used directly, thus the higher level tools and techniques
are desired. We present Janet - a highly expressive Java language extension
and preprocessing tool that enables convenient integration of native code with
Java programs.
 J. Dongarra, V. Eijkhout, and P. Luszczek, "Recursive approach in sparse
matrix LU factorization", Scientific Programming 9(1), pp. 51-60, 2001. (Also:
In Proceedings of the 1st SGI Users Conference, pp. 409-418, Cracow, Poland,
October 2000; ACC Cyfronet UMM.)
(PDF,
gzipped PostScript,
BibTeX)
Abstract
This paper describes a recursive method for the LU factorization of sparse
matrices. The recursive formulation of common linear algebra codes has been
proven very successful in dense matrix computations. An extension of the
recursive technique for sparse matrices is presented. Performance results
given here
show that the recursive approach may perform comparable to leading software
packages for sparse matrix factorization in terms of execution time, memory
usage, and error estimates of the solution.
 M. Bubak, P. Luszczek, "Towards Portable Runtime Support for Irregular and
Out-of-Core Computations", in: Dongarra, J., Luque, E., Margalef, T., (Eds.),
"Recent Advances in Parallel Virtual Machine and Message Passing Interface"
Proceedings of 6th European PVM/MPI Users' Group Meeting, Barcelona, Spain,
September 1999, Lecture Notes in Computer Science 1697, Springer-Verlag
Berlin-Heidelberg, 1999, pp. 59-66.
(gzipped PostScript,
BibTeX)
Abstract
In this paper, we present the lip - a run time system which
enables easy and portable parallelization of irregular and out-of-core
computations. Functions for handling irregular data were developed using
the same concept as in the CHAOS library. Out-of-core parallelization is based
on the idea of in-core section, and functions for out-of-core data are
implemented with capabilities provided by MPI-IO. The new library may be used
in C, Fortran and Java programs. Results of performance tests for a generic
irregular out-of-core program on HP S2000 are presented and possible further
extensions are discussed.