%0 Conference Paper
%B 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
%D 2020
%T Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime
%A Yu Pei
%A Qinglei Cao
%A George Bosilca
%A Piotr Luszczek
%A Victor Eijkhout
%A Jack Dongarra
%X Stencil computation or general sparse matrix-vector product (SpMV) are key components in many algorithms like geometric multigrid or Krylov solvers. But their low arithmetic intensity means that memory bandwidth and network latency will be the performance limiting factors. The current architectural trend favors computations over bandwidth, worsening the already unfavorable imbalance. Previous work approached stencil kernel optimization either by improving memory bandwidth usage or by providing a Communication Avoiding (CA) scheme to minimize network latency in repeated sparse vector multiplication by replicating remote work in order to delay communications on the critical path. Focusing on minimizing communication bottleneck in distributed stencil computation, in this study we combine a CA scheme with the computation and communication overlapping that is inherent in a dataflow task-based runtime system such as PaRSEC to demonstrate their combined benefits. We implemented the 2D five point stencil (Jacobi iteration) in PETSc, and over PaRSEC in two flavors, full communications (base-PaRSEC) and CA-PaRSEC which operate directly on a 2D compute grid. Our results running on two clusters, NaCL and Stampede2 indicate that we can achieve 2× speedup over the standard SpMV solution implemented in PETSc, and in certain cases when kernel execution is not dominating the execution time, the CA-PaRSEC version achieved up to 57% and 33% speedup over base-PaRSEC implementation on NaCL and Stampede2 respectively.
%B 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
%I IEEE
%C New Orleans, LA
%8 2020-05
%G eng
%R https://doi.org/10.1109/IPDPSW50202.2020.00127

%0 Generic
%D 2007
%T Numerical Metadata API Reference
%A Victor Eijkhout
%K salsa
%B Innovative Computing Laboratory Technical Report
%8 2007-02
%G eng

%0 Journal Article
%J International Journal of High Performance Computing Applications (submitted)
%D 2006
%T Application of Machine Learning to the Selection of Sparse Linear Solvers
%A Sanjukta Bhowmick
%A Victor Eijkhout
%A Yoav Freund
%A Erika Fuentes
%A David Keyes
%K salsa
%K sans
%B International Journal of High Performance Computing Applications (submitted)
%8 2006-00
%G eng

%0 Journal Article
%J IBM Journal of Research and Development
%D 2006
%T Self Adapting Numerical Software SANS Effort
%A George Bosilca
%A Zizhong Chen
%A Jack Dongarra
%A Victor Eijkhout
%A Graham Fagg
%A Erika Fuentes
%A Julien Langou
%A Piotr Luszczek
%A Jelena Pjesivac-Grbovic
%A Keith Seymour
%A Haihang You
%A Sathish Vadhiyar
%K gco
%B IBM Journal of Research and Development
%V 50
%P 223-238
%8 2006-01
%G eng

%0 Journal Article
%J International Journal of Parallel Programming
%D 2005
%T The Component Structure of a Self-Adapting Numerical Software System
%A Victor Eijkhout
%A Erika Fuentes
%A Thomas Eidson
%A Jack Dongarra
%K salsa
%K sans
%B International Journal of Parallel Programming
%V 33
%8 2005-06
%G eng

%0 Conference Proceedings
%B IPDPS 2004, NGS Workshop (to appear)
%D 2004
%T Improvements in the Efficient Composition of Applications
%A Thomas Eidson
%A Victor Eijkhout
%A Jack Dongarra
%K salsa
%K sans
%B IPDPS 2004, NGS Workshop (to appear)
%C Santa Fe
%8 2004-00
%G eng

%0 Generic
%D 2004
%T Performance Optimization and Modeling of Blocked Sparse Kernels
%A Alfredo Buttari
%A Victor Eijkhout
%A Julien Langou
%A Salvatore Filippone
%K sans
%B ICL Technical Report
%8 2004-00
%G eng

%0 Conference Proceedings
%B IEEE Proceedings (to appear)
%D 2004
%T Self Adapting Linear Algebra Algorithms and Software
%A James Demmel
%A Jack Dongarra
%A Victor Eijkhout
%A Erika Fuentes
%A Antoine Petitet
%A Rich Vuduc
%A Clint Whaley
%A Katherine Yelick
%K salsa
%K sans
%B IEEE Proceedings (to appear)
%8 2004-00
%G eng

%0 Conference Proceedings
%B IPDPS 2003, Workshop on NSF-Next Generation Software
%D 2003
%T Applying Aspect-Oriented Programming Concepts to a Component-based Programming Model
%A Thomas Eidson
%A Jack Dongarra
%A Victor Eijkhout
%K salsa
%K sans
%B IPDPS 2003, Workshop on NSF-Next Generation Software
%C Nice, France
%8 2003-03
%G eng

%0 Generic
%D 2003
%T Finite-choice Algorithm Optimization in Conjugate Gradients (LAPACK Working Note 159)
%A Jack Dongarra
%A Victor Eijkhout
%B University of Tennessee Computer Science Technical Report, UT-CS-03-502
%8 2003-01
%G eng

%0 Generic
%D 2003
%T A Proposed Standard for Matrix Metadata
%A Victor Eijkhout
%A Erika Fuentes
%K salsa
%K sans
%B Innovative Computing Laboratory Technical Report
%C Submitted to ACM TOMS
%8 2003-11
%G eng

%0 Conference Proceedings
%B DOE/NSF Workshop on New Directions in Cyber-Security in Large-Scale Networks: Development Obstacles
%D 2003
%T Scalable, Trustworthy Network Computing Using Untrusted Intermediaries: A Position Paper
%A Micah Beck
%A Jack Dongarra
%A Victor Eijkhout
%A Mike Langston
%A Terry Moore
%A James Plank
%K netsolve
%B DOE/NSF Workshop on New Directions in Cyber-Security in Large-Scale Networks: Development Obstacles
%C National Conference Center - Lansdowne, Virginia
%8 2003-03
%G eng

%0 Journal Article
%J International Journal of High Performance Computing Applications
%D 2003
%T Self Adapting Numerical Algorithm for Next Generation Applications
%A Jack Dongarra
%A Victor Eijkhout
%K lacsi
%K sans
%B International Journal of High Performance Computing Applications
%V 17
%P 125-132
%8 2003-01
%G eng

%0 Journal Article
%J Lecture Notes in Computer Science
%D 2003
%T Self-Adapting Numerical Software and Automatic Tuning of Heuristics
%A Jack Dongarra
%A Victor Eijkhout
%K salsa
%K sans
%B Lecture Notes in Computer Science
%I Springer Verlag
%C Melbourne, Australia
%V 2660
%P 759-770
%8 2003-06
%G eng

%0 Journal Article
%J Scientific Programming (to appear)
%D 2002
%T An Iterative Solver Benchmark
%A Jack Dongarra
%A Victor Eijkhout
%A Henk van der Vorst
%B Scientific Programming (to appear)
%8 2002-00
%G eng

%0 Generic
%D 2002
%T Polynomial Acceleration of Optimised Multi-grid Smoothers; Basic Theory
%A Victor Eijkhout
%B ICL Technical Report
%V 156
%8 2002-01
%G eng

%0 Generic
%D 2002
%T Self-adapting Numerical Software for Next Generation Applications (LAPACK Working Note 157)
%A Jack Dongarra
%A Victor Eijkhout
%K salsa
%K sans
%B ICL Technical Report
%8 2002-00
%G eng

%0 Generic
%D 2001
%T Automatic Determination of Matrix-Blocks
%A Victor Eijkhout
%B LAPACK Working Note 151, University of Tennessee Computer Science Technical Report
%8 2001-01
%G eng

%0 Journal Article
%J Scientific Programming
%D 2001
%T Iterative Solver Benchmark (LAPACK Working Note 152)
%A Jack Dongarra
%A Victor Eijkhout
%A Henk van der Vorst
%B Scientific Programming
%V 9
%P 223-231
%8 2001-00
%G eng

%0 Journal Article
%J Scientific Programming
%D 2001
%T Recursive Approach in Sparse Matrix LU Factorization
%A Jack Dongarra
%A Victor Eijkhout
%A Piotr Luszczek
%B Scientific Programming
%V 9
%P 51-60
%8 2001-00
%G eng

%0 Conference Proceedings
%B Proceedings of 1st SGI Users Conference
%D 2000
%T Recursive approach in sparse matrix LU factorization
%A Jack Dongarra
%A Victor Eijkhout
%A Piotr Luszczek
%B Proceedings of 1st SGI Users Conference
%C Cracow, Poland (ACC Cyfronet UMM, 2000)
%P 409-418
%8 2000-01
%G eng

%0 Conference Proceedings
%B Proceedings of 16th IMACS World Congress 2000 on Scientific Computing, Applications Mathematics and Simulation
%D 2000
%T Seamless Access to Adaptive Solver Algorithms
%A Dorian Arnold
%A Susan Blackford
%A Jack Dongarra
%A Victor Eijkhout
%A Tinghua Xu
%K netsolve
%B Proceedings of 16th IMACS World Congress 2000 on Scientific Computing, Applications Mathematics and Simulation
%C Lausanne, Switzerland
%8 2000-08
%G eng

%0 Generic
%D 1999
%T On the Existence Problem of Incomplete Factorisation Methods
%A Victor Eijkhout
%B University of Tennessee Computer Science Department Technical Report
%8 1999-12
%G eng

%0 Journal Article
%J Encyclopedia of Computer Science and Technology, eds. Kent, A., Williams, J.
%D 1999
%T Numerical Linear Algebra
%A Jack Dongarra
%A Victor Eijkhout
%E Marcel Dekker
%B Encyclopedia of Computer Science and Technology, eds. Kent, A., Williams, J.
%V 41
%P 207-233
%8 1999-08
%G eng

%0 Journal Article
%J Journal of Computational and Applied Mathematics
%D 1999
%T Numerical Linear Algebra Algorithms and Software
%A Jack Dongarra
%A Victor Eijkhout
%B Journal of Computational and Applied Mathematics
%V 123
%P 489-514
%8 1999-10
%G eng

%0 Generic
%D 1999
%T The 'Weighted Modification' Incomplete Factorisation Method
%A Victor Eijkhout
%B University of Tennessee Computer Science Department Technical Report
%8 1999-12
%G eng