%0 Journal Article %J The International Journal of High Performance Computing Applications %D 2018 %T Accelerating NWChem Coupled Cluster through dataflow-based Execution %A Heike Jagode %A Anthony Danalis %A Jack Dongarra %K CCSD %K dag %K dataflow %K NWChem %K parsec %K ptg %K tasks %X Numerical techniques used for describing many-body systems, such as the Coupled Cluster methods (CC) of the quantum chemistry package NWChem, are of extreme interest to the computational chemistry community in fields such as catalytic reactions, solar energy, and bio-mass conversion. In spite of their importance, many of these computationally intensive algorithms have traditionally been thought of in a fairly linear fashion, or are parallelized in coarse chunks. In this paper, we present our effort of converting the NWChem’s CC code into a dataflow-based form that is capable of utilizing the task scheduling system PaRSEC (Parallel Runtime Scheduling and Execution Controller): a software package designed to enable high-performance computing at scale. We discuss the modularity of our approach and explain how the PaRSEC-enabled dataflow version of the subroutines seamlessly integrate into the NWChem codebase. Furthermore, we argue how the CC algorithms can be easily decomposed into finer-grained tasks (compared with the original version of NWChem); and how data distribution and load balancing are decoupled and can be tuned independently. We demonstrate performance acceleration by more than a factor of two in the execution of the entire CC component of NWChem, concluding that the utilization of dataflow-based execution for CC methods enables more efficient and scalable computation.
%B The International Journal of High Performance Computing Applications %V 32 %P 540--551 %8 2018-07 %G eng %U http://journals.sagepub.com/doi/10.1177/1094342016672543 %N 4 %9 Journal Article %& 540 %R 10.1177/1094342016672543
%0 Journal Article %J Concurrency and Computation: Practice and Experience: Special Issue on Parallel and Distributed Algorithms %D 2018 %T Evaluation of Dataflow Programming Models for Electronic Structure Theory %A Heike Jagode %A Anthony Danalis %A Reazul Hoque %A Mathieu Faverge %A Jack Dongarra %K CCSD %K coupled cluster methods %K dataflow %K NWChem %K OpenMP %K parsec %K StarPU %K task-based runtime %X Dataflow programming models have been growing in popularity as a means to deliver a good balance between performance and portability in the post-petascale era. In this paper, we evaluate different dataflow programming models for electronic structure methods and compare them in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms for expressing scientific applications in a dataflow form: (1) explicit dataflow, where the dataflow is specified explicitly by the developer, and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. We discuss our findings and present a thorough experimental analysis using methods from the NWChem quantum chemistry application as our case study, and OpenMP, StarPU, and PaRSEC as the task-based runtimes that enable the different forms of dataflow execution. Furthermore, we derive an abstract model to explore the limits of the different dataflow programming paradigms.
%B Concurrency and Computation: Practice and Experience: Special Issue on Parallel and Distributed Algorithms %V 2018 %P 1–20 %8 2018-05 %G eng %N e4490 %R 10.1002/cpe.4490
%0 Journal Article %J The International Journal of High Performance Computing Applications %D 2017 %T Accelerating NWChem Coupled Cluster through Dataflow-Based Execution %A Heike Jagode %A Anthony Danalis %A Jack Dongarra %K CCSD %K dag %K dataflow %K NWChem %K parsec %K ptg %K tasks %X Numerical techniques used for describing many-body systems, such as the Coupled Cluster methods (CC) of the quantum chemistry package NWChem, are of extreme interest to the computational chemistry community in fields such as catalytic reactions, solar energy, and bio-mass conversion. In spite of their importance, many of these computationally intensive algorithms have traditionally been thought of in a fairly linear fashion, or are parallelized in coarse chunks. In this paper, we present our effort of converting the NWChem’s CC code into a dataflow-based form that is capable of utilizing the task scheduling system PaRSEC (Parallel Runtime Scheduling and Execution Controller): a software package designed to enable high-performance computing at scale. We discuss the modularity of our approach and explain how the PaRSEC-enabled dataflow version of the subroutines seamlessly integrate into the NWChem codebase. Furthermore, we argue how the CC algorithms can be easily decomposed into finer-grained tasks (compared with the original version of NWChem); and how data distribution and load balancing are decoupled and can be tuned independently. We demonstrate performance acceleration by more than a factor of two in the execution of the entire CC component of NWChem, concluding that the utilization of dataflow-based execution for CC methods enables more efficient and scalable computation.
%B The International Journal of High Performance Computing Applications %P 1–13 %8 2017-01 %G eng %U http://journals.sagepub.com/doi/10.1177/1094342016672543 %R 10.1177/1094342016672543
%0 Conference Paper %B 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015) %D 2015 %T Accelerating NWChem Coupled Cluster through dataflow-based Execution %A Heike Jagode %A Anthony Danalis %A George Bosilca %A Jack Dongarra %K CCSD %K dag %K dataflow %K NWChem %K parsec %K ptg %K tasks %X Numerical techniques used for describing many-body systems, such as the Coupled Cluster methods (CC) of the quantum chemistry package NWChem, are of extreme interest to the computational chemistry community in fields such as catalytic reactions, solar energy, and bio-mass conversion. In spite of their importance, many of these computationally intensive algorithms have traditionally been thought of in a fairly linear fashion, or are parallelised in coarse chunks. In this paper, we present our effort of converting the NWChem’s CC code into a dataflow-based form that is capable of utilizing the task scheduling system PaRSEC (Parallel Runtime Scheduling and Execution Controller) – a software package designed to enable high performance computing at scale. We discuss the modularity of our approach and explain how the PaRSEC-enabled dataflow version of the subroutines seamlessly integrate into the NWChem codebase. Furthermore, we argue how the CC algorithms can be easily decomposed into finer grained tasks (compared to the original version of NWChem); and how data distribution and load balancing are decoupled and can be tuned independently. We demonstrate performance acceleration by more than a factor of two in the execution of the entire CC component of NWChem, concluding that the utilization of dataflow-based execution for CC methods enables more efficient and scalable computation.
%B 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015) %I Springer International Publishing %C Krakow, Poland %8 2015-09 %G eng