%0 Journal Article %J The International Journal of High Performance Computing Applications %D 2018 %T Big Data and Extreme-Scale Computing: Pathways to Convergence - Toward a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry %A Mark Asch %A Terry Moore %A Rosa M. Badia %A Micah Beck %A Pete Beckman %A Thierry Bidot %A François Bodin %A Franck Cappello %A Alok Choudhary %A Bronis R. de Supinski %A Ewa Deelman %A Jack Dongarra %A Anshu Dubey %A Geoffrey Fox %A Haohuan Fu %A Sergi Girona %A Michael Heroux %A Yutaka Ishikawa %A Kate Keahey %A David Keyes %A William T. Kramer %A Jean-François Lavignon %A Yutong Lu %A Satoshi Matsuoka %A Bernd Mohr %A Stéphane Requena %A Joel Saltz %A Thomas Schulthess %A Rick Stevens %A Martin Swany %A Alexander Szalay %A William Tang %A Gaël Varoquaux %A Jean-Pierre Vilotte %A Robert W. Wisniewski %A Zhiwei Xu %A Igor Zacharov %X Over the past four years, the Big Data and Exascale Computing (BDEC) project organized a series of five international workshops that aimed to explore the ways in which the new forms of data-centric discovery introduced by the ongoing revolution in high-end data analysis (HDA) might be integrated with the established, simulation-centric paradigm of the high-performance computing (HPC) community. Based on those meetings, we argue that the rapid proliferation of digital data generators, the unprecedented growth in the volume and diversity of the data they generate, and the intense evolution of the methods for analyzing and using that data are radically reshaping the landscape of scientific computing. The most critical problems involve the logistics of wide-area, multistage workflows that will move back and forth across the computing continuum, between the multitude of distributed sensors, instruments and other devices at the network's edge, and the centralized resources of commercial clouds and HPC centers. We suggest that the prospects for the future integration of technological infrastructures and research ecosystems need to be considered at three different levels. First, we discuss the convergence of research applications and workflows that establish a research paradigm that combines both HPC and HDA, where ongoing progress is already motivating efforts at the other two levels. Second, we offer an account of some of the problems involved with creating a converged infrastructure for peripheral environments, that is, a shared infrastructure that can be deployed throughout the network in a scalable manner to meet the highly diverse requirements for processing, communication, and buffering/storage of massive data workflows of many different scientific domains. Third, we focus on some opportunities for software ecosystem convergence in big, logically centralized facilities that execute large-scale simulations and models and/or perform large-scale data analytics. We close by offering some conclusions and recommendations for future investment and policy review.
%B The International Journal of High Performance Computing Applications %V 32 %P 435–479 %8 2018-07 %G eng %N 4 %R https://doi.org/10.1177/1094342018778123 %0 Journal Article %J The International Journal of High Performance Computing Applications %D 2011 %T The International Exascale Software Project Roadmap %A Jack Dongarra %A Pete Beckman %A Terry Moore %A Patrick Aerts %A Giovanni Aloisio %A Jean-Claude Andre %A David Barkai %A Jean-Yves Berthou %A Taisuke Boku %A Bertrand Braunschweig %A Franck Cappello %A Barbara Chapman %A Xuebin Chi %A Alok Choudhary %A Sudip Dosanjh %A Thom Dunning %A Sandro Fiore %A Al Geist %A Bill Gropp %A Robert Harrison %A Mark Hereld %A Michael Heroux %A Adolfy Hoisie %A Koh Hotta %A Zhong Jin %A Yutaka Ishikawa %A Fred Johnson %A Sanjay Kale %A Richard Kenway %A David Keyes %A Bill Kramer %A Jesus Labarta %A Alain Lichnewsky %A Thomas Lippert %A Bob Lucas %A Barney MacCabe %A Satoshi Matsuoka %A Paul Messina %A Peter Michielse %A Bernd Mohr %A Matthias S. Mueller %A Wolfgang E. Nagel %A Hiroshi Nakashima %A Michael E. Papka %A Dan Reed %A Mitsuhisa Sato %A Ed Seidel %A John Shalf %A David Skinner %A Marc Snir %A Thomas Sterling %A Rick Stevens %A Fred Streitz %A Bob Sugar %A Shinji Sumimoto %A William Tang %A John Taylor %A Rajeev Thakur %A Anne Trefethen %A Mateo Valero %A Aad van der Steen %A Jeffrey Vetter %A Peg Williams %A Robert Wisniewski %A Kathy Yelick %X Over the last 20 years, the open-source community has provided more and more software on which the world’s high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. However, although the investments in these separate software elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and key integration of technologies necessary to make them work together smoothly and efficiently, both within individual petascale systems and between different systems. It seems clear that this completely uncoordinated development model will not provide the software needed to support the unprecedented parallelism required for peta/exascale computation on millions of cores, or the flexibility required to exploit new hardware models and features, such as transactional memory, speculative execution, and graphics processing units. This report describes the work of the community to prepare for the challenges of exascale computing, ultimately combining their efforts in a coordinated International Exascale Software Project.
%B The International Journal of High Performance Computing Applications %V 25 %P 3-60 %8 2011-01 %G eng %R https://doi.org/10.1177/1094342010391989 %0 Journal Article %J Lecture Notes in Computer Science, OpenMP Shared Memory Parallel Programming %D 2008 %T Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications %A Oscar Hernandez %A Fengguang Song %A Barbara Chapman %A Jack Dongarra %A Bernd Mohr %A Shirley Moore %A Felix Wolf %B Lecture Notes in Computer Science, OpenMP Shared Memory Parallel Programming %I Springer Berlin / Heidelberg %V 4315 %8 2008-00 %G eng %0 Conference Proceedings %B Proceedings of the 2nd International Workshop on Tools for High Performance Computing %D 2008 %T Usage of the Scalasca Toolset for Scalable Performance Analysis of Large-scale Parallel Applications %A Felix Wolf %A Brian Wylie %A Erika Abraham %A Wolfgang Frings %A Karl Fürlinger %A Markus Geimer %A Marc-Andre Hermanns %A Bernd Mohr %A Shirley Moore %A Matthias Pfeifer %E Michael Resch %E Rainer Keller %E Valentin Himmler %E Bettina Krammer %E A Schulz %K point %B Proceedings of the 2nd International Workshop on Tools for High Performance Computing %I Springer %C Stuttgart, Germany %P 157-167 %8 2008-01 %G eng %0 Journal Article %J Concurrency and Computation: Practice and Experience %D 2007 %T Automatic Analysis of Inefficiency Patterns in Parallel Applications %A Felix Wolf %A Bernd Mohr %A Jack Dongarra %A Shirley Moore %B Concurrency and Computation: Practice and Experience %V 19 %P 1481-1496 %8 2007-08 %G eng %0 Conference Proceedings %B 8th Workshop 'Parallel Systems and Algorithms' (PASA), Lecture Notes in Informatics %D 2006 %T Large Event Traces in Parallel Performance Analysis %A Felix Wolf %A Felix Freitag %A Bernd Mohr %A Shirley Moore %A Brian Wylie %K kojak %B 8th Workshop 'Parallel Systems and Algorithms' (PASA), Lecture Notes in Informatics %I Gesellschaft für Informatik %C Frankfurt/Main, Germany %8 2006-03 %G eng %0 Conference Proceedings %B Second International Workshop on OpenMP %D 2006 %T Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications %A Oscar Hernandez %A Fengguang Song %A Barbara Chapman %A Jack Dongarra %A Bernd Mohr %A Shirley Moore %A Felix Wolf %K kojak %B Second International Workshop on OpenMP %C Reims, France %8 2006-01 %G eng %0 Journal Article %J Concurrency and Computation: Practice and Experience, Special issue "Automatic Performance Analysis" (submitted) %D 2005 %T Automatic analysis of inefficiency patterns in parallel applications %A Felix Wolf %A Bernd Mohr %A Jack Dongarra %A Shirley Moore %K kojak %B Concurrency and Computation: Practice and Experience, Special issue "Automatic Performance Analysis" (submitted) %8 2005-00 %G eng %0 Conference Proceedings %B In Proceedings of the International Conference on Parallel Processing %D 2005 %T Automatic Experimental Analysis of Communication Patterns in Virtual Topologies %A Nikhil Bhatia %A Fengguang Song %A Felix Wolf %A Jack Dongarra %A Bernd Mohr %A Shirley Moore %K kojak %B In Proceedings of the International Conference on Parallel Processing %I IEEE Computer Society %C Oslo, Norway %8 2005-06 %G eng %0 Conference Proceedings %B In Proceedings of the European Conference on Parallel Computing (Euro-Par) %D 2005 %T Event-based Measurement and Analysis of One-sided Communication %A Marc-Andre Hermanns %A Bernd Mohr %A Felix Wolf %K kojak %B In Proceedings of the European Conference on Parallel Computing (Euro-Par) %I Springer %C Lisbon, Portugal %8 2005-08 %G eng %0 
Conference Proceedings %B Second Workshop on Productivity and Performance in High-End Computing (P-PHEC) at 11th International Symposium on High Performance Computer Architecture (HPCA-2005) %D 2005 %T Improving Time to Solution with Automated Performance Analysis %A Shirley Moore %A Felix Wolf %A Jack Dongarra %A Bernd Mohr %K kojak %B Second Workshop on Productivity and Performance in High-End Computing (P-PHEC) at 11th International Symposium on High Performance Computer Architecture (HPCA-2005) %C San Francisco %8 2005-02 %G eng %0 Conference Proceedings %B Workshop on Patterns in High Performance Computing %D 2005 %T A Pattern-Based Approach to Automated Application Performance Analysis %A Nikhil Bhatia %A Shirley Moore %A Felix Wolf %A Jack Dongarra %A Bernd Mohr %K kojak %B Workshop on Patterns in High Performance Computing %C University of Illinois at Urbana-Champaign %8 2005-05 %G eng %0 Conference Proceedings %B Mini-Symposium "Tools Support for Parallel Programming", Proceedings of Parallel Computing (ParCo) %D 2005 %T Performance Analysis of One-sided Communication Mechanisms %A Bernd Mohr %A Andrej Kühnal %A Marc-Andre Hermanns %A Felix Wolf %K kojak %B Mini-Symposium "Tools Support for Parallel Programming", Proceedings of Parallel Computing (ParCo) %C Malaga, Spain %8 2005-09 %G eng %0 Conference Proceedings %B In Proc. of the 12th European Parallel Virtual Machine and Message Passing Interface Conference %D 2005 %T A Scalable Approach to MPI Application Performance Analysis %A Shirley Moore %A Felix Wolf %A Jack Dongarra %A Sameer Shende %A Allen D. Malony %A Bernd Mohr %K kojak %B In Proc. of the 12th European Parallel Virtual Machine and Message Passing Interface Conference %I Springer LNCS %8 2005-09 %G eng %0 Conference Proceedings %B Proceedings of Euro-Par 2004 %D 2004 %T Efficient Pattern Search in Large Traces through Successive Refinement %A Felix Wolf %A Bernd Mohr %A Jack Dongarra %A Shirley Moore %K kojak %B Proceedings of Euro-Par 2004 %I Springer-Verlag %C Pisa, Italy %8 2004-08 %G eng %0 Journal Article %J Journal of Systems Architecture, Special Issue 'Evolutions in parallel distributed and network-based processing' %D 2003 %T Automatic performance analysis of hybrid MPI/OpenMP applications %A Felix Wolf %A Bernd Mohr %E Andrea Clematis %E Daniele D'Agostino %K kojak %B Journal of Systems Architecture, Special Issue 'Evolutions in parallel distributed and network-based processing' %I Elsevier %V 49(10-11) %P 421-439 %8 2003-11 %G eng %0 Journal Article %J Advances in Parallel Computing %D 2003 %T Hardware-Counter Based Automatic Performance Analysis of Parallel Programs %A Felix Wolf %A Bernd Mohr %K kojak %K papi %X The KOJAK performance-analysis environment identifies a large number of performance problems on parallel computers with SMP nodes. The current version concentrates on parallelism-related performance problems that arise from an inefficient usage of the parallel programming interfaces MPI and OpenMP, while ignoring individual CPU performance. This chapter describes an extended design of KOJAK capable of diagnosing low individual-CPU performance based on hardware-counter information and of integrating the results with those of the parallelism-centered analysis. The performance of parallel applications is determined by a variety of different factors. Performance of single components frequently influences the overall behavior in unexpected ways. 
Application programmers on current parallel machines have to deal with numerous performance-critical aspects: different modes of parallel execution, such as message passing, multi-threading, or even a combination of the two, and performance on individual CPUs, which is determined by the interaction of different functional units. The KOJAK analysis process is composed of two parts: a semi-automatic instrumentation of the user application followed by an automatic analysis of the generated performance data. KOJAK's instrumentation software runs on most major UNIX platforms and works on multiple levels, including source-code, compiler, and linker. %B Advances in Parallel Computing %I Elsevier %C Dresden, Germany %V 13 %P 753-760 %8 2004-01 %G eng %R https://doi.org/10.1016/S0927-5452(04)80092-3 %0 Conference Proceedings %B Proc. of the European Conference on Parallel Computing (EuroPar) %D 2003 %T KOJAK - A Tool Set for Automatic Performance Analysis of Parallel Applications %A Bernd Mohr %A Felix Wolf %K kojak %B Proc. of the European Conference on Parallel Computing (EuroPar) %I Springer-Verlag %C Klagenfurt, Austria %V 2790 %P 1301-1304 %8 2003-08 %G eng