%0 Conference Paper
%B 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS)
%D 2015
%T Hierarchical DAG scheduling for Hybrid Distributed Systems
%A Wei Wu
%A Aurelien Bouteiller
%A George Bosilca
%A Mathieu Faverge
%A Jack Dongarra
%K dense linear algebra
%K gpu
%K heterogeneous architecture
%K PaRSEC runtime
%X Accelerator-enhanced computing platforms have drawn a lot of attention due to their massive peak computational capacity. Despite significant advances in the programming interfaces to such hybrid architectures, traditional programming paradigms struggle to map the resulting multi-dimensional heterogeneity and the expression of algorithm parallelism, resulting in sub-optimal effective performance. Task-based programming paradigms have the capability to alleviate some of the programming challenges on distributed hybrid many-core architectures. In this paper we take this concept a step further by showing that the potential of task-based programming paradigms can be greatly increased with minimal modification of the underlying runtime combined with the right algorithmic changes. We propose two novel recursive algorithmic variants for one-sided factorizations and describe the changes to the PaRSEC task-scheduling runtime to build a framework where the task granularity is dynamically adjusted to adapt the degree of available parallelism and kernel efficiency according to runtime conditions. Based on an extensive set of results we show that, with one-sided factorizations, i.e., Cholesky and QR, a carefully written algorithm, supported by an adaptive task-based runtime, is capable of reaching a degree of performance and scalability never achieved before in distributed hybrid environments.
%I IEEE
%C Hyderabad, India
%8 2015-05
%G eng
%0 Journal Article
%J Parallel Computing
%D 2014
%T An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems
%A Marc Baboulin
%A Du Becker
%A George Bosilca
%A Anthony Danalis
%A Jack Dongarra
%K Distributed linear algebra solvers
%K LDLT factorization
%K PaRSEC runtime
%K Randomized algorithms
%K Symmetric indefinite systems
%X Randomized algorithms are gaining ground in high-performance computing applications as they have the potential to outperform deterministic methods, while still providing accurate results. We propose a randomized solver for distributed multicore architectures to efficiently solve large dense symmetric indefinite linear systems that are encountered, for instance, in parameter estimation problems or electromagnetism simulations. The contribution of this paper is to propose efficient kernels for applying random butterfly transformations and a new distributed implementation combined with a runtime (PaRSEC) that automatically adjusts data structures, data mappings, and the scheduling as systems scale up. Both the parallel distributed solver and the supporting runtime environment are innovative. To our knowledge, the randomization approach associated with this solver has never been used in public-domain software for symmetric indefinite systems. The underlying runtime framework allows seamless data mapping and task scheduling, mapping its capabilities to the underlying hardware features of heterogeneous distributed architectures. Our software performs similarly to solvers for symmetric positive definite systems, while requiring only half the execution time and half the data storage of a general dense solver.
%B Parallel Computing
%V 40
%P 213-223
%8 2014-07
%G eng
%N 7
%R 10.1016/j.parco.2013.12.003