%0 Conference Paper
%B 21st IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2020)
%D 2020
%T Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime
%A Yu Pei
%A Qinglei Cao
%A George Bosilca
%A Piotr Luszczek
%A Victor Eijkhout
%A Jack Dongarra
%I IEEE
%C New Orleans, LA
%8 2020-05
%G eng
%0 Conference Paper
%B Platform for Advanced Scientific Computing Conference (PASC20)
%D 2020
%T Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications
%A Qinglei Cao
%A Yu Pei
%A Kadir Akbudak
%A Aleksandr Mikhalev
%A George Bosilca
%A Hatem Ltaief
%A David Keyes
%A Jack Dongarra
%X Climate and weather can be predicted statistically via geospatial Maximum Likelihood Estimates (MLE), as an alternative to running large ensembles of forward models. The MLE-based iterative optimization procedure requires solving large-scale linear systems, which involves a Cholesky factorization of a symmetric positive-definite covariance matrix, a demanding dense factorization in terms of memory footprint and computation. We propose a novel solution to this problem: at the mathematical level, we reduce the computational requirement by exploiting the data sparsity structure of the matrix off-diagonal tiles by means of low-rank approximations; and, at the programming-paradigm level, we integrate PaRSEC, a dynamic, task-based runtime, to reach unparalleled levels of efficiency for solving extreme-scale linear algebra matrix operations. The resulting solution leverages fine-grained computations to facilitate asynchronous execution while providing a flexible data distribution to mitigate load imbalance. Performance results are reported using 3D synthetic datasets of up to 42M geospatial locations on 130,000 cores, which represent a cornerstone toward fast and accurate predictions of environmental applications.
%I ACM
%C Geneva, Switzerland
%8 2020-06
%G eng
%R https://doi.org/10.1145/3394277.3401846
%0 Conference Paper
%B 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020)
%D 2020
%T Using Arm Scalable Vector Extension to optimize Open MPI
%A Dong Zhong
%A Pavel Shamis
%A Qinglei Cao
%A George Bosilca
%A Jack Dongarra
%I IEEE/ACM
%C Melbourne, Australia
%8 2020-05
%G eng
%0 Conference Paper
%B Workshop on Programming and Performance Visualization Tools (ProTools 19) at SC19
%D 2019
%T Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools
%A Qinglei Cao
%A Yu Pei
%A Thomas Herault
%A Kadir Akbudak
%A Aleksandr Mikhalev
%A George Bosilca
%A Hatem Ltaief
%A David Keyes
%A Jack Dongarra
%I ACM
%C Denver, CO
%8 2019-11
%G eng