Publications

Filters: Keyword is MPI
2019
Losada, N., A. Bouteiller, and G. Bosilca, “Asynchronous Receiver-Driven Replay for Local Rollback of MPI Applications,” Fault Tolerance for HPC at eXtreme Scale (FTXS) Workshop at The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'19), November 2019.
Patinyasakdikul, T., D. Eberius, G. Bosilca, and N. Hjelm, “Give MPI Threading a Fair Chance: A Study of Multithreaded MPI Designs,” IEEE Cluster, Albuquerque, NM, IEEE, September 2019.
Losada, N., G. Bosilca, A. Bouteiller, P. González, and M. J. Martín, “Local Rollback for Resilient MPI Applications with Application-Level Checkpointing and Message Logging,” Future Generation Computer Systems, vol. 91, pp. 450-464, February 2019. DOI: 10.1016/j.future.2018.09.041
2018
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “A Failure Detector for HPC Platforms,” The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018. DOI: 10.1177/1094342017711505
Bernholdt, D. E., S. Boehm, G. Bosilca, M. G. Venkata, R. E. Grant, T. Naughton, H. P. Pritchard, M. Schulz, and G. R. Vallee, “A Survey of MPI Usage in the US Exascale Computing Project,” Concurrency and Computation: Practice and Experience, September 2018. DOI: 10.1002/cpe.4851
2016
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “Failure Detection and Propagation in HPC Systems,” Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Salt Lake City, Utah, IEEE Press, pp. 27:1-27:11, November 2016.
Wu, W., G. Bosilca, R. vandeVaart, S. Jeaugey, and J. Dongarra, “GPU-Aware Non-contiguous Data Movement In Open MPI,” 25th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Kyoto, Japan, ACM, June 2016. DOI: 10.1145/2907294.2907317
2015
Shamis, P., M. G. Venkata, M. G. Lopez, M. B. Baker, O. Hernandez, Y. Itigin, M. Dubman, G. Shainer, R. L. Graham, L. Liss, et al., “UCX: An Open Source Framework for HPC Network APIs and Beyond,” 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, Santa Clara, CA, USA, IEEE, pp. 40-43, 2015. DOI: 10.1109/HOTI.2015.13