Publications

Author: George Bosilca
Conference Proceedings
Bosilca, G., T. Herault, P. Lemarinier, J. Dongarra, and A. Rezmerita, “Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure,” Proceedings of Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, EuroMPI 2011, vol. 6960, Santorini, Greece, Springer, pp. 342-344, September 2011.
Angskun, T., G. Bosilca, and J. Dongarra, “Self-Healing in Binomial Graph Networks,” 2nd International Workshop On Reliability in Decentralized Distributed Systems (RDDS 2007), Vilamoura, Algarve, Portugal, November 2007.
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, “Self-Healing Network for Scalable Fault Tolerant Runtime Environments,” DAPSYS 2006, 6th Austrian-Hungarian Workshop on Distributed and Parallel Systems, Innsbruck, Austria, January 2006.
Bosilca, G., A. Bouteiller, T. Herault, P. Lemarinier, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, “A Unified HPC Environment for Hybrid Manycore/GPU Distributed Systems,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
Journal Article
Bouteiller, A., T. Herault, G. Bosilca, P. Du, and J. Dongarra, “Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures, and Accuracy,” ACM Transactions on Parallel Computing, vol. 1, issue 2, no. 10, pp. 10:1-10:28, January 2015.
Dongarra, J., G. Bosilca, R. Delmas, and J. Langou, “Algorithmic Based Fault Tolerance Applied to High Performance Computing,” Journal of Parallel and Distributed Computing, vol. 69, pp. 410-416, 2009.
Seo, S., A. Amer, P. Balaji, C. Bordage, G. Bosilca, A. Brooks, P. Carns, A. Castello, D. Genet, T. Herault, et al., “Argobots: A Lightweight Low-Level Threading and Tasking Framework,” IEEE Transactions on Parallel and Distributed Systems, October 2017.
Herrmann, J., G. Bosilca, T. Herault, L. Marchal, Y. Robert, and J. Dongarra, “Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results,” Parallel Computing, vol. 52, pp. 22-41, February 2016.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, “Checkpointing Strategies for Shared High-Performance Computing Platforms,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 28–52, 2019.
Le Fèvre, V., T. Herault, Y. Robert, A. Bouteiller, A. Hori, G. Bosilca, and J. Dongarra, “Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,” Parallel Computing, vol. 85, pp. 1–12, July 2019.
Graham, R. L., G. Bosilca, and J. Pjesivac–Grbovic, “A Comparison of Application Performance Using Open MPI and Cray MPI,” Cray User Group, CUG 2007, May 2007.
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, “Composing Resilience Techniques: ABFT, Periodic, and Incremental Checkpointing,” International Journal of Networking and Computing, vol. 5, no. 1, pp. 2-15, January 2015.
Bosilca, G., C. Coti, T. Herault, P. Lemarinier, and J. Dongarra, “Constructing Resilient Communication Infrastructure for Runtime Environments in Advances in Parallel Computing,” Advances in Parallel Computing - Parallel Computing: From Multicores and GPU's to Petascale, vol. 19, pp. 441-451, 2010.
Lemarinier, P., G. Bosilca, C. Coti, T. Herault, and J. Dongarra, “Constructing Resilient Communication Infrastructure for Runtime Environments,” ParCo 2009, Lyon, France, September 2009.
Bouteiller, A., T. Herault, G. Bosilca, and J. Dongarra, “Correlated Set Coordination in Fault Tolerant Message Logging Protocols,” Concurrency and Computation: Practice and Experience, vol. 25, issue 4, pp. 572-585, March 2013.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, and J. Dongarra, “DAGuE: A generic distributed DAG Engine for High Performance Computing,” Parallel Computing, vol. 38, no. 1-2: Elsevier, pp. 27-51, 2012.
Pjesivac–Grbovic, J., G. Bosilca, G. Fagg, T. Angskun, and J. Dongarra, “Decision Trees and MPI Collective Algorithm Selection Problem,” Euro-Par 2007, Rennes, France, Springer, pp. 105–115, August 2007.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Luszczek, and J. Dongarra, “Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach,” Scalable Computing and Communications: Theory and Practice: John Wiley & Sons, pp. 699-735, March 2013.
Dongarra, J., Z. Chen, G. Bosilca, and J. Langou, “Disaster Survival Guide in Petascale Computing: An Algorithmic Approach,” in Petascale Computing: Algorithms and Applications (to appear): Chapman & Hall - CRC Press, 2007.
Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,” Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.
Graham, R. L., R. Brightwell, B. Barrett, G. Bosilca, and J. Pjesivac–Grbovic, “An Evaluation of Open MPI's Matching Transport Layer on the Cray XT,” EuroPVM/MPI 2007, September 2007.
Bland, W., A. Bouteiller, T. Herault, J. Hursey, G. Bosilca, and J. Dongarra, “An evaluation of User-Level Failure Mitigation support in MPI,” Computing, vol. 95, issue 12, pp. 1171-1184, December 2013.
Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI,” Concurrency and Computation: Practice and Experience, July 2013.
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “A Failure Detector for HPC Platforms,” The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018.
Fagg, G., J. Pjesivac–Grbovic, G. Bosilca, T. Angskun, and J. Dongarra, “Flexible collective communication tuning architecture applied to Open MPI,” 2006 Euro PVM/MPI (submitted), Bonn, Germany, January 2006.
Danalis, A., A. Bouteiller, G. Bosilca, J. Dongarra, and T. Herault, “From Serial Loops to Parallel Execution on Distributed Systems,” PPoPP 2012 (submitted), New Orleans, LA, February 2012.
Ma, T., G. Bosilca, A. Bouteiller, and J. Dongarra, “HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters,” IPDPS 2012 (Best Paper), Shanghai, China, May 2012.
Shipman, G. M., G. Bosilca, and A. B. Maccabe, “High Performance RDMA Protocols in HPC,” Euro PVM/MPI 2006, Bonn, Germany, September 2006.
Graham, R. L., G. M. Shipman, B. Barrett, R. Castain, G. Bosilca, and A. Lumsdaine, “A High-Performance, Heterogeneous MPI,” HeteroPar 2006, Barcelona, Spain, September 2006.
Ma, T., A. Bouteiller, G. Bosilca, and J. Dongarra, “Impact of Kernel-Assisted MPI Communication over Scientific Applications: CPMD and FFTW,” 18th EuroMPI, Santorini, Greece, Springer, pp. 247-254, September 2011.
Keller, R., G. Bosilca, G. Fagg, M. Resch, and J. Dongarra, “Implementation and Usage of the PERUSE-Interface in Open MPI,” Euro PVM/MPI 2006, Bonn, Germany, September 2006.
Ma, T., G. Bosilca, A. Bouteiller, and J. Dongarra, “Kernel-assisted and topology-aware MPI collective communications on multi-core/many-core platforms,” Journal of Parallel and Distributed Computing, vol. 73, issue 7, pp. 1000-1010, July 2013.
Losada, N., G. Bosilca, A. Bouteiller, P. González, and M. J. Martín, “Local Rollback for Resilient MPI Applications with Application-Level Checkpointing and Message Logging,” Future Generation Computer Systems, vol. 91, pp. 450-464, February 2019.
Agullo, E., G. Bosilca, C. Castagnède, J. Dongarra, H. Ltaief, and S. Tomov, “Matrices Over Runtime Systems at Exascale,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.
Pjesivac–Grbovic, J., G. Fagg, T. Angskun, G. Bosilca, and J. Dongarra, “MPI Collective Algorithm Selection and Quadtree Encoding,” Lecture Notes in Computer Science, vol. 4192, no. ICL-UT-06-13: Springer Berlin / Heidelberg, pp. 40-48, September 2006.
Pjesivac–Grbovic, J., G. Bosilca, G. Fagg, T. Angskun, and J. Dongarra, “MPI Collective Algorithm Selection and Quadtree Encoding,” Parallel Computing (Special Edition: EuroPVM/MPI 2006): Elsevier, 2007.
Chaarawi, M., E. Gabriel, R. Keller, R. L. Graham, G. Bosilca, and J. Dongarra, “OMPIO: A Modular Software Architecture for MPI I/O,” 18th EuroMPI, Santorini, Greece, Springer, pp. 81-89, September 2011.
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, T. Herault, and J. Dongarra, “PaRSEC: Exploiting Heterogeneity to Enhance Scalability,” IEEE Computing in Science and Engineering, vol. 15, issue 6, pp. 36-45, November 2013.
Pjesivac–Grbovic, J., T. Angskun, G. Bosilca, G. Fagg, E. Gabriel, and J. Dongarra, “Performance Analysis of MPI Collective Operations,” Cluster Computing, vol. 10, no. 2: Springer Netherlands, pp. 127-143, June 2007.
Pjesivac–Grbovic, J., T. Angskun, G. Bosilca, G. Fagg, E. Gabriel, and J. Dongarra, “Performance Analysis of MPI Collective Operations,” Cluster Computing Journal (to appear), January 2005.
Bosilca, G., A. Bouteiller, T. Herault, P. Lemarinier, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, “Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,” IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.
Bland, W., A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Post-failure recovery of MPI communication capability: Design and rationale,” International Journal of High Performance Computing Applications, vol. 27, issue 3, pp. 244-254, January 2013.
Fagg, G., E. Gabriel, Z. Chen, T. Angskun, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, “Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing,” International Journal for High Performance Applications and Supercomputing (to appear), April 2004.
Langou, J., Z. Chen, G. Bosilca, and J. Dongarra, “Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,” SIAM SISC (to appear), May 2007.
Bouteiller, A., G. Bosilca, and J. Dongarra, “Redesigning the Message Logging Model for High Performance,” Concurrency and Computation: Practice and Experience (online version), June 2010.
Bouteiller, A., G. Bosilca, and J. Dongarra, “Retrospect: Deterministic Replay of MPI Applications for Interactive Distributed Debugging,” Accepted for Euro PVM/MPI 2007: Springer, September 2007.
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, “Scalable Fault Tolerant Protocol for Parallel Runtime Environments,” 2006 Euro PVM/MPI, no. ICL-UT-06-12, Bonn, Germany, 2006.
Bosilca, G., Z. Chen, J. Dongarra, V. Eijkhout, G. Fagg, E. Fuentes, J. Langou, P. Luszczek, J. Pjesivac–Grbovic, K. Seymour, et al., “Self Adapting Numerical Software SANS Effort,” IBM Journal of Research and Development, vol. 50, no. 2/3, pp. 223-238, January 2006.
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, “Self-Healing Network for Scalable Fault-Tolerant Runtime Environments,” Future Generation Computer Systems, vol. 26, no. 3, pp. 479-485, March 2010.
Bernholdt, D. E., S. Boehm, G. Bosilca, M. Gorentla Venkata, R. E. Grant, T. Naughton, H. P. Pritchard, M. Schulz, and G. R. Vallee, “A Survey of MPI Usage in the US Exascale Computing Project,” Concurrency and Computation: Practice and Experience, September 2018.
