Publications

Filters: Author is Thomas Herault
Book Chapter
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, J. Kurzak, P. Luszczek, S. Tomov, and J. Dongarra, “Scalable Dense Linear Algebra on Heterogeneous Hardware,” HPC: Transition Towards Exascale Processing, in the series Advances in Parallel Computing, 2013.
Hori, A., Y. Tsujita, A. Shimada, K. Yoshinaga, N. Mitaro, G. Fukazawa, M. Sato, G. Bosilca, A. Bouteiller, and T. Herault, “System Software for Many-Core and Multi-core Architecture,” Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project, Singapore, Springer Singapore, pp. 59–75, 2019. DOI: 10.1007/978-981-13-1924-2_4
Conference Paper
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, “Assessing the Impact of ABFT and Checkpoint Composite Strategies,” 16th Workshop on Advances in Parallel and Distributed Computational Models, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
Cao, C., G. Bosilca, T. Herault, and J. Dongarra, “Design for a Soft Error Resilient Dynamic Task-based Runtime,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.
Le Fèvre, V., G. Bosilca, A. Bouteiller, T. Herault, A. Hori, Y. Robert, and J. Dongarra, “Do moldable applications perform better on failure-prone HPC platforms?” 11th Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids, Turin, Italy, Springer Verlag, August 2018.
Herault, T., Y. Robert, G. Bosilca, and J. Dongarra, “Generic Matrix Multiplication for Multi-GPU Accelerated Distributed-Memory Platforms over PaRSEC,” ScalA'19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.
Bouteiller, A., F. Cappello, J. Dongarra, A. Guermouche, T. Herault, and Y. Robert, “Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization,” Euro-Par 2013, Aachen, Germany, Springer, August 2013.
Bouteiller, A., T. Herault, and G. Bosilca, “A Multithreaded Communication Substrate for OpenSHMEM,” 8th International Conference on Partitioned Global Address Space Programming Models (PGAS), Eugene, OR, October 2014.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, “Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Best Paper Award, Vancouver, BC, Canada, IEEE, May 2018. DOI: 10.1109/IPDPSW.2018.00127
Herault, T., A. Bouteiller, G. Bosilca, M. Gamell, K. Teranishi, M. Parashar, and J. Dongarra, “Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
Danalis, A., G. Bosilca, A. Bouteiller, T. Herault, and J. Dongarra, “PTG: An Abstraction for Unhindered Parallelism,” International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), New Orleans, LA, IEEE Press, November 2014.
Benoit, A., T. Herault, V. Le Fèvre, and Y. Robert, “Replication is More Efficient Than You Think,” The IEEE/ACM Conference on High Performance Computing Networking, Storage and Analysis (SC19), Denver, CO, ACM Press, November 2019.
Dongarra, J., T. Herault, and Y. Robert, “Revisiting the Double Checkpointing Algorithm,” 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium, Boston, MA, May 2013.
Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, “Software-Defined Events through PAPI,” 24th International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS), Rio de Janeiro, Brazil, IEEE, May 2019.
Conference Proceedings
Du, P., A. Bouteiller, G. Bosilca, T. Herault, and J. Dongarra, “Algorithm-Based Fault Tolerance for Dense Matrix Factorization,” Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2012, New Orleans, LA, USA, ACM, pp. 225-234, February 2012. DOI: 10.1145/2145816.2145845
Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “A Checkpoint-on-Failure Protocol for Algorithm-Based Recovery in Standard MPI,” 18th International European Conference on Parallel and Distributed Computing (Euro-Par 2012) (Best Paper Award), Rhodes, Greece, Springer-Verlag, August 2012.
Bouteiller, A., T. Herault, G. Bosilca, and J. Dongarra, “Correlated Set Coordination in Fault Tolerant Message Logging Protocols,” Proceedings of 17th International Conference, Euro-Par 2011, Part II, vol. 6853, Bordeaux, France, Springer, pp. 51-64, August 2011.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemariner, and J. Dongarra, “DAGuE: A Generic Distributed DAG Engine for High Performance Computing,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1151-1158, 2011.
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, and J. Dongarra, “Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols,” Proceedings of EuroMPI 2010, Stuttgart, Germany, Springer, September 2010.
Hoque, R., T. Herault, G. Bosilca, and J. Dongarra, “Dynamic Task Discovery in PaRSEC: A data-flow task-based Runtime,” ScalA17, Denver, ACM, September 2017. DOI: 10.1145/3148226.3148233
Bland, W., A. Bouteiller, T. Herault, J. Hursey, G. Bosilca, and J. Dongarra, “An Evaluation of User-Level Failure Mitigation Support in MPI,” Proceedings of Recent Advances in Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, Springer, September 2012.
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “Failure Detection and Propagation in HPC Systems,” Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Salt Lake City, Utah, IEEE Press, pp. 27:1-27:11, November 2016.
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemariner, H. Ltaeif, et al., “Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1432-1441, May 2011.
Tang, C., A. Bouteiller, T. Herault, M. G. Venkata, and G. Bosilca, “From MPI to OpenSHMEM: Porting LAMMPS,” OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, Annapolis, MD, USA, Springer International Publishing, pp. 121–137, 2015. DOI: 10.1007/978-3-319-26428-8_8
Dongarra, J., M. Faverge, T. Herault, J. Langou, and Y. Robert, “Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,” IPDPS 2012, the 26th IEEE International Parallel and Distributed Processing Symposium, Shanghai, China, IEEE Computer Society Press, May 2012.
Ma, T., T. Herault, G. Bosilca, and J. Dongarra, “Process Distance-aware Adaptive MPI Collective Communications,” IEEE Int'l Conference on Cluster Computing (Cluster 2011), Austin, Texas, 2011.
Agullo, E., C. Coti, J. Dongarra, T. Herault, and J. Langou, “QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment,” 24th IEEE International Parallel and Distributed Processing Symposium (also LAWN 224), Atlanta, GA, April 2010.
Bosilca, G., T. Herault, A. Rezmerita, and J. Dongarra, “On Scalability for MPI Runtime Systems,” International Conference on Cluster Computing (CLUSTER), Austin, TX, USA, IEEE, pp. 187-195, September 2011.
Bosilca, G., T. Herault, P. Lemariner, J. Dongarra, and A. Rezmerita, “Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure,” Proceedings of Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, EuroMPI 2011, vol. 6960, Santorini, Greece, Springer, pp. 342-344, September 2011.
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, “A Unified HPC Environment for Hybrid Manycore/GPU Distributed Systems,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
Journal Article
Bouteiller, A., T. Herault, G. Bosilca, P. Du, and J. Dongarra, “Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures, and Accuracy,” ACM Transactions on Parallel Computing, vol. 1, issue 2, no. 10, pp. 10:1-10:28, January 2015. DOI: 10.1145/2686892
Seo, S., A. Amer, P. Balaji, C. Bordage, G. Bosilca, A. Brooks, P. Carns, A. Castello, D. Genet, T. Herault, et al., “Argobots: A Lightweight Low-Level Threading and Tasking Framework,” IEEE Transactions on Parallel and Distributed Systems, October 2017. DOI: 10.1109/TPDS.2017.2766062
Herrmann, J., G. Bosilca, T. Herault, L. Marchal, Y. Robert, and J. Dongarra, “Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results,” Parallel Computing, vol. 52, pp. 22-41, February 2016. DOI: 10.1016/j.parco.2015.09.005
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, “Checkpointing Strategies for Shared High-Performance Computing Platforms,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 28–52, 2019.
Le Fèvre, V., T. Herault, Y. Robert, A. Bouteiller, A. Hori, G. Bosilca, and J. Dongarra, “Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,” Parallel Computing, vol. 85, pp. 1–12, July 2019. DOI: 10.1016/j.parco.2019.02.002
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, “Composing Resilience Techniques: ABFT, Periodic, and Incremental Checkpointing,” International Journal of Networking and Computing, vol. 5, no. 1, pp. 2-15, January 2015.
Bosilca, G., C. Coti, T. Herault, P. Lemariner, and J. Dongarra, “Constructing Resilient Communication Infrastructure for Runtime Environments in Advances in Parallel Computing,” Advances in Parallel Computing - Parallel Computing: From Multicores and GPU's to Petascale, vol. 19, pp. 441-451, 2010. DOI: 10.3233/978-1-60750-530-3-441
Lemariner, P., G. Bosilca, C. Coti, T. Herault, and J. Dongarra, “Constructing Resilient Communication Infrastructure for Runtime Environments,” ParCo 2009, Lyon, France, September 2009.
Bouteiller, A., T. Herault, G. Bosilca, and J. Dongarra, “Correlated Set Coordination in Fault Tolerant Message Logging Protocols,” Concurrency and Computation: Practice and Experience, vol. 25, issue 4, pp. 572-585, March 2013. DOI: 10.1002/cpe.2859
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemariner, and J. Dongarra, “DAGuE: A generic distributed DAG Engine for High Performance Computing,” Parallel Computing, vol. 38, no. 1-2, Elsevier, pp. 27-51, 2012.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Luszczek, and J. Dongarra, “Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach,” Scalable Computing and Communications: Theory and Practice, John Wiley & Sons, pp. 699-735, March 2013.
Bland, W., A. Bouteiller, T. Herault, J. Hursey, G. Bosilca, and J. Dongarra, “An evaluation of User-Level Failure Mitigation support in MPI,” Computing, vol. 95, issue 12, pp. 1171-1184, December 2013. DOI: 10.1007/s00607-013-0331-3
Bland, W., P. Du, A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI,” Concurrency and Computation: Practice and Experience, July 2013. DOI: 10.1002/cpe.3100
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “A Failure Detector for HPC Platforms,” The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018. DOI: 10.1177/1094342017711505
Danalis, A., A. Bouteiller, G. Bosilca, J. Dongarra, and T. Herault, “From Serial Loops to Parallel Execution on Distributed Systems,” PPoPP 2012 (submitted), New Orleans, LA, February 2012.
Dongarra, J., M. Faverge, T. Herault, M. Jacquelin, J. Langou, and Y. Robert, “Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems,” Parallel Computing, vol. 39, issue 4-5, pp. 212-232, May 2013.
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, T. Herault, and J. Dongarra, “PaRSEC: Exploiting Heterogeneity to Enhance Scalability,” IEEE Computing in Science and Engineering, vol. 15, issue 6, pp. 36-45, November 2013. DOI: 10.1109/MCSE.2013.98
Dongarra, J., T. Herault, and Y. Robert, “Performance and Reliability Trade-offs for the Double Checkpointing Algorithm,” International Journal of Networking and Computing, vol. 4, no. 1, pp. 32-41.
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, “Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,” IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.
Bland, W., A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Post-failure recovery of MPI communication capability: Design and rationale,” International Journal of High Performance Computing Applications, vol. 27, issue 3, pp. 244-254, January 2013. DOI: 10.1177/1094342013488238
