Publications

Conference Paper

Jagode, H., A. Danalis, and J. Dongarra, “What it Takes to keep PAPI Instrumental for the HPC Community,” 1st Workshop on Sustainable Scientific Software (CW3S19), Collegeville, Minnesota, July 2019.

Haugen, B., S. Richmond, J. Kurzak, C. A. Steed, and J. Dongarra, “Visualizing Execution Traces with Task Dependencies,” 2nd Workshop on Visual Performance Analysis (VPA '15), Austin, TX, ACM, November 2015.

Kurzak, J., P. Luszczek, M. Gates, I. Yamazaki, and J. Dongarra, “Virtual Systolic Array for QR Decomposition,” 15th Workshop on Advances in Parallel and Distributed Computational Models, IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), Boston, MA, IEEE, May 2013.

Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning,” 46th International Conference on Parallel Processing (ICPP), Bristol, United Kingdom, IEEE, August 2017.

Anzt, H., J. Dongarra, G. Flegar, and T. Gruetzmacher, “Variable-Size Batched Condition Number Calculation on GPUs,” SBAC-PAD, Lyon, France, September 2018.

McCraw, H., A. Danalis, G. Bosilca, J. Dongarra, K. Kowalski, and T. Windus, “Utilizing Dataflow-based Execution for Coupled Cluster Methods,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-02, Madrid, Spain, IEEE, September 2014.

Eberius, D., T. Patinyasakdikul, and G. Bosilca, “Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information,” EuroMPI, Chicago, IL, ACM, September 2017.

Dongarra, J., K. London, S. Moore, P. Mucci, and D. Terpstra, “Using PAPI for Hardware Performance Monitoring on Linux Systems,” Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001.

Haidar, A., S. Tomov, A. Abdelfattah, M. Zounon, and J. Dongarra, “Using GPU FP16 Tensor Cores Arithmetic to Accelerate Mixed-Precision Iterative Refinement Solvers and Reduce Energy Consumption,” ISC High Performance (ISC'18), Best Poster, Frankfurt, Germany, June 2018.

Zhong, D., P. Shamis, Q. Cao, G. Bosilca, and J. Dongarra, “Using Arm Scalable Vector Extension to Optimize Open MPI,” 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020), Melbourne, Australia, IEEE/ACM, May 2020.

Zhong, D., Q. Cao, G. Bosilca, and J. Dongarra, “Using Advanced Vector Extensions AVX-512 for MPI Reduction,” EuroMPI/USA '20: 27th European MPI Users' Group Meeting, Austin, TX, September 2020.

Lindquist, N., P. Luszczek, and J. Dongarra, “Using Additive Modifications in LU Factorization Instead of Pivoting,” 37th ACM International Conference on Supercomputing (ICS'23), Orlando, FL, ACM, June 2023.

Haidar, A., C. Cao, J. Dongarra, P. Luszczek, and S. Tomov, “Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

Li, J., B. Nicolae, J. M. Wozniak, and G. Bosilca, “Understanding Scalability and Fine-Grain Parallelism of Synchronous Data Parallel Training,” 2019 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC), Denver, CO, IEEE, November 2019.

Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Twenty Years of Computational Science,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.

Anzt, H., J. Dongarra, and E. S. Quintana-Orti, “Tuning Stationary Iterative Solvers for Fault Resilience,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA15), Austin, TX, ACM, November 2015.

Yamazaki, I., T. Dong, S. Tomov, and J. Dongarra, “Tridiagonalization of a Symmetric Dense Matrix on a GPU Cluster,” The Third International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), May 2013.

Tseng, S-M., B. Nicolae, G. Bosilca, E. Jeannot, A. Chandramowlishwaran, and F. Cappello, “Towards Portable Online Prediction of Network Utilization Using MPI-Level Monitoring,” 2019 European Conference on Parallel Processing (Euro-Par 2019), Göttingen, Germany, Springer, August 2019.

Luszczek, P., J. Kurzak, I. Yamazaki, and J. Dongarra, “Towards Numerical Benchmark for Half-Precision Floating Point Arithmetic,” 2017 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, IEEE, September 2017.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs,” ScalA19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.

Anzt, H., Y. Chen Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Orti, Y. M. Tsai, and W. Wang, “Towards Continuous Benchmarking,” Platform for Advanced Scientific Computing Conference (PASC 2019), Zurich, Switzerland, ACM Press, June 2019.

Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, “Towards Batched Linear Solvers on Accelerated Hardware Platforms,” 8th Workshop on General Purpose Processing Using GPUs (GPGPU 8) co-located with PPOPP 2015, San Francisco, CA, ACM, February 2015.

Lopez, M. G., V. Larrea, W. Joubert, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, “Towards Achieving Performance Portability Using Directives for Accelerators,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Third Workshop on Accelerator Programming Using Directives (WACCPD), Salt Lake City, Utah, Innovative Computing Laboratory, University of Tennessee, November 2016.

Haidar, A., M. Gates, S. Tomov, and J. Dongarra, “Toward a scalable multi-GPU eigensolver via compute-intensive kernels and efficient communication,” Proceedings of the 27th ACM International Conference on Supercomputing (ICS '13), Eugene, Oregon, USA, ACM Press, June 2013.

Lindquist, N., M. Gates, P. Luszczek, and J. Dongarra, “Threshold Pivoting for Dense LU Factorization,” ScalAH22: 13th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems, Dallas, Texas, IEEE, November 2022.

Bosilca, G., R. Harrison, T. Herault, M. Mahdi Javanmard, P. Nookala, and E. Valeev, “The Template Task Graph (TTG) - An Emerging Practical Dataflow Programming Paradigm for Scientific Simulation at Extreme Scale,” 2020 IEEE/ACM 5th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2): IEEE, November 2020.

Boillot, L., G. Bosilca, E. Agullo, and H. Calandra, “Task-Based Programming for Seismic Imaging: Preliminary Results,” 2014 IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.

Sukkari, D., M. Gates, M. Al Farhan, H. Anzt, and J. Dongarra, “Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators,” SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023.

Slaughter, E., W. Wu, Y. Fu, L. Brandenburg, N. Garcia, W. Kautz, E. Marx, K. S. Morris, Q. Cao, G. Bosilca, et al., “Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance,” International Conference for High Performance Computing Networking, Storage, and Analysis (SC20): ACM, November 2020.

Lacoste, X., M. Faverge, P. Ramet, S. Thibault, and G. Bosilca, “Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes,” 23rd International Heterogeneity in Computing Workshop, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

Luszczek, P., and C. Brown, “Surrogate ML/AI Model Benchmarking for FAIR Principles' Conformance,” 2022 IEEE High Performance Extreme Computing Conference (HPEC): IEEE, September 2022.

Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

Mattson, T., D. Bader, J. Berry, A. Buluc, J. Dongarra, C. Faloutsos, J. Feo, J. Gilbert, J. Gonzalez, B. Hendrickson, et al., “Standards for Graph Algorithm Primitives,” 17th IEEE High Performance Extreme Computing Conference (HPEC '13), Waltham, MA, IEEE, September 2013.

Tsai, Y. M., T. Cojean, and H. Anzt, “Sparse Linear Algebra on AMD and NVIDIA GPUs—The Race is On,” ISC High Performance: Springer, June 2020.

Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, “Software-Defined Events through PAPI,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019.

Gates, M., J. Kurzak, A. Charara, A. YarKhan, and J. Dongarra, “SLATE: Design of a Modern Distributed and Accelerated Linear Algebra Library,” International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Denver, CO, ACM, November 2019.

Anzt, H., D. Lukarski, S. Tomov, and J. Dongarra, “Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures,” VECPAR 2014, Eugene, OR, June 2014.

Haugen, B., and J. Kurzak, “Search Space Pruning Constraints Visualization,” VISSOFT'14: 2nd IEEE Working Conference on Software Visualization, Victoria, BC, Canada, IEEE, September 2014.

Luszczek, P., M. Gates, J. Kurzak, A. Danalis, and J. Dongarra, “Search Space Generation and Pruning System for Autotuners,” 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Chicago, IL, IEEE, May 2016.

Gao, Y., L-C. Canon, Y. Robert, and F. Vivien, “Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms,” IEEE Cluster 2019, Albuquerque, New Mexico, IEEE Computer Society Press, September 2019.

Luszczek, P., J. Kurzak, I. Yamazaki, D. Keffer, and J. Dongarra, “Scaling Point Set Registration in 3D Across Thread Counts on Multicore and Hardware Accelerator Platforms through Autotuning for Large Scale Analysis of Scientific Point Clouds,” IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD 2017), Boston, MA, IEEE, December 2017.

Luszczek, P., Y. Tsai, N. Lindquist, H. Anzt, and J. Dongarra, “Scalable Data Generation for Evaluating Mixed-Precision Solvers,” 2020 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, IEEE, September 2020.

Yamazaki, I., S. Tomov, and J. Dongarra, “Sampling Algorithms to Update Truncated SVD,” IEEE International Conference on Big Data, Boston, MA, IEEE, December 2017.

Zhong, D., A. Bouteiller, X. Luo, and G. Bosilca, “Runtime Level Failure Detection and Propagation in HPC Systems,” European MPI Users' Group Meeting (EuroMPI '19), Zürich, Switzerland, ACM, September 2019.

Du, Y., L. Marchal, G. Pallez, and Y. Robert, “Robustness of the Young/Daly Formula for Stochastic Iterative Applications,” 49th International Conference on Parallel Processing (ICPP 2020), Edmonton, AB, Canada, ACM Press, August 2020.

Dongarra, J., T. Herault, and Y. Robert, “Revisiting the Double Checkpointing Algorithm,” 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium, Boston, MA, May 2013.

Bathie, G., L. Marchal, Y. Robert, and S. Thibault, “Revisiting Dynamic DAG Scheduling under Memory Constraints for Shared-Memory Platforms,” 22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.

Fang, A., A. Cavelan, Y. Robert, and A. Chien, “Resilience for Stencil Computations with Latent Errors,” International Conference on Parallel Processing (ICPP), Bristol, UK, IEEE Computer Society Press, August 2017.

Aupy, G., A. Gainaru, V. Honoré, P. Raghavan, Y. Robert, and H. Sun, “Reservation Strategies for Stochastic Jobs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), Rio de Janeiro, Brazil, IEEE Computer Society Press, May 2019.

Gainaru, A., B. Goglin, V. Honoré, G. Pallez, P. Raghavan, Y. Robert, and H. Sun, “Reservation and Checkpointing Strategies for Stochastic Jobs,” 34th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.
