Publications

Aupy, G., A. Benoit, H. Casanova, and Y. Robert, “Scheduling Computational Workflows on Failure-prone Platforms,” International Journal of Networking and Computing, vol. 6, no. 1, pp. 2-26, 2016.

Aupy, G., and Y. Robert, “Scheduling for Fault-Tolerance: An Introduction,” Topics in Parallel and Distributed Computing: Springer International Publishing, pp. 143–170, 2018.

Aupy, G., Y. Robert, and F. Vivien, “Assuming failure independence: are we right to be wrong?,” The 3rd International Workshop on Fault Tolerant Systems (FTS), Honolulu, Hawaii, IEEE, September 2017.

Aupy, G., A. Benoit, L. Pottier, P. Raghavan, Y. Robert, and M. Shantharam, “Co-Scheduling Algorithms for Cache-Partitioned Systems,” 19th Workshop on Advances in Parallel and Distributed Computational Models, Orlando, FL, IEEE Computer Society Press, May 2017.

Aupy, G., A. Benoit, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, “On the Combination of Silent Error Detection and Checkpointing,” UT-CS-13-710: University of Tennessee Computer Science Technical Report, June 2013.

Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, “Co-Scheduling HPC Workloads on Cache-Partitioned CMP Platforms,” International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1221-1239, November 2019.

Aupy, G., A. Benoit, S. Dai, L. Pottier, P. Raghavan, Y. Robert, and M. Shantharam, “Co-Scheduling Amdahl Applications on Cache-Partitioned Systems,” International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 123–138, January 2018.

Aupy, G., A. Benoit, T. Herault, Y. Robert, and J. Dongarra, “Optimal Checkpointing Period: Time vs. Energy,” University of Tennessee Computer Science Technical Report (also LAWN 281), no. ut-eecs-13-718: University of Tennessee, October 2013.

Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, “Co-Scheduling HPC Workloads on Cache-Partitioned CMP Platforms,” Cluster 2018, Belfast, UK, IEEE Computer Society Press, September 2018.

Aupy, G., A. Gainaru, V. Honoré, P. Raghavan, Y. Robert, and H. Sun, “Reservation Strategies for Stochastic Jobs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), Rio de Janeiro, Brazil, IEEE Computer Society Press, May 2019.

Asch, M., T. Moore, R. M. Badia, M. Beck, P. Beckman, T. Bidot, F. Bodin, F. Cappello, A. Choudhary, B. R. de Supinski, et al., “Big Data and Extreme-Scale Computing: Pathways to Convergence - Toward a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry,” The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 435–479, July 2018.

Arnold, D., S. Browne, J. Dongarra, G. Fagg, and K. Moore, “Secure Remote Access to Numerical Software and Computational Hardware,” Proceedings of the DoD HPC Users Group Conference (HPCUG) 2000, Albuquerque, NM, June 2000.

Arnold, D., S. Browne, J. Dongarra, G. Fagg, and K. Moore, “Secure Remote Access to Numerical Software and Computation Hardware,” University of Tennessee Computer Science Technical Report, UT-CS-00-446, July 2000.

Arnold, D., S. Blackford, J. Dongarra, V. Eijkhout, and T. Xu, “Seamless Access to Adaptive Solver Algorithms,” Proceedings of 16th IMACS World Congress 2000 on Scientific Computing, Applied Mathematics and Simulation, Lausanne, Switzerland, August 2000.

Arnold, D., W. Lee, J. Dongarra, and M. Wheeler, “Providing Infrastructure and Interface to High Performance Applications in a Distributed Setting,” ASTC-HPC 2000, Washington, DC, April 2000.

Arnold, D., D. Bachmann, and J. Dongarra, “Request Sequencing: Optimizing Communication for the Grid,” Lecture Notes in Computer Science: Proceedings of 6th International Euro-Par Conference 2000, Parallel Processing, vol. 1900 (Germany: Springer Verlag 2000), pp. 1213-1222, January 2000.

Arnold, D., H. Casanova, and J. Dongarra, “Innovations of the NetSolve Grid Computing System,” Concurrency: Practice and Experience, vol. 14, no. 13-15, pp. 1457-1479, January 2002.

Arnold, D., and J. Dongarra, “The NetSolve Environment: Progressing Towards the Seamless Grid,” 2000 International Conference on Parallel Processing (ICPP-2000), Toronto, Canada, August 2000.

Arnold, D., S. Vadhiyar, and J. Dongarra, “On the Convergence of Computational and Data Grids,” Parallel Processing Letters, vol. 11, no. 2-3, pp. 187-202, January 2001.

Arnold, D., and J. Dongarra, “Developing an Architecture to Support the Implementation and Development of Scientific Computing Applications,” to appear in Proceedings of Working Conference 8: Software Architecture for Scientific Computing Applications, Ottawa, Canada, October 2000.

Archibald, R., E. Chow, E. D'Azevedo, J. Dongarra, M. Eisenbach, R. Febbo, F. Lopez, D. Nichols, S. Tomov, K. Wong, et al., “Integrating Deep Learning in Domain Sciences at Exascale,” 2020 Smoky Mountains Computational Sciences and Engineering Conference (SMC 2020), August 2020.

Archibald, R., E. Chow, E. D'Azevedo, J. Dongarra, M. Eisenbach, R. Febbo, F. Lopez, D. Nichols, S. Tomov, K. Wong, et al., “Integrating Deep Learning in Domain Sciences at Exascale,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-10: University of Tennessee, August 2020.

Arbenz, P., A. Cleary, J. Dongarra, and M. Hegland, “A Comparison of Parallel Solvers for General Narrow Banded Linear Systems,” Parallel and Distributed Computing Practices, vol. 2, pp. 385-400, October 2002.

Arbenz, P., A. Cleary, J. Dongarra, and M. Hegland, “A Comparison of Parallel Solvers for General Narrow Banded Linear Systems (LAPACK Working Note 142),” University of Tennessee Computer Science Technical Report, no. UT-CS-99-414, January 1999.

Arbenz, P., A. Cleary, J. Dongarra, and M. Hegland, “A Comparison of Parallel Solvers for Diagonally Dominant and General Narrow Banded Linear Systems II (LAPACK Working Note 143),” University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-415, January 1999.

Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Variable-Size Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors,” Parallel Computing, vol. 81, pp. 131-146, January 2019.

Anzt, H., J. Dongarra, G. Flegar, and T. Gruetzmacher, “Variable-Size Batched Condition Number Calculation on GPUs,” SBAC-PAD, Lyon, France, September 2018.

Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems,” Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (Best Paper), Rhodes Island, Greece, August 2012.

Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, October 12, 2015.

Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, IL, IEEE, May 2016.

Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” EuroPar 2012 (also LAWN 260), Rhodes Island, Greece, August 2012.

Anzt, H., M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, and Y. Wang, “Accelerating the Conjugate Gradient Algorithm with GPU in CFD Simulations,” VECPAR, 2016.

Anzt, H., E. Ponce, G. D. Peterson, and J. Dongarra, “GPU-accelerated Co-design of Induced Dimension Reduction: Algorithmic Fusion and Kernel Overlap,” 2nd International Workshop on Hardware-Software Co-Design for High Performance Computing, Austin, TX, ACM, November 2015.

Anzt, H., T. Ribizel, G. Flegar, E. Chow, and J. Dongarra, “ParILUT – A Parallel Threshold ILU for GPUs,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.

Anzt, H., J. Dongarra, M. Gates, A. Haidar, K. Kabir, P. Luszczek, S. Tomov, and I. Yamazaki, MAGMA MIC: Optimizing Linear Algebra for Intel Xeon Phi, Frankfurt, Germany, ISC High Performance (ISC15), Intel Booth Presentation, June 2015.

Anzt, H., T. Cojean, Y-C. Chen, F. Goebel, T. Gruetzmacher, P. Nayak, T. Ribizel, and Y-H. Tsai, “Ginkgo: A High Performance Numerical Linear Algebra Library,” Journal of Open Source Software, vol. 5, issue 52, August 2020.

Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “A Block-Asynchronous Relaxation Method for Graphics Processing Units,” University of Tennessee Computer Science Technical Report, no. UT-CS-11-687 / LAWN 258, November 2011.

Anzt, H., E. Chow, and J. Dongarra, “On block-asynchronous execution on GPUs,” LAPACK Working Note, no. 291, November 2016.

Anzt, H., N. Beams, T. Cojean, F. Göbel, T. Grützmacher, A. Kashi, P. Nayak, T. Ribizel, and Y. M. Tsai, Ginkgo: A Sparse Linear Algebra Library for HPC, 2021 ECP Annual Meeting, April 2021.

Anzt, H., Y. M. Tsai, A. Abdelfattah, T. Cojean, and J. Dongarra, “Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse and Batched Computations,” 2020 IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS): IEEE, November 2020.

Anzt, H., M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra, “Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs,” The International Journal of High Performance Computing Applications, vol. 32, no. 2, pp. 220–230, March 2018.

Anzt, H., J. Dongarra, G. Flegar, N. J. Higham, and E. S. Quintana-Orti, “Adaptive Precision in Block-Jacobi Preconditioning for Iterative Sparse Linear System Solvers,” Concurrency and Computation: Practice and Experience, vol. 31, no. 6, pp. e4460, March 2019.

Anzt, H., and J. Dongarra, “A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs,” SBAC-PAD, Lyon, France, IEEE, 2018.

Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs,” Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, New York, NY, USA, ACM, pp. 1–10, February 2017.

Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Orti, Flexible Batched Sparse Matrix Vector Product on GPUs, Denver, Colorado, ScalA'17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, November 2017.

Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 683-691, May 2016.

Anzt, H., J. Dongarra, and E. S. Quintana-Orti, “Adaptive Precision Solvers for Sparse Linear Systems,” 3rd International Workshop on Energy Efficient Supercomputing (E2SC '15), Austin, TX, ACM, November 2015.

Thiyagalingam, J., G. von Laszewski, J. Yin, M. Emani, J. Papay, G. Barrett, P. Luszczek, A. Tsaris, C. Kirkpatrick, F. Wang, et al., “AI Benchmarking for Science: Efforts from the MLCommons Science Working Group,” Lecture Notes in Computer Science, vol. 13387: Springer International Publishing, pp. 47-64, January 2023.

Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, “Experiences in autotuning matrix multiplication for energy minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, December 2015.

Anzt, H., E. Chow, and J. Dongarra, “ParILUT - A New Parallel Threshold ILU,” SIAM Journal on Scientific Computing, vol. 40, issue 4: SIAM, pp. C503–C519, July 2018.
