Publications

Export 995 results:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
A
Anzt, H., Y-C. Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Orti, Y. M. Tsai, W. Wang, and , Towards Continuous Benchmarking,” the Platform for Advanced Scientific Computing ConferenceProceedings of the Platform for Advanced Scientific Computing Conference on - PASC '19, Zurich, SwitzerlandNew York, New York, USA, ACM Press, 2019.  (1.51 MB)
Anzt, H., J. Dongarra, G. Flegar, and T. Gruetzmacher, Variable-Size Batched Condition Number Calculation on GPUs,” SBAC-PAD, Lyon, France, September 2018.  (509.3 KB)
Anzt, H., E. Chow, D. Szyld, and J. Dongarra, Random-Order Alternating Schwarz for Sparse Triangular Solves,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.  (1.53 MB)
Anzt, H., J. Dongarra, and V. Heuveline, Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems,” SIAM Journal on Computing (submitted), March 2012.  (811.01 KB)
Anzt, H., E. Chow, and J. Dongarra, Iterative Sparse Triangular Solves for Preconditioning,” EuroPar 2015, Vienna, Austria, Springer Berlin, August 2015.  (322.36 KB)
Anzt, H., E. Chow, D. Szyld, and J. Dongarra, Domain Overlap for Iterative Sparse Triangular Solves on GPUs,” Software for Exascale Computing - SPPEXA, vol. 113: Springer International Publishing, pp. 527–545, September 2016.
Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems , no. UT-CS-11-689, December 2011.  (608.95 KB)
Anzt, H., S. Tomov, and J. Dongarra, Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-727: University of Tennessee, April 2014.  (578.11 KB)
Anzt, H., E. Boman, J. Dongarra, G. Flegar, M. Gates, M. Heroux, M. Hoemmen, J. Kurzak, P. Luszczek, S. Rajamanickam, et al., MAGMA-sparse Interface Design Whitepaper,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.  (1.28 MB)
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, A Block-Asynchronous Relaxation Method for Graphics Processing Units,” Journal of Parallel and Distributed Computing, vol. 73, issue 12, pp. 1613–1626, December 2013.  (1.08 MB)
Anzt, H., S. Tomov, and J. Dongarra, On the performance and energy efficiency of sparse linear algebra on GPUs,” International Journal of High Performance Computing Applications, October 2016.
Anzt, H., T. Huckle, J. Bräckle, and J. Dongarra, Incomplete Sparse Approximate Inverses for Parallel Preconditioning,” Parallel Computing, vol. 71, pp. 1–22, January 2018.
Anzt, H., M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra, Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs,” The International Journal of High Performance Computing Applications, vol. 32, no. 2, pp. 220–230, March 2018.  (2.08 MB)
Anzt, H., G. Flegar, T. Grützmacher, and E. S. Quintana-Ortí, Toward a Modular Precision Ecosystem for High-Performance Computing,” The International Journal of High Performance Computing Applications, September 2019.  (1.93 MB)
Anzt, H., and J. Dongarra, A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs,” SBAC-PAD, 2018.  (237.68 KB)
Anzt, H., E. Chow, J. Saak, and J. Dongarra, Updating Incomplete Factorization Preconditioners for Model Order Reduction,” Numerical Algorithms, vol. 73, issue 3, no. 3, pp. 611–630, February 2016.  (565.34 KB)
Anzt, H., S. Tomov, and J. Dongarra, Energy Efficiency and Performance Frontiers for Sparse Computations on GPU Supercomputers,” Sixth International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM '15), San Francisco, CA, ACM, February 2015.  (2.29 MB)
Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Flexible Batched Sparse Matrix Vector Product on GPUs , Denver, Colorado, ScalA'17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, November 2017.  (16.8 MB)
Anzt, H., and G. Flegar, Are we Doing the Right Thing? — A Critical Analysis of the Academic HPC Community,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, 2019.  (622.32 KB)
Anzt, H., S. Tomov, and J. Dongarra, Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product,” Spring Simulation Multi-Conference 2015 (SpringSim'15), Alexandria, VA, SCS, April 2015.  (1.46 MB)
Anzt, H., W. Sawyer, S. Tomov, P. Luszczek, and J. Dongarra, Acceleration of GPU-based Krylov solvers via Data Transfer Reduction,” International Journal of High Performance Computing Applications, 2015.
Anzt, H., E. Chow, and J. Dongarra, ParILUT - A New Parallel Threshold ILU,” SIAM Journal on Scientific Computing, vol. 40, issue 4: SIAM, pp. C503–C519, July 2018.  (19.26 MB)
Anzt, H., S. Tomov, and J. Dongarra, Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product,” University of Tennessee Computer Science Technical Report, no. UT-EECS-14-731: University of Tennessee, October 2014.  (1.83 MB)
Anzt, H., , and E. S. Quintana-Ortí, Fine-grained Bit-Flip Protection for Relaxation Methods,” Journal of Computational Science, November 2016.
Anzt, H., J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, Bringing High Performance Computing to Big Data Algorithms,” Handbook of Big Data Technologies: Springer, 2017.
Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs,” Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096 - 5113, Oct-December 2015.  (1.99 MB)
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems,” Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (Best Paper), Rhodes Island, Greece, August 2012.  (764.02 KB)
Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” EuroPar 2012 (also LAWN 260), Rhodes Island, Greece, August 2012.  (662.98 KB)
Anzt, H., M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, and Y. Wang, Accelerating the Conjugate Gradient Algorithm with GPU in CFD Simulations,” VECPAR, 2016.
Anzt, H., I. Yamazaki, M. Hoemmen, E. Boman, and J. Dongarra, Solver Interface & Performance on Cori,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-05: University of Tennessee, June 2018.  (188.05 KB)
Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, Efficiency of General Krylov Methods on GPUs – An Experimental Study,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, IL, IEEE, May 2016.  (285.28 KB)
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, Self-Healing Network for Scalable Fault Tolerant Runtime Environments,” DAPSYS 2006, 6th Austrian-Hungarian Workshop on Distributed and Parallel Systems, Innsbruck, Austria, January 2006.  (162.83 KB)
Angskun, T., G. Bosilca, and J. Dongarra, Self-Healing in Binomial Graph Networks,” 2nd International Workshop On Reliability in Decentralized Distributed Systems (RDDS 2007), Vilamoura, Algarve, Portugal, November 2007.  (322.39 KB)
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, Self-Healing Network for Scalable Fault-Tolerant Runtime Environments,” Future Generation Computer Systems, vol. 26, no. 3, pp. 479-485, March 2010.  (1.54 MB)
Angskun, T., G. Bosilca, and J. Dongarra, Binomial Graph: A Scalable and Fault- Tolerant Logical Network Topology,” Proceedings of The Fifth International Symposium on Parallel and Distributed Processing and Applications (ISPA07), Niagara Falls, Canada, Springer, August 2007.  (480.47 KB)
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, Scalable Fault Tolerant Protocol for Parallel Runtime Environments,” 2006 Euro PVM/MPI, no. ICL-UT-06-12, Bonn, Germany, 00-2006.  (149.07 KB)
Angskun, T., G. Bosilca, G. Fagg, J. Pjesivac–Grbovic, and J. Dongarra, Reliability Analysis of Self-Healing Network using Discrete-Event Simulation,” Proceedings of Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07): IEEE Computer Society, pp. 437-444, May 2007.
Angskun, T., G. Bosilca, B. Vander Zanden, and J. Dongarra, Optimal Routing in Binomial Graph Networks,” The International Conference on Parallel and Distributed Computing, applications and Technologies (PDCAT), Adelaide, Australia, IEEE Computer Society, December 2007.
Andersson, U., and P. Mucci, Analysis and Optimization of Yee_Bench using Hardware Performance Counters,” Proceedings of Parallel Computing 2005 (ParCo) (to appear), Malaga, Spain, January 2005.  (72.27 KB)
Anderson, E., Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, et al., LAPACK Users' Guide, 3rd ed.,” Philadelphia: Society for Industrial and Applied Mathematics, January 1999.
Alvaro, W., J. Kurzak, and J. Dongarra, Optimizing Matrix Multiplication for a Short-Vector SIMD Architecture - CELL Processor,” Parallel Computing, vol. 35, pp. 138-150, 00-2009.  (591.16 KB)
Alvaro, W., J. Kurzak, and J. Dongarra, Fast and Small Short Vector SIMD Matrix Multiplication Kernels for the CELL Processor,” University of Tennessee Computer Science Technical Report, no. UT-CS-08-609, (also LAPACK Working Note 189), January 2008.  (500.99 KB)
Computational Science – ICCS 2009, Proceedings of the 9th International Conference,” Lecture Notes in Computer Science: Theoretical Computer Science and General Issues, vol. -, no. 5544-5545, Baton Rouge, LA, May 2009.
Aliaga, J. I., H. Anzt, M. Castillo, J. C. Fernández, G. León, J. Pérez, and E. S. Quintana-Ortí, Unveiling the Performance-energy Trade-off in Iterative Linear System Solvers for Multithreaded Processors,” Concurrency and Computation: Practice and Experience, vol. 27, issue 4, pp. 885-904, September 2014.  (1.83 MB)
Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Ortí, Flexible Batched Sparse Matrix-Vector Product on GPUs,” Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '17), Denver, Colorado, ACM Press, November 2017.
Alam, S., R. F. Barrett, H. Jagode, J. A.. Kuehn, S. W. Poole, and R.. Sankaran, Impact of Quad-core Cray XT4 System and Software Stack on Scientific Computation,” Euro-Par 2009, Lecture Notes in Computer Science, vol. 5704/2009, Delft, The Netherlands, Springer Berlin / Heidelberg, pp. 334-344, August 2009.  (312.74 KB)
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaeif, P. Luszczek, R. Nath, S. Tomov, et al., Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects , Portland, OR, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC09), November 2009.  (3.53 MB)
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, S. Lanteri, and J. Roman, Algebraic Schwarz Preconditioning for the Schur Complement: Application to the Time-Harmonic Maxwell Equations Discretized by a Discontinuous Galerkin Method.,” The Twentieth International Conference on Domain Decomposition Methods, La Jolla, California, February 2011.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, H. Ltaeif, S. Thibault, and S. Tomov, QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,” Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.  (468.17 KB)
Agullo, E., C. Coti, J. Dongarra, T. Herault, and J. Langou, QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment,” 24th IEEE International Parallel and Distributed Processing Symposium (also LAWN 224), Atlanta, GA, April 2010.  (261.55 KB)

Pages