Publications

2012
Baboulin, M., S. Donfack, J. Dongarra, L. Grigori, A. Rémy, and S. Tomov, “A Class of Communication-Avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines,” Proc. of the International Conference on Computational Science (ICCS), vol. 9, pp. 17-26, June 2012.
Voemel, C., S. Tomov, and J. Dongarra, “Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems,” SIAM Journal on Scientific Computing, vol. 34(2), pp. C70-C82, April 2012.
Song, F., S. Tomov, and J. Dongarra, “Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems,” 26th ACM International Conference on Supercomputing (ICS 2012), San Servolo Island, Venice, Italy, ACM, June 2012.
Yamazaki, I., S. Tomov, and J. Dongarra, “One-Sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators,” The International Conference on Computational Science (ICCS), June 2012.
Kasichayanula, K., D. Terpstra, P. Luszczek, S. Tomov, S. Moore, and G. D. Peterson, “Power Aware Computing on GPUs,” SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012.
Song, F., and J. Dongarra, “A Scalable Framework for Heterogeneous GPU-Based Clusters,” The 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2012), Pittsburgh, PA, USA, ACM, June 2012.
2011
Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, “Accelerating Linear System Solutions Using Randomization Techniques,” INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 2011.
Kurzak, J., S. Tomov, and J. Dongarra, “Autotuning GEMMs for Fermi,” University of Tennessee Computer Science Technical Report, UT-CS-11-671 (also LAWN 245), April 2011.
Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, “Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,” no. UT-CS-11-689, December 2011.
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “A Block-Asynchronous Relaxation Method for Graphics Processing Units,” University of Tennessee Computer Science Technical Report, no. UT-CS-11-687 / LAWN 258, November 2011.
Horton, M., S. Tomov, and J. Dongarra, “A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures,” Symposium for Application Accelerators in High Performance Computing (SAAHPC'11), Knoxville, TN, July 2011.
Song, F., S. Tomov, and J. Dongarra, “Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-11-668 (also LAWN 250), June 2011.
Anzt, H., P. Luszczek, J. Dongarra, and V. Heuveline, “GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,” University of Tennessee Computer Science Technical Report UT-CS-11-690 (also LAWN 260), December 2011.
Dongarra, J., M. Faverge, T. Herault, J. Langou, and Y. Robert, “Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,” University of Tennessee Computer Science Technical Report (also LAWN 257), no. UT-CS-11-684, October 2011.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst, S. Thibault, and S. Tomov, “A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs,” in GPU Computing Gems, Jade Edition, vol. 2: Elsevier, pp. 473-484, 2011.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, J. Langou, H. Ltaief, and S. Tomov, “LU Factorization for Accelerator-Based Systems,” IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December 2011.
Nath, R., S. Tomov, T. Dong, and J. Dongarra, “Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs,” ACM/IEEE Conference on Supercomputing (SC’11), Seattle, WA, November 2011.
Malony, A. D., S. Biersdorff, S. Shende, H. Jagode, S. Tomov, G. Juckeland, R. Dietrich, D. Poole, and C. Lamb, “Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs,” International Conference on Parallel Processing (ICPP'11), Taipei, Taiwan, ACM, September 2011. DOI: 10.1109/ICPP.2011.71
Bosilca, G., A. Bouteiller, T. Herault, P. Lemarinier, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, “Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,” IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.
YarKhan, A., J. Kurzak, and J. Dongarra, “QUARK Users' Guide: QUeueing And Runtime for Kernels,” University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-02, 2011.
Du, P., P. Luszczek, S. Tomov, and J. Dongarra, “Soft Error Resilient QR Factorization for Hybrid System,” UT-CS-11-675 (also LAPACK Working Note #252), no. ICL-CS-11-675, July 2011.
2010
Nath, R., S. Tomov, and J. Dongarra, “Accelerating GPU Kernels for Dense Linear Algebra,” Proc. of VECPAR'10, Berkeley, CA, June 2010.
Tomov, S., R. Nath, and J. Dongarra, “Accelerating the Reduction to Upper Hessenberg, Tridiagonal, and Bidiagonal Forms through Hybrid GPU-Based Computing,” Parallel Computing, vol. 36, no. 12, pp. 645-654, 2010.
Voemel, C., S. Tomov, and J. Dongarra, “Divide & Conquer on Hybrid GPU-Accelerated Multicore Systems,” SIAM Journal on Scientific Computing (submitted), August 2010.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst, S. Thibault, and S. Tomov, “Faster, Cheaper, Better - A Hybridization Methodology to Develop Linear Algebra Software for GPUs,” LAPACK Working Note, no. 230, 2010.
Ltaief, H., S. Tomov, R. Nath, and J. Dongarra, “Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators,” IEEE Transactions on Parallel and Distributed Systems (submitted), March 2010.
Nath, R., S. Tomov, and J. Dongarra, “An Improved MAGMA GEMM for Fermi GPUs,” International Journal of High Performance Computing, vol. 24, no. 4, pp. 511-515, 2010.
Nath, R., S. Tomov, and J. Dongarra, “An Improved MAGMA GEMM for Fermi GPUs,” University of Tennessee Computer Science Technical Report, no. UT-CS-10-655 (also LAPACK Working Note 227), July 2010.
Du, P., P. Luszczek, S. Tomov, and J. Dongarra, “Mixed-Tool Performance Analysis on Hybrid Multicore Architectures,” First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), San Diego, CA, September 2010.
Du, P., P. Luszczek, and J. Dongarra, “OpenCL Evaluation for Numerical Linear Algebra Library Development,” Symposium on Application Accelerators in High-Performance Computing (SAAHPC '10), Knoxville, TN, July 2010.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, H. Ltaief, S. Thibault, and S. Tomov, “QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,” Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.
Ltaief, H., S. Tomov, R. Nath, P. Du, and J. Dongarra, “A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators,” Proc. of VECPAR'10 (to appear), Berkeley, CA, June 2010.
Tomov, S., J. Dongarra, and M. Baboulin, “Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems,” Parallel Computing, vol. 36, no. 5-6, pp. 232-240, 2010.
Du, P., M. Parsons, E. Fuentes, S-L. Shaw, and J. Dongarra, “Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures,” FOSS4G 2010, Barcelona, Spain, September 2010.
Tomov, S., M. Faverge, P. Luszczek, and J. Dongarra, “Using MAGMA with PGI Fortran,” PGI Insider, November 2010.
2009
Tomov, S., and J. Dongarra, “Accelerating the Reduction to Upper Hessenberg Form through Hybrid GPU-Based Computing,” University of Tennessee Computer Science Technical Report, UT-CS-09-642 (also LAPACK Working Note 219), May 2009.
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov, “Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects,” Journal of Physics: Conference Series, vol. 180, 2009.
2008
Dongarra, J., S. Moore, G. D. Peterson, S. Tomov, J. Allred, V. Natoli, and D. Richie, “Exploring New Architectures in Accelerating CFD for Air Force Applications,” Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, January 2008.
Baboulin, M., J. Dongarra, and S. Tomov, “Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-08-615 (also LAPACK Working Note 200), January 2008.
Baboulin, M., S. Tomov, and J. Dongarra, “Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,” PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim, Norway, May 2008.
Tomov, S., J. Dongarra, and M. Baboulin, “Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems,” University of Tennessee Computer Science Technical Report, UT-CS-08-632 (also LAPACK Working Note 210), January 2008.