Publications

2019

Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, “Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019.

2018

Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, “Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-09: Innovative Computing Laboratory, University of Tennessee, September 2018.

2017

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched,” Atlanta, GA, SIAM Conference on Computational Science and Engineering (SIAM CSE17), Presentation, March 2017.

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, “Small Tensor Operations on Advanced Architectures for High-Order Applications,” University of Tennessee Computer Science Technical Report, no. UT-EECS-17-749: Innovative Computing Laboratory, University of Tennessee, April 2017.

Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Solving Dense Symmetric Indefinite Systems using GPUs,” Concurrency and Computation: Practice and Experience, vol. 29, issue 9, March 2017.

2016

Anzt, H., M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, and Y. Wang, “Accelerating the Conjugate Gradient Algorithm with GPU in CFD Simulations,” VECPAR, 2016.

Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures,” Lecture Notes in Computer Science, vol. 9573: Springer International Publishing, pp. 86-95, 2016 (presented September 2015).

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “High-Performance Tensor Contractions for GPUs,” International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “High-Performance Tensor Contractions for GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-738: University of Tennessee, January 2016.

2015

Baboulin, M., V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, “Towards a High-Performance Tensor Algebra Package for Accelerators,” Gatlinburg, TN, Smoky Mountains Computational Sciences and Engineering Conference (SMC15), September 2015.

2014

Baboulin, M., J. Dongarra, and R. Lacroix, “Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems,” International Interdisciplinary Conference on Applied Mathematics, Modeling and Computational Science (AMMCS), Waterloo, Ontario, Canada, August 2014.

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,” Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.

2013

Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, “Accelerating Linear System Solutions Using Randomization Techniques,” ACM Transactions on Mathematical Software (also LAWN 246), vol. 39, issue 2, February 2013.

Wang, Y., M. Baboulin, J. Falcou, Y. Fraigneau, and O. Le Maître, “A Parallel Solver for Incompressible Fluid Flows,” International Conference on Computational Science (ICCS 2013), Barcelona, Spain, Elsevier B.V., June 2013.

2012

Baboulin, M., S. Donfack, J. Dongarra, L. Grigori, A. Remy, and S. Tomov, “A Class of Communication-Avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines,” Proc. of the International Conference on Computational Science (ICCS), vol. 9, pp. 17-26, June 2012.

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An efficient distributed randomized solver with application to large dense linear systems,” ICL Technical Report, no. ICL-UT-12-02, July 2012.

Baboulin, M., D. Becker, and J. Dongarra, “A Parallel Tiled Solver for Symmetric Indefinite Systems On Multicore Architectures,” IPDPS 2012, Shanghai, China, May 2012.

Becker, D., M. Baboulin, and J. Dongarra, “Reducing the Amount of Pivoting in Symmetric Indefinite Systems,” Parallel Processing and Applied Mathematics, Lecture Notes in Computer Science (PPAM 2011), vol. 7203: Springer-Verlag Berlin Heidelberg, pp. 133-142, 2012.

2011

Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, “Accelerating Linear System Solutions Using Randomization Techniques,” INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 2011.

Baboulin, M., D. Becker, and J. Dongarra, “A parallel tiled solver for dense symmetric indefinite systems on multicore architectures,” University of Tennessee Computer Science Technical Report, no. ICL-UT-11-07, October 2011.

Becker, D., M. Baboulin, and J. Dongarra, “Reducing the Amount of Pivoting in Symmetric Indefinite Systems,” University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-06, Knoxville, TN, Submitted to PPAM 2011, May 2011.

2010

Tomov, S., J. Dongarra, and M. Baboulin, “Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems,” Parallel Computing, vol. 36, no. 5-6, pp. 232-240, 2010.

2009

Baboulin, M., A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, “Accelerating Scientific Computations with Mixed Precision Algorithms,” Computer Physics Communications, vol. 180, issue 12, pp. 2526-2533, December 2009.

Baboulin, M., J. Dongarra, S. Gratton, and J. Langou, “Computing the Conditioning of the Components of a Linear Least-squares Solution,” Numerical Linear Algebra with Applications, vol. 16, no. 7, pp. 517-533, 2009.

2008

Baboulin, M., J. Dongarra, S. Gratton, and J. Langou, “Computing the Conditioning of the Components of a Linear Least Squares Solution,” VECPAR '08, High Performance Computing for Computational Science, Toulouse, France, January 2008.

Baboulin, M., J. Demmel, J. Dongarra, S. Tomov, and V. Volkov, “Enhancing the Performance of Dense Linear Algebra Solvers on GPUs (in the MAGMA Project),” Austin, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC08), November 2008.

Baboulin, M., S. Tomov, and J. Dongarra, “Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,” PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim, Norway, May 2008.

Baboulin, M., J. Dongarra, and S. Tomov, “Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,” University of Tennessee Computer Science Technical Report, UT-CS-08-615 (also LAPACK Working Note 200), January 2008.

Tomov, S., J. Dongarra, and M. Baboulin, “Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems,” University of Tennessee Computer Science Technical Report, UT-CS-08-632 (also LAPACK Working Note 210), January 2008.

Baboulin, M., and S. Gratton, “Using dual techniques to derive componentwise and mixed condition numbers for a linear functional of a linear least squares solution,” University of Tennessee Computer Science Technical Report, UT-CS-08-622 (also LAPACK Working Note 207), January 2008.

2007

Baboulin, M., J. Dongarra, S. Gratton, and J. Langou, “Computing the Conditioning of the Components of a Linear Least Squares Solution,” University of Tennessee Computer Science Technical Report, no. UT-CS-07-604 (also LAPACK Working Note 193), January 2007.
