Publications

Filters: Author is Jack Dongarra
2010
Du, P., R. Weber, P. Luszczek, S. Tomov, G. D. Peterson, and J. Dongarra, “From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming,” Parallel Computing (submitted), 2010.
Du, P., R. Weber, P. Luszczek, S. Tomov, G. D. Peterson, and J. Dongarra, “From CUDA to OpenCL: Towards a Performance-portable Solution for Multiplatform GPU Programming,” Parallel Computing (submitted), August 2010.
Ltaeif, H., S. Tomov, R. Nath, and J. Dongarra, “Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators,” IEEE Transactions on Parallel and Distributed Systems (submitted), March 2010.
Nath, R., S. Tomov, and J. Dongarra, “An Improved MAGMA GEMM for Fermi GPUs,” International Journal of High Performance Computing, vol. 24, no. 4, pp. 511-515, 2010.
Nath, R., S. Tomov, and J. Dongarra, “An Improved MAGMA GEMM for Fermi GPUs,” University of Tennessee Computer Science Technical Report, no. UT-CS-10-655 (also LAPACK working note 227), July 2010.
Turchenko, V., L. Grandinetti, G. Bosilca, and J. Dongarra, “Improvement of parallelization efficiency of batch pattern BP training algorithm using Open MPI,” Proceedings of International Conference on Computational Science, ICCS 2010 (to appear), Amsterdam, The Netherlands, Elsevier, June 2010.
Dongarra, J., and P. Beckman, “International Exascale Software Project Roadmap v1.0,” University of Tennessee Computer Science Technical Report, UT-CS-10-654, May 2010.
Ma, T., G. Bosilca, A. Bouteiller, B. Goglin, J. Squyres, and J. Dongarra, “Kernel Assisted Collective Intra-node Communication Among Multicore and Manycore CPUs,” University of Tennessee Computer Science Technical Report, UT-CS-10-663, November 2010.
Gustavson, F. G., J. Wasniewski, and J. Dongarra, “Level-3 Cholesky Kernel Subroutine of a Fully Portable High Performance Minimal Storage Hybrid Format Cholesky Algorithm,” ACM TOMS (submitted), also LAPACK Working Note (LAWN) 211, 2010.
Dongarra, J., “LINPACK on Future Manycore and GPU Based Systems,” PARA 2010, Reykjavik, Iceland, June 2010.
Ma, T., A. Bouteiller, G. Bosilca, and J. Dongarra, “Locality and Topology aware Intra-node Communication Among Multicore CPUs,” Proceedings of the 17th EuroMPI conference, Stuttgart, Germany, LNCS, September 2010.
Du, P., P. Luszczek, S. Tomov, and J. Dongarra, “Mixed-Tool Performance Analysis on Hybrid Multicore Architectures,” First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), San Diego, CA, September 2010.
Du, P., P. Luszczek, and J. Dongarra, “OpenCL Evaluation for Numerical Linear Algebra Library Development,” Symposium on Application Accelerators in High-Performance Computing (SAAHPC '10), Knoxville, TN, July 2010.
Ltaeif, H., J. Kurzak, and J. Dongarra, “Parallel Band Two-Sided Matrix Bidiagonalization for Multicore Architectures,” IEEE Transactions on Parallel and Distributed Systems, pp. 417-423, April 2010.
Tomov, S., W. Lu, J. Bernholc, S. Moore, and J. Dongarra, “Performance Evaluation for Petascale Quantum Simulation Tools,” Proceedings of the Cray Users' Group Meeting, Atlanta, GA, May 2010.
Dongarra, J., “Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),” University of Tennessee Computer Science Technical Report, UT-CS-89-85, 2010.
“Proceedings of the International Conference on Computational Science,” ICCS 2010, Amsterdam, Elsevier, May 2010.
Agullo, E., C. Coti, T. Herault, J. Langou, S. Peyronnet, A. Rezmerita, F. Cappello, and J. Dongarra, “QCG-OMPI: MPI Applications on Grids,” Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, March 2010.
Kurzak, J., and J. Dongarra, “QR Factorization for the CELL Processor,” Scientific Programming, vol. 17, no. 1-2, pp. 31-42, 2010.
Agullo, E., C. Coti, J. Dongarra, T. Herault, and J. Langou, “QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment,” 24th IEEE International Parallel and Distributed Processing Symposium (also LAWN 224), Atlanta, GA, April 2010.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, H. Ltaeif, S. Thibault, and S. Tomov, “QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,” Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.
“Recent Advances in the Message Passing Interface, Lecture Notes in Computer Science (LNCS),” EuroMPI 2010 Proceedings, vol. 6305, Stuttgart, Germany, Springer, September 2010.
Gustavson, F. G., J. Wasniewski, J. Dongarra, and J. Langou, “Rectangular Full Packed Format for Cholesky’s Algorithm: Factorization, Solution, and Inversion,” ACM Transactions on Mathematical Software (TOMS), vol. 37, no. 2, Atlanta, GA, April 2010.
Gustavson, F. G., J. Wasniewski, J. Dongarra, and J. Langou, “Rectangular Full Packed Format for Cholesky's Algorithm: Factorization, Solution and Inversion,” ACM Transactions on Mathematical Software (TOMS), vol. 37, no. 2, April 2010.
Bouteiller, A., G. Bosilca, and J. Dongarra, “Redesigning the Message Logging Model for High Performance,” Concurrency and Computation: Practice and Experience (online version), June 2010.
Dongarra, J., and P. Luszczek, “Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modelling,” University of Tennessee Computer Science Technical Report, no. UT-CS-10-661, October 2010.
Ltaeif, H., S. Tomov, R. Nath, P. Du, and J. Dongarra, “A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators,” Proc. of VECPAR'10 (to appear), Berkeley, CA, June 2010.
Song, F., H. Ltaeif, B. Hadri, and J. Dongarra, “Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems,” SC'10, New Orleans, LA, ACM SIGARCH/IEEE Computer Society, November 2010.
Song, F., H. Ltaeif, B. Hadri, and J. Dongarra, “Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems,” University of Tennessee Computer Science Technical Report, vol. –10-653, April 2010.
Kurzak, J., H. Ltaeif, J. Dongarra, and R. M. Badia, “Scheduling Dense Linear Algebra Operations on Multicore Processors,” Concurrency and Computation: Practice and Experience, vol. 22, no. 1, pp. 15-44, January 2010.
Ltaeif, H., J. Kurzak, J. Dongarra, and R. M. Badia, “Scheduling Two-sided Transformations using Tile Algorithms on Multicore Architectures,” Journal of Scientific Computing, vol. 18, no. 1, pp. 33-50, 2010.
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac-Grbovic, and J. Dongarra, “Self-Healing Network for Scalable Fault-Tolerant Runtime Environments,” Future Generation Computer Systems, vol. 26, no. 3, pp. 479-485, March 2010.
Brady, T., A. Lastovetsky, K. Seymour, M. Guidolin, and J. Dongarra, “SmartGridRPC: The new RPC model for high performance Grid Computing and Its Implementation in SmartGridSolve,” Concurrency and Computation: Practice and Experience (to appear), January 2010.
Hadri, B., E. Agullo, and J. Dongarra, “Tile QR Factorization with Parallel Panel Processing for Multicore Architectures,” 24th IEEE International Parallel and Distributed Processing Symposium (submitted), 2010.
Tomov, S., J. Dongarra, and M. Baboulin, “Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems,” Parallel Computing, vol. 36, no. 5-6, pp. 232-240, 2010.
Jagode, H., A. Knuepfer, J. Dongarra, M. Jurenz, M. S. Mueller, and W. E. Nagel, “Trace-based Performance Analysis for the Petascale Simulation Code FLASH,” International Journal of High Performance Computing Applications (to appear), 2010.
Du, P., M. Parsons, E. Fuentes, S-L. Shaw, and J. Dongarra, “Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures,” FOSS4G 2010, Barcelona, Spain, September 2010.
Tomov, S., M. Faverge, P. Luszczek, and J. Dongarra, “Using MAGMA with PGI Fortran,” PGI Insider, November 2010.
2009
Tomov, S., and J. Dongarra, “Accelerating the Reduction to Upper Hessenberg Form Through Hybrid GPU-based Computing,” University of Tennessee Computer Science Technical Report, UT-CS-09-642 (also LAPACK Working Note 219), May 2009.
Demmel, J., J. Dongarra, A. Fox, S. Williams, V. Volkov, and K. Yelick, “Accelerating Time-To-Solution for Computational Science and Engineering,” SciDAC Review, 2009.
Dongarra, J., G. Bosilca, R. Delmas, and J. Langou, “Algorithmic Based Fault Tolerance Applied to High Performance Computing,” Journal of Parallel and Distributed Computing, vol. 69, pp. 410-416, 2009.
Song, F., S. Moore, and J. Dongarra, “Analytical Modeling and Optimization for Affinity Based Thread Scheduling on Multicore Systems,” IEEE Cluster 2009, New Orleans, August 2009.
Buttari, A., J. Langou, J. Kurzak, and J. Dongarra, “A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures,” Parallel Computing, vol. 35, pp. 38-53, 2009.
Agullo, E., B. Hadri, H. Ltaeif, and J. Dongarra, “Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware,” 2009 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '09) (to appear), 2009.
“Computational Science – ICCS 2009, Proceedings of the 9th International Conference,” Lecture Notes in Computer Science: Theoretical Computer Science and General Issues, vol. -, no. 5544-5545, Baton Rouge, LA, May 2009.
Baboulin, M., J. Dongarra, S. Gratton, and J. Langou, “Computing the Conditioning of the Components of a Linear Least-squares Solution,” Numerical Linear Algebra with Applications, vol. 16, no. 7, pp. 517-533, 2009.
Bosilca, G., C. Coti, T. Herault, P. Lemariner, and J. Dongarra, “Constructing resilient communication infrastructure for runtime environments,” Innovative Computing Laboratory Technical Report, no. ICL-UT-09-02, July 2009.
Lemariner, P., G. Bosilca, C. Coti, T. Herault, and J. Dongarra, “Constructing Resilient Communication Infrastructure for Runtime Environments,” ParCo 2009, Lyon, France, September 2009.
Kurzak, J., H. Ltaeif, J. Dongarra, and R. M. Badia, “Dependency-Driven Scheduling of Dense Matrix Factorizations on Shared-Memory Systems,” PPAM 2009, Poland, September 2009.
Song, F., A. YarKhan, and J. Dongarra, “Dynamic Task Scheduling for Linear Algebra Algorithms on Distributed-Memory Multicore Systems,” International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '09), Portland, OR, November 2009.
