Publications

Gates, M., A. Charara, A. YarKhan, D. Sukkari, M. Al Farhan, and J. Dongarra, “Performance Tuning SLATE,” SLATE Working Notes, no. 14, ICL-UT-20-01: Innovative Computing Laboratory, University of Tennessee, January 2020.

Bailey, D., J. Chame, C. Chen, J. Dongarra, M. Hall, J. K. Hollingsworth, P. D. Hovland, S. Moore, K. Seymour, J. Shin, et al., “PERI Auto-tuning,” Proc. SciDAC 2008, vol. 125, Seattle, Washington, Journal of Physics, January 2008.

Bouteiller, A., G. Bosilca, and J. Dongarra, “Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery,” 22nd European MPI Users' Group Meeting, Bordeaux, France, ACM, September 2015.

Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., “PLASMA 17 Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.

Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., “PLASMA 17.1 Functionality Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.

Luszczek, P., and J. Dongarra, The PLASMA Library on CORAL Systems and Beyond (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.

Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., “PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.

Kurzak, J., A. Buttari, P. Luszczek, and J. Dongarra, “The PlayStation 3 for High Performance Scientific Computing,” University of Tennessee Computer Science Technical Report, no. UT-CS-08-608, January 2008.

Kurzak, J., A. Buttari, P. Luszczek, and J. Dongarra, “The PlayStation 3 for High Performance Scientific Computing,” Computing in Science and Engineering, pp. 80-83, January 2008.

Castain, R., J. Hursey, A. Bouteiller, and D. Solt, “PMIx: Process Management for Exascale Environments,” Parallel Computing, vol. 79, pp. 9–29, January 2018.

Castain, R. H., D. Solt, J. Hursey, and A. Bouteiller, “PMIx: Process Management for Exascale Environments,” Proceedings of the 24th European MPI Users' Group Meeting, New York, NY, USA, ACM, pp. 14:1–14:10, 2017.

Eijkhout, V., “Polynomial Acceleration of Optimised Multi-grid Smoothers; Basic Theory,” ICL Technical Report, vol. 156, no. ICL-UT-02-03, January 2002.

Dongarra, J., A. Haidar, O. Hernandez, S. Tomov, and M G. Venkata, “POMPEI: Programming with OpenMP4 for Exascale Investigations,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.

Dongarra, J., M. Gates, A. Haidar, Y. Jia, K. Kabir, P. Luszczek, and S. Tomov, “Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,” PPAM 2013, Warsaw, Poland, September 2013.

Browne, S., J. Dongarra, N. Garner, K. London, and P. Mucci, “A Portable Programming Interface for Performance Evaluation on Modern Processors,” University of Tennessee Computer Science Technical Report, UT-CS-00-444, July 2000.

Browne, S., J. Dongarra, N. Garner, G. Ho, and P. Mucci, “A Portable Programming Interface for Performance Evaluation on Modern Processors,” The International Journal of High Performance Computing Applications, vol. 14, no. 3, pp. 189-204, September 2000.

Beck, M., R. Chawla, B. Dempsey, and T. Moore, “Portable Representation of Internet Content Channels in I2-DSI,” 4th Intl. Web Caching Workshop, San Diego, CA, March 1999.

Tsai, Y. M., T. Cojean, and H. Anzt, “Porting Sparse Linear Algebra to Intel GPUs,” Euro-Par 2021: Parallel Processing Workshops, vol. 13098, Lisbon, Portugal, Springer International Publishing, pp. 57-68, June 2022.

YarKhan, A., J. Kurzak, P. Luszczek, and J. Dongarra, “Porting the PLASMA Numerical Library to the OpenMP Standard,” International Journal of Parallel Programming, June 2016.

Bland, W., A. Bouteiller, T. Herault, G. Bosilca, and J. Dongarra, “Post-failure recovery of MPI communication capability: Design and rationale,” International Journal of High Performance Computing Applications, vol. 27, issue 3, pp. 244-254, January 2013.

Kasichayanula, K., D. Terpstra, P. Luszczek, S. Tomov, S. Moore, and G. D. Peterson, “Power Aware Computing on GPUs,” SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012.

Jagode, H., A. YarKhan, A. Danalis, and J. Dongarra, “Power Management and Event Verification in PAPI,” Tools for High Performance Computing 2015: Proceedings of the 9th International Workshop on Parallel Tools for High Performance Computing, September 2015, Dresden, Germany, Springer International Publishing, pp. 41-51, 2016.

McCraw, H., J. Ralph, A. Danalis, and J. Dongarra, “Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-04, Madrid, Spain, IEEE, September 2014.

Bosilca, G., J. Dongarra, and H. Ltaief, “Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems,” Third International Conference on Energy-Aware High Performance Computing, Hamburg, Germany, September 2012.

Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, “Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017.

Kasichayanula, K., H. You, S. Moore, S. Tomov, H. Jagode, and M. Johnson, Power-aware Computing on GPGPUs, Gatlinburg, TN, Fall Creek Falls Conference, Poster, September 2011.

Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, Power-Aware HPC on Intel Xeon Phi KNL Processors, Frankfurt, Germany, ISC High Performance (ISC17), Intel Booth Presentation, June 2017.

Lively, C., X. Wu, V. Taylor, S. Moore, H-C. Chang, C-Y. Su, and K. Cameron, “Power-Aware Prediction Models of Hybrid (MPI/OpenMP) Scientific Applications,” International Conference on Energy-Aware High Performance Computing (EnA-HPC 2011), Hamburg, Germany, September 2011.

Herault, T., A. Bouteiller, G. Bosilca, M. Gamell, K. Teranishi, M. Parashar, and J. Dongarra, “Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems: Formal Proof,” Innovative Computing Laboratory Technical Report, no. ICL-UT-15-01, April 2015.

Herault, T., A. Bouteiller, G. Bosilca, M. Gamell, K. Teranishi, M. Parashar, and J. Dongarra, “Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.

Anzt, H., M. Gates, J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Preconditioned Krylov Solvers on GPUs,” Parallel Computing, June 2017.

Aggarwal, I., P. Nayak, A. Kashi, and H. Anzt, “Preconditioners for Batched Iterative Linear Solvers on GPUs,” Smoky Mountains Computational Sciences and Engineering Conference, vol. 169075: Springer Nature Switzerland, pp. 38-53, January 2023.

Hunold, S., A. Bhatele, G. Bosilca, and P. Knees, “Predicting MPI Collective Communication Performance Using Machine Learning,” 2020 IEEE International Conference on Cluster Computing (CLUSTER), Kobe, Japan, IEEE, September 2020.

Zunger, A., A. Franceschetti, G. Bester, W. B. Jones, K. Kim, P. A. Graf, L-W. Wang, A. Canning, O. Marques, C. Voemel, et al., “Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures,” J. Phys.: Conf. Ser., vol. 46, doi:10.1088/1742-6596/46/1/040, pp. 292-298, January 2006.

Funk, Y., M. Götz, and H. Anzt, “Prediction of Optimal Solvers for Sparse Linear Systems Using Deep Learning,” 2022 SIAM Conference on Parallel Processing for Scientific Computing (PP), Philadelphia, PA, Society for Industrial and Applied Mathematics, pp. 14-24.

Kurzak, J., P. Luszczek, S. Tomov, and J. Dongarra, “Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler Architecture,” LAWN 267, 2012.

Dongarra, J., and J. Langou, “The Problem with the Linpack Benchmark Matrix Generator,” University of Tennessee Computer Science Technical Report, UT-CS-08-621 (also LAPACK Working Note 206), June 2008.

Langou, J., and J. Dongarra, “The Problem with the Linpack Benchmark Matrix Generator,” International Journal of High Performance Computing Applications, vol. 23, no. 1, pp. 5-14, 2009.

“Proceedings of the International Conference on Computational Science,” ICCS 2010, Amsterdam, Elsevier, May 2010.

Ma, T., T. Herault, G. Bosilca, and J. Dongarra, “Process Distance-aware Adaptive MPI Collective Communications,” IEEE Int'l Conference on Cluster Computing (Cluster 2011), Austin, Texas, 2011.

Fagg, G., E. Gabriel, Z. Chen, T. Angskun, G. Bosilca, J. Pjesivac-Grbovic, and J. Dongarra, “Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing,” International Journal for High Performance Applications and Supercomputing (to appear), April 2004.

Hoemmen, M., and I. Yamazaki, Production Implementations of Pipelined & Communication-Avoiding Iterative Linear Solvers, Tokyo, Japan, SIAM Conference on Parallel Processing for Scientific Computing, March 2018.

Ltaief, H., P. Luszczek, and J. Dongarra, “Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency,” International Conference on Energy-Aware High Performance Computing (EnA-HPC 2011), Hamburg, Germany, September 2011.

Kurzak, J., P. Luszczek, M. Faverge, and J. Dongarra, “Programming the LU Factorization for a Multicore System with Accelerators,” Proceedings of VECPAR’12, Kobe, Japan, April 2012.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Progressive Optimization of Batched LU Factorization on GPUs,” IEEE High Performance Extreme Computing Conference (HPEC’19), Waltham, MA, IEEE, September 2019.

Wong, K., S. Tomov, and J. Dongarra, “Project-Based Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning,” The Journal of Computational Science Education, vol. 11, issue 1, pp. 36-44, January 2020.

Bland, W., G. Bosilca, A. Bouteiller, T. Herault, and J. Dongarra, “A Proposal for User-Level Failure Mitigation in the MPI-3 Standard,” University of Tennessee Electrical Engineering and Computer Science Technical Report, no. ut-cs-12-693: University of Tennessee, February 2012.

Tang, Y., G. Fagg, and J. Dongarra, “Proposal of MPI operation level Checkpoint/Rollback and one implementation,” Proceedings of IEEE CCGrid 2006: IEEE Computer Society, January 2006.

Eijkhout, V., and E. Fuentes, “A Proposed Standard for Matrix Metadata,” Innovative Computing Laboratory Technical Report, no. ICL-UT-03-02, Submitted to ACM TOMS, November 2003.

Demmel, J., J. Dongarra, B. Parlett, W. Kahan, M. Gu, D. Bindel, Y. Hida, X. Li, O. Marques, J. E. Riedy, et al., “Prospectus for the Next LAPACK and ScaLAPACK Libraries,” PARA 2006, Umea, Sweden, June 2006.
