Publications

2020
Lopez, F., E. Chow, S. Tomov, and J. Dongarra, “Asynchronous SGD for DNN training on Shared-memory Parallel Architectures,” Workshop on Scalable Deep Learning over Parallel And Distributed Infrastructures (ScaDL 2020), May 2020.  (188.51 KB)
Lopez, F., E. Chow, S. Tomov, and J. Dongarra, “Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-04: University of Tennessee, Knoxville, March 2020.  (188.51 KB)
Kolev, T., P. Fischer, A. Abdelfattah, S. Ananthan, V. Barra, N. Beams, R. Bleile, J. Brown, R. Carson, J-S. Camier, et al., CEED ECP Milestone Report: Improve Performance and Capabilities of CEED-Enabled ECP Applications on Summit/Sierra: Zenodo, 2020. DOI: 10.5281/zenodo.3860804
Pei, Y., Q. Cao, G. Bosilca, P. Luszczek, V. Eijkhout, and J. Dongarra, “Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime,” 21st IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2020), New Orleans, LA, IEEE, May 2020.  (1.33 MB)
Cao, Q., Y. Pei, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications,” Platform for Advanced Scientific Computing Conference (PASC20), Geneva, Switzerland, ACM, June 2020. DOI: 10.1145/3394277.3401846  (2.71 MB)
Han, L., L-C. Canon, J. Liu, Y. Robert, and F. Vivien, “Improved Energy-Aware Strategies for Periodic Real-Time Tasks under Reliability Constraints,” 40th IEEE Real-Time Systems Symposium (RTSS 2019), York, UK, IEEE Press, February 2020.
Anzt, H., Y-C. Chen, T. Cojean, J. Dongarra, G. Flegar, R. Nayak, E. S. Quintana-Orti, Y. Tsai, and W. Wang, “Load-balancing Sparse Matrix Vector Product Kernels on GPUs,” ACM Transactions on Parallel Computing, issue 2, March 2020. DOI: 10.1145/3380930  (5.64 MB)
Gates, M., A. Charara, A. YarKhan, D. Sukkari, M. Al Farhan, and J. Dongarra, “Performance Tuning SLATE,” SLATE Working Notes, no. 14, ICL-UT-20-01: Innovative Computing Laboratory, University of Tennessee, January 2020.  (1.29 MB)
Gates, M., J. Kurzak, A. YarKhan, A. Charara, J. Finney, D. Sukkari, M. Al Farhan, I. Yamazaki, P. Wu, and J. Dongarra, SLATE Tutorial, Houston, TX, 2020 ECP Annual Meeting, February 2020.  (12.14 MB)
Zhong, D., P. Shamis, Q. Cao, G. Bosilca, and J. Dongarra, “Using Arm Scalable Vector Extension to optimize Open MPI,” 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020), Melbourne, Australia, IEEE/ACM, May 2020.
2019
Tomov, S., A. Abdelfattah, V. Barra, N. Beams, J. Brown, J-S. Camier, V. Dobrev, J. Dongarra, Y. Dudouit, P. Fischer, et al., CEED ECP Milestone Report: Performance Tuning of CEED Software and 1st and 2nd Wave Apps: Zenodo, October 2019. DOI: 10.5281/zenodo.3477618  (8.31 MB)
Davis, J., T. Gao, S. Chandrasekaran, H. Jagode, A. Danalis, P. Balaji, J. Dongarra, and M. Taufer, “Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI,” 2019 International Conference on Parallel Computing (ParCo2019), Prague, Czech Republic, September 2019.
Badia, R. M., M. Beck, F. Bodin, T. Boku, F. Cappello, A. Choudhary, C. Costa, E. Deelman, N. Ferrier, K. Fujisawa, et al., “A Collection of Presentations from the BDEC2 Workshop in Kobe, Japan,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-09: University of Tennessee, Knoxville, February 2019.  (58.85 MB)
Antoniu, G., A. Costan, O. Marcu, M. S. Pérez, N. Stojanovic, R. M. Badia, M. Vázquez, S. Girona, M. Beck, T. Moore, et al., “A Collection of White Papers from the BDEC2 Workshop in Poznan, Poland,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-10: University of Tennessee, Knoxville, May 2019.  (5.82 MB)
Altintas, I., K. Marcus, V. Vural, S. Purawat, D. Crawl, G. Antoniu, A. Costan, O. Marcu, P. Balaprakash, R. Cao, et al., “A Collection of White Papers from the BDEC2 Workshop in San Diego, CA,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-13: University of Tennessee, October 2019.  (8.25 MB)
Benoit, A., A. Cavelan, F. M. Ciorba, V. Le Fèvre, and Y. Robert, “Combining Checkpointing and Replication for Reliable Execution of Linear Workflows with Fail-Stop and Silent Errors,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 2-27.  (754.6 KB)
Gruetzmacher, T., T. Cojean, G. Flegar, F. Göbel, and H. Anzt, “A Customized Precision Format Based on Mantissa Segmentation for Accelerating Sparse Linear Algebra,” Concurrency and Computation: Practice and Experience, vol. 40319, issue 262, January 2019. DOI: 10.1002/cpe.5418
Han, L., V. Le Fèvre, L-C. Canon, Y. Robert, and F. Vivien, “A Generic Approach to Scheduling and Checkpointing Workflows,” International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1255-1274, November 2019. DOI: 10.1177/1094342019866891  (555.01 KB)
Han, L., V. Le Fèvre, L-C. Canon, Y. Robert, and F. Vivien, “A Generic Approach to Scheduling and Checkpointing Workflows,” Int. Journal of High Performance Computing Applications, vol. 33, no. 6, pp. 1255-1274, 2019.  (555.01 KB)
Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, “Least Squares Solvers for Distributed-Memory Machines with GPU Accelerators,” ACM International Conference on Supercomputing (ICS '19), Phoenix, Arizona, ACM, pp. 117–126, June 2019. DOI: 10.1145/3324989.3325719  (1.63 MB)
Kurzak, J., M. Gates, A. Charara, A. YarKhan, I. Yamazaki, and J. Dongarra, “Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators,” Euro-Par 2019: Parallel Processing, vol. 11725: Springer, pp. 495–506, August 2019. DOI: 10.1007/978-3-030-29400-7_35
Ng, L., S. Chen, A. Gessinger, D. Nichols, S. Cheng, A. Meenasorna, K. Wong, S. Tomov, A. Haidar, E. D'Azevedo, et al., MagmaDNN 0.2 High-Performance Data Analytics for Manycore GPUs and CPUs: University of Tennessee, January 2019. DOI: 10.13140/RG.2.2.14906.64961  (7.84 MB)
Nichols, D., K. Wong, S. Tomov, L. Ng, S. Chen, and A. Gessinger, “MagmaDNN: Accelerated Deep Learning Using MAGMA,” Practice and Experience in Advanced Research Computing (PEARC ’19), Chicago, IL, ACM, July 2019.  (1.09 MB)
Anzt, H., T. Ribizel, G. Flegar, E. Chow, and J. Dongarra, “ParILUT – A Parallel Threshold ILU for GPUs,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPS.2019.00033  (505.95 KB)
Cao, Q., Y. Pei, T. Herault, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools,” Workshop on Programming and Performance Visualization Tools (ProTools 19) at SC19, Denver, CO, ACM, November 2019.  (429.55 KB)
Yamazaki, I., E. Chow, A. Bouteiller, and J. Dongarra, “Performance of Asynchronous Optimized Schwarz with One-sided Communication,” Parallel Computing, vol. 86, pp. 66-81, August 2019. DOI: 10.1016/j.parco.2019.05.004  (3.09 MB)
Gao, Y., L-C. Canon, Y. Robert, and F. Vivien, “Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms,” IEEE Cluster 2019, Albuquerque, New Mexico, IEEE Computer Society Press, September 2019.  (651 KB)
Canon, L-C., A K W. Chang, Y. Robert, and F. Vivien, “Scheduling Independent Stochastic Tasks under Deadline and Budget Constraints,” International Journal of High Performance Computing Applications, vol. 34, issue 2, pp. 246-264, June 2019. DOI: 10.1177/1094342019852135  (427.92 KB)
Gates, M., J. Kurzak, A. Charara, A. YarKhan, and J. Dongarra, “SLATE: Design of a Modern Distributed and Accelerated Linear Algebra Library,” International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Denver, CO, ACM, November 2019. DOI: 10.1145/3295500.3356223  (2.01 MB)
Gates, M., J. Kurzak, A. Charara, A. YarKhan, and J. Dongarra, SLATE: Design of a Modern Distributed and Accelerated Linear Algebra Library, Denver, CO, International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), November 2019.  (16.19 MB)
Charara, A., M. Gates, J. Kurzak, A. YarKhan, and J. Dongarra, “SLATE Developers' Guide,” SLATE Working Notes, no. 11, ICL-UT-19-02: Innovative Computing Laboratory, University of Tennessee, December 2019.  (1.63 MB)
Charara, A., J. Dongarra, M. Gates, J. Kurzak, and A. YarKhan, “SLATE Mixed Precision Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-03: University of Tennessee, April 2019.  (1.04 MB)
Gates, M., A. Charara, J. Kurzak, and J. Dongarra, “SLATE Users' Guide,” SLATE Working Notes, no. 10, ICL-UT-19-01: Innovative Computing Laboratory, University of Tennessee, January 2019.
Kurzak, J., M. Gates, A. Charara, A. YarKhan, and J. Dongarra, “SLATE Working Note 12: Implementing Matrix Inversions,” SLATE Working Notes, no. 12, ICL-UT-19-04: Innovative Computing Laboratory, University of Tennessee, June 2019.  (1.95 MB)
Gates, M., M. Al Farhan, A. Charara, J. Kurzak, D. Sukkari, A. YarKhan, and J. Dongarra, “SLATE Working Note 13: Implementing Singular Value and Symmetric/Hermitian Eigenvalue Solvers,” SLATE Working Notes, no. 13, ICL-UT-19-07: Innovative Computing Laboratory, University of Tennessee, September 2019.  (2.71 MB)
Anzt, H., T. Cojean, and E. Kuhn, “Towards a New Peer Review Concept for Scientific Computing ensuring Technical Quality, Software Sustainability, and Result Reproducibility,” Proceedings in Applied Mathematics and Mechanics, vol. 19, issue 1, November 2019. DOI: 10.1002/pamm.201900490
Anzt, H., Y. Chen Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Orti, Y. M. Tsai, and W. Wang, “Towards Continuous Benchmarking,” Platform for Advanced Scientific Computing Conference (PASC 2019), Zurich, Switzerland, ACM Press, June 2019. DOI: 10.1145/3324989.3325719  (1.51 MB)