Publications

2019
Altintas, I., K. Marcus, V. Vural, S. Purawat, D. Crawl, G. Antoniu, A. Costan, O. Marcu, P. Balaprakash, R. Cao, et al., “A Collection of White Papers from the BDEC2 Workshop in San Diego, CA,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-13: University of Tennessee, October 2019.
Benoit, A., A. Cavelan, F. M. Ciorba, V. Le Fèvre, and Y. Robert, “Combining Checkpointing and Replication for Reliable Execution of Linear Workflows with Fail-Stop and Silent Errors,” International Journal of Networking and Computing, vol. 9, no. 1, pp. 2–27.
Le Fèvre, V., T. Herault, Y. Robert, A. Bouteiller, A. Hori, G. Bosilca, and J. Dongarra, “Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,” Parallel Computing, vol. 85, pp. 1–12, July 2019.
Kaya, O., and Y. Robert, “Computing Dense Tensor Decompositions with Optimal Dimension Trees,” Algorithmica, vol. 81, issue 5, pp. 2092–2121, May 2019.
Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, “Co-Scheduling HPC Workloads on Cache-Partitioned CMP Platforms,” International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1221–1239, November 2019.
Danalis, A., H. Jagode, H. Hanumantharayappa, S. Ragate, and J. Dongarra, “Counter Inspection Toolkit: Making Sense out of Hardware Performance Events,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, February 2019.
Han, L., V. Le Fèvre, L-C. Canon, Y. Robert, and F. Vivien, “A Generic Approach to Scheduling and Checkpointing Workflows,” International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1255–1274, November 2019.
Herault, T., Y. Robert, G. Bosilca, and J. Dongarra, “Generic Matrix Multiplication for Multi-GPU Accelerated Distributed-Memory Platforms over PaRSEC,” ScalA'19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.
Ribizel, T., and H. Anzt, “Parallel Selection on GPUs,” Parallel Computing, vol. 91, March 2020, 2019.
Anzt, H., T. Ribizel, G. Flegar, E. Chow, and J. Dongarra, “ParILUT – A Parallel Threshold ILU for GPUs,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
Benoit, A., T. Herault, V. Le Fèvre, and Y. Robert, “Replication is More Efficient Than You Think,” The IEEE/ACM Conference on High Performance Computing Networking, Storage and Analysis (SC19), Denver, CO, ACM Press, November 2019.
Aupy, G., A. Gainaru, V. Honoré, P. Raghavan, Y. Robert, and H. Sun, “Reservation Strategies for Stochastic Jobs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), Rio de Janeiro, Brazil, IEEE Computer Society Press, May 2019.
Gao, Y., L-C. Canon, Y. Robert, and F. Vivien, “Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms,” IEEE Cluster 2019, Albuquerque, New Mexico, IEEE Computer Society Press, September 2019.
Canon, L-C., A K W. Chang, Y. Robert, and F. Vivien, “Scheduling Independent Stochastic Tasks under Deadline and Budget Constraints,” International Journal of High Performance Computing Applications, vol. 34, issue 2, pp. 246–264, June 2019.
2018
Dongarra, J., I. Duff, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Hogg, P. Valero Lara, P. Luszczek, M. Zounon, et al., Batched BLAS (Basic Linear Algebra Subprograms) 2018 Specification, July 2018.
Asch, M., T. Moore, R. M. Badia, M. Beck, P. Beckman, T. Bidot, F. Bodin, F. Cappello, A. Choudhary, B. R. de Supinski, et al., “Big Data and Extreme-Scale Computing: Pathways to Convergence - Toward a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry,” The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 435–479, July 2018.
Caniou, Y., E. Caron, A K W. Chang, and Y. Robert, “Budget-Aware Scheduling Algorithms for Scientific Workflows with Stochastic Task Weights on Heterogeneous IaaS Cloud Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Vancouver, BC, Canada, IEEE, May 2018.
Han, L., L-C. Canon, H. Casanova, Y. Robert, and F. Vivien, “Checkpointing Workflows for Fail-Stop Errors,” IEEE Transactions on Computers, vol. 67, issue 8, pp. 1105–1120, August 2018.
Ahrens, J., C. M. Biwer, A. Costan, G. Antoniu, M. S. Pérez, N. Stojanovic, R. Badia, O. Beckstein, G. Fox, S. Jha, et al., “A Collection of White Papers from the BDEC2 Workshop in Bloomington, IN,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-15: University of Tennessee, Knoxville, November 2018.
Casanova, H., J. Herrmann, and Y. Robert, “Computing the Expected Makespan of Task Graphs in the Presence of Silent Errors,” Parallel Computing, vol. 75, pp. 41–60, July 2018.
Benoit, A., A. Cavelan, F. Cappello, P. Raghavan, Y. Robert, and H. Sun, “Coping with Silent and Fail-Stop Errors at Scale by Combining Replication and Checkpointing,” Journal of Parallel and Distributed Computing, vol. 122, pp. 209–225, December 2018.
Aupy, G., A. Benoit, S. Dai, L. Pottier, P. Raghavan, Y. Robert, and M. Shantharam, “Co-Scheduling Amdahl Applications on Cache-Partitioned Systems,” International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 123–138, January 2018.
Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, “Co-Scheduling HPC Workloads on Cache-Partitioned CMP Platforms,” Cluster 2018, Belfast, UK, IEEE Computer Society Press, September 2018.
Bosilca, G., A. Bouteiller, T. Herault, V. Le Fèvre, Y. Robert, and J. Dongarra, “Distributed Termination Detection for HPC Task-Based Environments,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-14: University of Tennessee, June 2018.
Le Fèvre, V., G. Bosilca, A. Bouteiller, T. Herault, A. Hori, Y. Robert, and J. Dongarra, “Do moldable applications perform better on failure-prone HPC platforms?,” 11th Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids, Turin, Italy, Springer Verlag, August 2018.
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “A Failure Detector for HPC Platforms,” The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018.
Han, L., V. Le Fèvre, L-C. Canon, Y. Robert, and F. Vivien, “A Generic Approach to Scheduling and Checkpointing Workflows,” The 47th International Conference on Parallel Processing (ICPP 2018), Eugene, OR, IEEE Computer Society Press, August 2018.
YarKhan, A., G. Ragghianti, J. Dongarra, M. Cawkwell, D. Perez, and A. Voter, “Initial Integration and Evaluation of SLATE Parallel BLAS in LATTE,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-07: Innovative Computing Laboratory, University of Tennessee, June 2018.
Kurzak, J., M. Gates, I. Yamazaki, A. Charara, A. YarKhan, J. Finney, G. Ragghianti, P. Luszczek, and J. Dongarra, “Linear Systems Performance Report,” SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.
Benoit, A., A. Cavelan, Y. Robert, and H. Sun, “Multi-Level Checkpointing and Silent Error Detection for Linear Workflows,” Journal of Computational Science, vol. 28, pp. 398–415, September 2018.
Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, “Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Best Paper Award, Vancouver, BC, Canada, IEEE, May 2018.
Benoit, A., S. Perarnau, L. Pottier, and Y. Robert, “A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures,” The 47th International Conference on Parallel Processing (ICPP 2018), Eugene, OR, IEEE Computer Society Press, August 2018.
Aupy, G., and Y. Robert, “Scheduling for Fault-Tolerance: An Introduction,” Topics in Parallel and Distributed Computing: Springer International Publishing, pp. 143–170, 2018.
2017
Aupy, G., Y. Robert, and F. Vivien, “Assuming failure independence: are we right to be wrong?,” The 3rd International Workshop on Fault Tolerant Systems (FTS), Honolulu, Hawaii, IEEE, September 2017.
Faverge, M., J. Langou, Y. Robert, and J. Dongarra, “Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Orlando, FL, IEEE, May 2017.
Han, L., L-C. Canon, H. Casanova, Y. Robert, and F. Vivien, “Checkpointing Workflows for Fail-Stop Errors,” IEEE Cluster, Honolulu, Hawaii, IEEE, September 2017.
Aupy, G., A. Benoit, L. Pottier, P. Raghavan, Y. Robert, and M. Shantharam, “Co-Scheduling Algorithms for Cache-Partitioned Systems,” 19th Workshop on Advances in Parallel and Distributed Computational Models, Orlando, FL, IEEE Computer Society Press, May 2017.
Kurzak, J., P. Luszczek, I. Yamazaki, Y. Robert, and J. Dongarra, “Design and Implementation of the PULSAR Programming System for Large Scale Computing,” Supercomputing Frontiers and Innovations, vol. 4, issue 1, 2017.
Dongarra, J., S. Hammarling, N. J. Higham, S. Relton, P. Valero-Lara, and M. Zounon, “The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems,” International Conference on Computational Science (ICCS 2017), Zürich, Switzerland, Elsevier, June 2017.
Kurzak, J., P. Wu, M. Gates, I. Yamazaki, P. Luszczek, G. Ragghianti, and J. Dongarra, “Designing SLATE: Software for Linear Algebra Targeting Exascale,” SLATE Working Notes, no. 03, ICL-UT-17-06: Innovative Computing Laboratory, University of Tennessee, October 2017.
Benoit, A., F. Cappello, A. Cavelan, Y. Robert, and H. Sun, “Identifying the Right Replication Level to Detect and Correct Silent Errors at Scale,” 2017 Workshop on Fault-Tolerance for HPC at Extreme Scale, Washington, DC, ACM, June 2017.
