Publications

Filters: Author is Yu Pei
A
Gates, M., J. Kurzak, P. Luszczek, Y. Pei, and J. Dongarra, “Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices,” Parallel and Distributed Processing Symposium Workshops (IPDPSW), Orlando, FL, IEEE, June 2017. DOI: 10.1109/IPDPSW.2017.18
C
Pei, Y., Q. Cao, G. Bosilca, P. Luszczek, V. Eijkhout, and J. Dongarra, “Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime,” 21st IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2020), New Orleans, LA, IEEE, May 2020.
H
Luo, X., W. Wu, G. Bosilca, Y. Pei, Q. Cao, T. Patinyasakdikul, D. Zhong, and J. Dongarra, “HAN: A Hierarchical AutotuNed Collective Communication Framework,” IEEE Cluster Conference, Kobe, Japan, IEEE Computer Society Press, September 2020.
P
Cao, Q., Y. Pei, T. Herault, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools,” Workshop on Programming and Performance Visualization Tools (ProTools 19) at SC19, Denver, CO, ACM, November 2019.