Yamazaki, I., A. Abdelfattah, A. Ida, S. Ohshima, S. Tomov, R. Yokota, and J. Dongarra, “Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU Clusters,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, IEEE, May 2018.
Yamazaki, I., M. Hoemmen, P. Luszczek, and J. Dongarra, “Improving Performance of GMRES by Reducing Communication and Pipelining Global Collectives,” Proceedings of The 18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017), Best Paper Award, Orlando, FL, June 2017.
Yamazaki, I., S. Rajamanickam, E. G. Boman, M. Hoemmen, M. A. Heroux, and S. Tomov, “Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 14), New Orleans, LA, IEEE, November 2014.
Yamazaki, I., S. Nooshabadi, S. Tomov, and J. Dongarra, “Structure-aware Linear Solver for Realtime Convex Optimization for Embedded Systems,” IEEE Embedded Systems Letters, vol. 9, issue 3, pp. 61–64, May 2017. DOI: 10.1109/LES.2017.2700401
Yamazaki, I., T. Dong, R. Solcà, S. Tomov, J. Dongarra, and T. C. Schulthess, “Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems,” Concurrency and Computation: Practice and Experience, October 2013.
Yamazaki, I., D. Becker, J. Dongarra, A. Druinsky, I. Peled, S. Toledo, G. Ballard, J. Demmel, and O. Schwartz, “Implementing a Blocked Aasen’s Algorithm with a Dynamic Scheduler on Multicore Architectures,” IPDPS 2013 (submitted), Boston, MA, 2013.
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, “Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime,” Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
YarKhan, A., J. Dongarra, and K. Seymour, “GridSolve: The Evolution of Network Enabled Solver,” Grid-Based Problem Solving Environments: IFIP TC2/WG 2.5 Working Conference on Grid-Based Problem Solving Environments (Prescott, AZ, July 2006): Springer, pp. 215-226, 2007.
YarKhan, A., and J. Dongarra, “Experiments with Scheduling Using Simulated Annealing in a Grid Environment,” Grid Computing - GRID 2002, Third International Workshop, vol. 2536, Baltimore, MD, Springer, pp. 232-242, November 2002.
YarKhan, A., J. Kurzak, P. Luszczek, and J. Dongarra, “Porting the PLASMA Numerical Library to the OpenMP Standard,” International Journal of Parallel Programming, June 2016. DOI: 10.1007/s10766-016-0441-6
YarKhan, A., G. Ragghianti, J. Dongarra, M. Cawkwell, D. Perez, and A. Voter, “Initial Integration and Evaluation of SLATE Parallel BLAS in LATTE,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-07: Innovative Computing Laboratory, University of Tennessee, June 2018.
YarKhan, A., J. Kurzak, A. Abdelfattah, and J. Dongarra, “An Empirical View of SLATE Algorithms on Scalable Hybrid System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-08: University of Tennessee, Knoxville, September 2019.
YarKhan, A., J. Kurzak, and J. Dongarra, “QUARK Users' Guide: QUeueing And Runtime for Kernels,” University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-02, 2011.
YarKhan, A., K. Seymour, K. Sagi, Z. Shi, and J. Dongarra, “Recent Developments in GridSolve,” International Journal of High Performance Computing Applications (Special Issue: Scheduling for Large-Scale Heterogeneous Platforms), vol. 20, no. 1: Sage Science Press, 2006.
YarKhan, A., “Dynamic Task Execution on Shared and Distributed Memory Architectures,” 2012.
YarKhan, A., and J. Dongarra, “Biological Sequence Alignment on the Computational Grid Using the GrADS Framework,” Future Generation Computing Systems, vol. 21, no. 6: Elsevier, pp. 980-986, June 2005.
YarKhan, A., A. Haidar, C. Cao, P. Luszczek, S. Tomov, and J. Dongarra, “Cholesky Across Accelerators,” 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Elizabeth, NJ, IEEE, August 2015.
Yi, Q., K. Kennedy, H. You, K. Seymour, and J. Dongarra, “Automatic Blocking of QR and LU Factorizations for Locality,” 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004), Washington, DC, ACM, June 2004. DOI: 10.1145/1065895.1065898
You, H., Q. Liu, Z. Li, and S. Moore, “The Design of an Auto-tuning I/O Framework on Cray XT5 System,” Cray Users Group Conference (CUG'11) (Best Paper Finalist), Fairbanks, Alaska, May 2011.
You, H., K. Seymour, and J. Dongarra, “An Effective Empirical Search Method for Automatic Software Tuning,” ICL Technical Report, no. ICL-UT-05-02, January 2005.
You, H., K. Seymour, J. Dongarra, and S. Moore, “Empirical Tuning of a Multiresolution Analysis Kernel using a Specialized Code Generator,” ICL Technical Report, no. ICL-UT-07-02, January 2007.
You, H., K. Seymour, J. Dongarra, and S. Moore, “Automated Empirical Tuning of a Multiresolution Analysis Kernel,” ICL Technical Report, no. ICL-UT-07-01, pp. 10, January 2007.
You, H., B. Rekapalli, Q. Liu, and S. Moore, “Autotuned Parallel I/O for Highly Scalable Biosequence Analysis,” TeraGrid'11, Salt Lake City, Utah, July 2011.
Youseff, L., K. Seymour, H. You, J. Dongarra, and R. Wolski, “The Impact of Paravirtualized Memory Hierarchy on Linear Algebra Computational Kernels and Software,” ACM/IEEE International Symposium on High Performance Distributed Computing, Boston, MA, June 2008.
Youseff, L., K. Seymour, H. You, D. Zagorodnov, J. Dongarra, and R. Wolski, “Paravirtualization Effect on Single- and Multi-threaded Memory-Intensive Linear Algebra Software,” Cluster Computing Journal: Special Issue on High Performance Distributed Computing, vol. 12, no. 2: Springer Netherlands, pp. 101-122, 2009.
Zaitsev, D., S. Tomov, and J. Dongarra, “Solving Linear Diophantine Systems on Parallel Architectures,” IEEE Transactions on Parallel and Distributed Systems, October 2018. DOI: 10.1109/TPDS.2018.2873354
Zhao, Y., L. Wan, W. Wu, G. Bosilca, R. Vuduc, J. Ye, W. Tang, and Z. Xu, “Efficient Communications in Training Large Scale Neural Networks,” ACM MultiMedia Workshop 2017, Mountain View, CA, ACM, October 2017.
Zhong, D., A. Bouteiller, X. Luo, and G. Bosilca, “Runtime Level Failure Detection and Propagation in HPC Systems,” European MPI Users' Group Meeting (EuroMPI '19), Zürich, Switzerland, ACM, September 2019. DOI: 10.1145/3343211.3343225
Zunger, A., A. Franceschetti, G. Bester, W. B. Jones, K. Kim, P. A. Graf, L-W. Wang, A. Canning, O. Marques, C. Voemel, et al., “Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures,” J. Phys.: Conf. Ser., vol. 46, pp. 292-298, January 2006. DOI: 10.1088/1742-6596/46/1/040