Publications

Export 1028 results:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
C
Aupy, G., A. Benoit, S. Dai, L. Pottier, P. Raghavan, Y. Robert, and M. Shantharam, Co-Scheduling Amdhal Applications on Cache-Partitioned Systems,” International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 123–138, January 2018. DOI: 10.1177/1094342017710806  (672.52 KB)
Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, Co-Scheduling HPC Workloads on Cache-Partitioned CMP Platforms,” Cluster 2018, Belfast, UK, IEEE Computer Society Press, September 2018.  (423.75 KB)
Aupy, G., A. Benoit, B. Goglin, L. Pottier, and Y. Robert, Co-scheduling HPC workloads on cache-partitioned CMP platforms,” Int. Journal of High Performance Computing Applications, 2019.  (930.28 KB)
Danalis, A., H. Jagode, H. Hanumantharayappa, S. Ragate, and J. Dongarra, Counter Inspection Toolkit: Making Sense out of Hardware Performance Event,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, September 2019. DOI: 10.1007/978-3-030-11987-4_2  (216.39 KB)
Jia, Y., P. Luszczek, G. Bosilca, and J. Dongarra, CPU-GPU Hybrid Bidiagonal Reduction With Soft Error Resilience,” ScalA '13 Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Montpellier, France, November 2013.  (238.58 KB)
Agarwal, P., R. A.. Alexander, E.. Apra, S. Balay, A. S. Bland, J. Colgan, E. D'Azevedo, J. Dongarra, T. Dunigan, M. Fahey, et al., Cray X1 Evaluation Status Report,” Oak Ridge National Laboratory Report, vol. /-2004/13, January 2004.  (817.33 KB)
Mellor-Crummey, J., P. Beckman, J. Dongarra, B. Miller, and K. Yelick, Creating Software Technology to Harness the Power of Leadership-class Computing Systems,” DOE SciDAC Review (to appear), June 2007.  (617.02 KB)
Song, F., and F. Wolf, CUBE User Manual,” ICL Technical Report, no. ICL-UT-04-01, February 2004.  (429.12 KB)
Jagode, H., and J. Hein, Custom assignment of MPI ranks for parallel multi-dimensional FFTs: Evaluation of BG/P versus BG/L,” Proceedings of the 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA-08), Sydney, Australia, IEEE Computer Society, pp. 271-283, January 2008.  (2.6 MB)
Gruetzmacher, T., T. Cojean, G. Flegar, F. Göbel, and H. Anzt, A Customized Precision Format Based on Mantissa Segmentation for Accelerating Sparse Linear Algebra,” Concurrency and Computation: Practice and Experience, vol. 40319, issue 262, January 2019. DOI: 10.1002/cpe.5418
D
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemariner, and J. Dongarra, DAGuE: A generic distributed DAG Engine for High Performance Computing.,” Parallel Computing, vol. 38, no. 1-2: Elsevier, pp. 27-51, 00 2012.  (830.85 KB)
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemariner, and J. Dongarra, DAGuE: A generic distributed DAG engine for high performance computing,” Innovative Computing Laboratory Technical Report, no. ICL-UT-10-01, April 2010.  (830.85 KB)
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemariner, and J. Dongarra, DAGuE: A Generic Distributed DAG Engine for High Performance Computing,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1151-1158, 00 2011.  (830.85 KB)
Dongarra, J., R. Graybill, W. Harrod, R. Lucas, E. Lusk, P. Luszczek, J. McMahon, A. Snavely, J. Vetter, K. Yelick, et al., DARPA's HPCS Program: History, Models, Tools, Languages,” in Advances in Computers, vol. 72: Elsevier, January 2008.  (3.61 MB)
Haidar, A., J. Kurzak, G. Pichon, and M. Faverge, A Data Flow Divide and Conquer Algorithm for Multicore Architecture,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.  (535.44 KB)
Bouteiller, A., G. Bosilca, T. Herault, and J. Dongarra, Data Movement Interfaces to Support Dataflow Runtimes,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-03: University of Tennessee, May 2018.  (210.94 KB)
Jagode, H., Dataflow Programming Paradigms for Computational Chemistry Methods,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-01, Knoxville, TN, University of Tennessee, May 2017.
Pjesivac–Grbovic, J., G. Bosilca, G. Fagg, T. Angskun, and J. Dongarra, Decision Trees and MPI Collective Algorithm Selection Problem,” Euro-Par 2007, Rennes, France, Springer, pp. 105–115, August 2007.  (552.94 KB)
Yamazaki, I., S. Tomov, and J. Dongarra, Deflation Strategies to Improve the Convergence of Communication-Avoiding GMRES,” 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, New Orleans, LA, November 2014.  (465.52 KB)
Tomov, S., and J. Dongarra, Dense Linear Algebra for Hybrid GPU-based Systems,” Scientific Computing with Multicore and Accelerators, Boca Raton, Florida, CRC Press, 2010.
Dongarra, J., J. Kurzak, P. Luszczek, and S. Tomov, Dense Linear Algebra on Accelerated Multicore Hardware,” High Performance Scientific Computing: Algorithms and Applications, London, UK, Springer-Verlag, 00 2012.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Luszczek, and J. Dongarra, Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach,” Scalable Computing and Communications: Theory and Practice: John Wiley & Sons, pp. 699-735, March 2013.  (1.01 MB)
Tomov, S., R. Nath, H. Ltaeif, and J. Dongarra, Dense Linear Algebra Solvers for Multicore with GPU Accelerators,” Parallel Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, Atlanta, GA, pp. 1-8, 2010. DOI: 10.1109/IPDPSW.2010.5470941  (1 MB)
Tomov, S., Dense Linear Algebra Solvers for Multicore with GPU Accelerators , Atlanta, GA, International Parallel and Distributed Processing Symposium (IPDPS 2010), April 2010.  (956.68 KB)
Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures,” Lecture Notes in Computer Science, vol. 9573: Springer International Publishing, pp. 86-95, September 2015, 2016. DOI: 10.1007/978-3-319-32149-3_9  (327.14 KB)
Kurzak, J., H. Ltaeif, J. Dongarra, and R. M. Badia, Dependency-Driven Scheduling of Dense Matrix Factorizations on Shared-Memory Systems,” PPAM 2009, Poland, September 2009.
Casanova, H., J. Plank, M. Beck, and J. Dongarra, Deploying Fault-tolerance and Task Migration with NetSolve,” Future Generation Computer Systems, vol. 15, no. 5-6: Elsevier, pp. 745-755, October 1999.  (236 KB)
Roche, K., and J. Dongarra, Deploying Parallel Numerical Library Routines to Cluster Computing in a Self Adapting Fashion,” Parallel Computing: Advances and Current Issues:Proceedings of the International Conference ParCo2001, London, England, Imperial College Press, January 2002.  (381.89 KB)
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, Design and Implementation for FFT-ECP on Distributed Accelerated Systems,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-05: University of Tennessee, April 2019.  (3.19 MB)
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime,” Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (398.16 KB)
Raman, G., and J. Dongarra, Design and Implementation of NetSolve using DCOM as the Remoting Layer,” University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-440, May 2000.  (65.45 KB)
D'Azevedo, E., and J. Dongarra, The Design and Implementation of the Parallel Out of Core ScaLAPACK LU, QR, and Cholesky Factorization Routines,” Concurrency: Practice and Experience, vol. 12, no. 15, pp. 1481-1493, January 2000.  (374.18 KB)
Kurzak, J., P. Luszczek, I. Yamazaki, Y. Robert, and J. Dongarra, Design and Implementation of the PULSAR Programming System for Large Scale Computing,” Supercomputing Frontiers and Innovations, vol. 4, issue 1, 2017. DOI: 10.14529/jsfi170101
Dongarra, J., S. Hammarling, N. J. Higham, S. Relton, P. Valero-Lara, and M. Zounon, The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems,” International Conference on Computational Science (ICCS 2017), Zürich, Switzerland, Elsevier, June 2017. DOI: DOI:10.1016/j.procs.2017.05.138
Guidry, M., and A. Haidar, On the Design, Autotuning, and Optimization of GPU Kernels for Kinetic Network Simulations Using Fast Explicit Integration and GPU Batched Computation , Oak Ridge, TN, Joint Institute for Computational Sciences Seminar Series, Presentation, September 2015.  (17.25 MB)
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, On the Design, Development, and Analysis of Optimized Matrix-Vector Multiplication Routines for Coprocessors,” ISC High Performance 2015, Frankfurt, Germany, July 2015.  (1.49 MB)
Cao, C., T. Herault, G. Bosilca, and J. Dongarra, Design for a Soft Error Resilient Dynamic Task-based Runtime,” ICL Technical Report, no. ICL-UT-14-04: University of Tennessee, November 2014.  (2.61 MB)
Cao, C., G. Bosilca, T. Herault, and J. Dongarra, Design for a Soft Error Resilient Dynamic Task-based Runtime,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.  (2.31 MB)
You, H., Q. Liu, Z. Li, and S. Moore, The Design of an Auto-tuning I/O Framework on Cray XT5 System,” Cray Users Group Conference (CUG'11) (Best Paper Finalist), Fairbanks, Alaska, May 2011.  (459.57 KB)
Luszczek, P., and J. Dongarra, Design of an Interactive Environment for Numerically Intensive Parallel Linear Algebra Calculations,” International Conference on Computational Science, Poland, Springer Verlag, June 2004. DOI: 10.1007/978-3-540-25944-2_35  (88.31 KB)
Haidar, A., A. Abdelfattah, M. Zounon, P. Wu, S. Pranesh, S. Tomov, and J. Dongarra, The Design of Fast and Energy-Efficient Linear Solvers: On the Potential of Half-Precision Arithmetic and Iterative Refinement Techniques,” International Conference on Computational Science (ICCS 2018), vol. 10860, Wuxi, China, Springer, pp. 586–600, June 2018. DOI: 10.1007/978-3-319-93698-7_45
Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, Designing LU-QR Hybrid Solvers for Performance and Stability,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014. DOI: 10.1109/IPDPS.2014.108  (4.2 MB)
Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, Designing LU-QR hybrid solvers for performance and stability,” University of Tennessee Computer Science Technical Report (also LAWN 282), no. ut-eecs-13-719: University of Tennessee, October 2013.  (4.11 MB)
Kurzak, J., P. Wu, M. Gates, I. Yamazaki, P. Luszczek, G. Ragghianti, and J. Dongarra, Designing SLATE: Software for Linear Algebra Targeting Exascale,” SLATE Working Notes, no. 3, ICL-UT-17-06: Innovative Computing Laboratory, University of Tennessee, October 2017.  (2.8 MB)
Fürlinger, K., and S. Moore, Detection and Analysis of Iterative Behavior in Parallel Applications,” Proceedings of the 2008 International Conference on Computational Science (ICCS 2008), vol. 5103, Krakow, Poland, pp. 261-267, January 2008.  (141.02 KB)
Arnold, D., and J. Dongarra, Developing an Architecture to Support the Implementation and Development of Scientific Computing Applications,” to appear in Proceedings of Working Conference 8: Software Architecture for Scientific Computing Applications, Ottawa, Canada, October 2000.  (176.25 KB)
Kelleher, Jr., M., Development of the PICMSS NetSolve Service,” ICL Technical Report, no. ICL-UT-02-04, April 2002.  (328.44 KB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures,” The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, Chicago, IL, IEEE, May 2016.  (708.62 KB)
Marin, G., C. McCurdy, and J. Vetter, Diagnosis and Optimization of Application Prefetching Performance,” Proceedings of the 27th ACM International Conference on Supercomputing (ICS '13), Eugene, Oregon, USA, ACM Press, June 2013. DOI: 10.1145/2464996.2465014  (827.31 KB)
Dongarra, J., Z. Chen, G. Bosilca, and J. Langou, Disaster Survival Guide in Petascale Computing: An Algorithmic Approach,” in Petascale Computing: Algorithms and Applications (to appear): Chapman & Hall - CRC Press, 00 2007.  (260.18 KB)

Pages