Accelerating NWChem Coupled Cluster through dataflow-based Execution,” 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015), Krakow, Poland, Springer International Publishing, September 2015.“
Counter Inspection Toolkit: Making Sense out of Hardware Performance Event,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, September 2019.“
Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs,” International Conference on Parallel Processing (ICPP'11), Taipei, Taiwan, ACM, September 2011. DOI: 10.1109/ICPP.2011.71“
PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution,” 2015 IEEE International Conference on Cluster Computing, Chicago, IL, IEEE, September 2015.“
Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017.“
Software-Defined Events through PAPI,” 24th International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS), Rio de Janeiro, Brazil, IEEE, May 2019.“
What it Takes to keep PAPI Instrumental for the HPC Community,” 1st Workshop on Sustainable Scientific Software (CW3S19), Collegeville, Minnesota, July 2019.“
Custom assignment of MPI ranks for parallel multi-dimensional FFTs: Evaluation of BG/P versus BG/L,” Proceedings of the 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA-08), Sydney, Australia, IEEE Computer Society, pp. 271-283, January 2008.“
A Holistic Approach for Performance Measurement and Analysis for Petascale Applications,” ICCS 2009 Joint Workshop: Tools for Program Development and Analysis in Computational Science and Software Engineering for Large-Scale Computing, vol. 2009, Baton Rouge, Louisiana, Springer-Verlag Berlin Heidelberg 2009, pp. 686-695, May 2009.“
Modeling the Office of Science Ten Year Facilities Plan: The PERI Architecture Tiger Team,” SciDAC 2009, Journal of Physics: Conference Series, vol. 180(2009)012039, San Diego, California, IOP Publishing, July 2009.“
Power Management and Event Verification in PAPI,” Tools for High Performance Computing 2015: Proceedings of the 9th International Workshop on Parallel Tools for High Performance Computing, September 2015, Dresden, Germany, Dresden, Germany, Springer International Publishing, pp. pp. 41-51, 2016. DOI: 10.1007/978-3-319-39589-0_4“
Accelerating NWChem Coupled Cluster through dataflow-based Execution,” The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 540--551, July 2018. DOI: 10.1177/1094342016672543“
Accelerating NWChem Coupled Cluster through Dataflow-Based Execution,” The International Journal of High Performance Computing Applications, pp. 1–13, January 2017. DOI: 10.1177/1094342016672543“
Collecting Performance Data with PAPI-C,” Tools for High Performance Computing 2009, 3rd Parallel Tools Workshop, Dresden, Germany, Springer Berlin / Heidelberg, pp. 157-173, May 2010. DOI: 10.1007/978-3-642-11261-4_11“
Evaluation of Dataflow Programming Models for Electronic Structure Theory,” Concurrency and Computation: Practice and Experience: Special Issue on Parallel and Distributed Algorithms, vol. 2018, issue e4490, pp. 1–20, May 2018. DOI: 10.1002/cpe.4490“
Impact of Quad-core Cray XT4 System and Software Stack on Scientific Computation,” Euro-Par 2009, Lecture Notes in Computer Science, vol. 5704/2009, Delft, The Netherlands, Springer Berlin / Heidelberg, pp. 334-344, August 2009.“
Investigating Power Capping toward Energy-Efficient Scientific Applications,” Concurrency Computation: Practice and Experience, vol. 2018, issue e4485, pp. 1-14, April 2018. DOI: 10.1002/cpe.4485“
I/O Performance Analysis for the Petascale Simulation Code FLASH,” ISC'09, Hamburg, Germany, June 2009.“
PAPI Software-Defined Events for in-Depth Performance Analysis,” The International Journal of High Performance Computing Applications, 2019.“
Trace-based Performance Analysis for the Petascale Simulation Code FLASH,” International Journal of High Performance Computing Applications (to appear), 00 2010.“
Power-aware Computing on GPGPUs , Gatlinburg, TN, Fall Creek Falls Conference, Poster, September 2011.
Does your tool support PAPI SDEs yet? , Tahoe City, CA, 13th Scalable Tools Workshop, July 2019.
PAPI: Counting outside the Box , Barcelona, Spain, 8th JLESC Meeting, April 2018.
PAPI's New Software-Defined Events for In-Depth Performance Analysis , Lyon, France, CCDSC 2018: Workshop on Clusters, Clouds, and Data for Scientific Computing, September 2018.
PAPI's new Software-Defined Events for in-depth Performance Analysis , Dresden, Germany, 13th Parallel Tools Workshop, September 2019.
Power-Aware HPC on Intel Xeon Phi KNL Processors , Frankfurt, Germany, ISC High Performance (ISC17), Intel Booth Presentation, June 2017.
Software-Defined Events through PAPI for In-Depth Analysis of Application Performance , Basel, Switzerland, 5th Platform for Advanced Scientific Computing Conference (PASC18), July 2018.
Understanding Native Event Semantics , Knoxville, TN, 9th JLESC Workshop, April 2019.
What it Takes to keep PAPI Instrumental for the HPC Community , Collegeville, MN, The 2019 Collegeville Workshop on Sustainable Scientific Software (CW3S19), July 2019.
Is your scheduling good? How would you know? , Bordeaux, France, 14th Scheduling for Large Scale Systems Workshop, June 2019.
Dataflow Programming Paradigms for Computational Chemistry Methods,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-01, Knoxville, TN, University of Tennessee, May 2017.“
Software-Defined Events (SDEs) in MAGMA-Sparse,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-12: University of Tennessee, December 2018.“
Task placement of parallel multi-dimensional FFTs on a mesh communication network,” University of Tennessee Computer Science Technical Report, no. UT-CS-08-613, January 2008.“
Trace-based Performance Analysis for the Petascale Simulation Code FLASH,” Innovative Computing Laboratory Technical Report, no. ICL-UT-09-01, April 2009.“