%0 Generic
%D 2022
%T Extending MAGMA Portability with OneAPI
%A Anna Fortenberry
%A Stanimire Tomov
%A Kwai Wong
%I The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22), ACM Student Research Competition
%C Dallas, TX
%8 2022-11
%G eng
%U https://sc22.supercomputing.org/proceedings/src_poster/poster_files/spostu105s3-file1.pdf

%0 Generic
%D 2021
%T Linear Algebra Preparation for Emergent Neural Network Architectures: MAGMA, BLAS, and Batched GPU Computing
%A Stanimire Tomov
%A Kwai Wong
%A Rocco Febbo
%A Julian Halloy
%I LAPENNA Workshop
%C Virtual
%8 2021-11
%G eng

%0 Generic
%D 2020
%T How to Build Your Own Deep Neural Network
%A Kwai Wong
%A Stanimire Tomov
%A Daniel Nichols
%A Rocco Febbo
%A Florent Lopez
%A Julian Halloy
%A Xianfeng Ma
%K AI
%K Deep Neural Networks
%K dense linear algebra
%K HPC
%K ML
%I PEARC20
%8 2020-07
%G eng

%0 Generic
%D 2020
%T Integrating Deep Learning in Domain Science at Exascale (MagmaDNN)
%A Stanimire Tomov
%A Kwai Wong
%A Jack Dongarra
%A Rick Archibald
%A Edmond Chow
%A Eduardo D'Azevedo
%A Markus Eisenbach
%A Rocco Febbo
%A Florent Lopez
%A Daniel Nichols
%A Junqi Yin
%X We will present some of the current challenges in the design and integration of deep learning AI with traditional HPC simulations. We evaluate existing packages for their readiness to run deep learning models and applications efficiently on large-scale HPC systems, identify challenges, and propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems and upcoming exascale systems. These developments, along with existing HPC AI software capabilities, have been integrated into MagmaDNN, an open-source HPC deep learning framework. Many deep learning frameworks are targeted towards data scientists and fall short in providing quality integration into existing HPC workflows. This paper discusses the necessities of an HPC deep learning framework and how these can be provided, e.g., as in MagmaDNN, through a deep integration with existing HPC libraries such as MAGMA and its modular memory management, MPI, CuBLAS, CuDNN, MKL, and HIP. Advancements are also illustrated through the use of algorithmic enhancements in reduced- and mixed-precision, as well as asynchronous optimization methods. Finally, we present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications at ORNL and UTK with AI. The approaches and future challenges are illustrated in materials science, imaging, and climate applications.
%I DOD HPCMP seminar
%C virtual
%8 2020-12
%G eng

%0 Generic
%D 2020
%T Integrating Deep Learning in Domain Sciences at Exascale
%A Rick Archibald
%A Edmond Chow
%A Eduardo D'Azevedo
%A Jack Dongarra
%A Markus Eisenbach
%A Rocco Febbo
%A Florent Lopez
%A Daniel Nichols
%A Stanimire Tomov
%A Kwai Wong
%A Junqi Yin
%X This paper presents some of the current challenges in designing deep learning artificial intelligence (AI) and integrating it with traditional high-performance computing (HPC) simulations. We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently, identify challenges, and propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems and upcoming exascale systems. These developments, along with existing HPC AI software capabilities, have been integrated into MagmaDNN, an open-source HPC deep learning framework. Many deep learning frameworks are targeted at data scientists and fall short in providing quality integration into existing HPC workflows. This paper discusses the necessities of an HPC deep learning framework and how those needs can be provided (e.g., as in MagmaDNN) through a deep integration with existing HPC libraries, such as MAGMA and its modular memory management, MPI, CuBLAS, CuDNN, MKL, and HIP. Advancements are also illustrated through the use of algorithmic enhancements in reduced- and mixed-precision, as well as asynchronous optimization methods. Finally, we present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications at ORNL and UTK with AI. The approaches and future challenges are illustrated in materials science, imaging, and climate applications.
%B Innovative Computing Laboratory Technical Report
%I University of Tennessee
%8 2020-08
%G eng

%0 Conference Paper
%D 2020
%T Integrating Deep Learning in Domain Sciences at Exascale
%A Rick Archibald
%A Edmond Chow
%A Eduardo D'Azevedo
%A Jack Dongarra
%A Markus Eisenbach
%A Rocco Febbo
%A Florent Lopez
%A Daniel Nichols
%A Stanimire Tomov
%A Kwai Wong
%A Junqi Yin
%X This paper presents some of the current challenges in designing deep learning artificial intelligence (AI) and integrating it with traditional high-performance computing (HPC) simulations. We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently, identify challenges, and propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems and upcoming exascale systems. These developments, along with existing HPC AI software capabilities, have been integrated into MagmaDNN, an open-source HPC deep learning framework. Many deep learning frameworks are targeted at data scientists and fall short in providing quality integration into existing HPC workflows. This paper discusses the necessities of an HPC deep learning framework and how those needs can be provided (e.g., as in MagmaDNN) through a deep integration with existing HPC libraries, such as MAGMA and its modular memory management, MPI, CuBLAS, CuDNN, MKL, and HIP. Advancements are also illustrated through the use of algorithmic enhancements in reduced- and mixed-precision, as well as asynchronous optimization methods. Finally, we present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications at ORNL and UTK with AI. The approaches and future challenges are illustrated in materials science, imaging, and climate applications.
%B 2020 Smoky Mountains Computational Sciences and Engineering Conference (SMC 2020)
%8 2020-08
%G eng

%0 Conference Paper
%D 2019
%T Hands-on Research and Training in High-Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments
%A Kwai Wong
%A Stanimire Tomov
%A Jack Dongarra
%B ISC High Performance
%I Springer International Publishing
%C Frankfurt, Germany
%8 2019-06
%G eng

%0 Generic
%D 2019
%T MagmaDNN 0.2 High-Performance Data Analytics for Manycore GPUs and CPUs
%A Lucien Ng
%A Sihan Chen
%A Alex Gessinger
%A Daniel Nichols
%A Sophia Cheng
%A Anu Meenasorna
%A Kwai Wong
%A Stanimire Tomov
%A Azzam Haidar
%A Eduardo D'Azevedo
%A Jack Dongarra
%I University of Tennessee
%8 2019-01
%G eng
%R 10.13140/RG.2.2.14906.64961

%0 Conference Paper
%D 2019
%T MagmaDNN: Accelerated Deep Learning Using MAGMA
%A Daniel Nichols
%A Kwai Wong
%A Stanimire Tomov
%A Lucien Ng
%A Sihan Chen
%A Alex Gessinger
%B Practice and Experience in Advanced Research Computing (PEARC ’19)
%I ACM
%C Chicago, IL
%8 2019-07
%G eng

%0 Conference Paper
%D 2019
%T MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing
%A Daniel Nichols
%A Natalie-Sofia Tomov
%A Frank Betancourt
%A Stanimire Tomov
%A Kwai Wong
%A Jack Dongarra
%X In this paper, we present work towards the development of a new data analytics and machine learning (ML) framework, called MagmaDNN. Our main goal is to provide scalable, high-performance data analytics and ML solutions for scientific applications running on current and upcoming heterogeneous many-core GPU-accelerated architectures. To this end, since many of the functionalities needed are based on standard linear algebra (LA) routines, we designed MagmaDNN to derive its performance power from the MAGMA library. The close integration provides the fundamental (scalable high-performance) LA routines available in MAGMA as a backend to MagmaDNN. We present some design issues for performance and scalability that are specific to ML using Deep Neural Networks (DNN), as well as the MagmaDNN designs towards overcoming them. In particular, MagmaDNN uses well-established HPC techniques from the area of dense LA, including task-based parallelization, DAG representations, scheduling, mixed-precision algorithms, asynchronous solvers, and autotuned hyperparameter optimization. We illustrate these techniques and their incorporation and use to outperform other frameworks currently available.
%B ISC High Performance
%I Springer International Publishing
%C Frankfurt, Germany
%8 2019-06
%G eng
%R 10.1007/978-3-030-34356-9_37

%0 Conference Paper
%D 2019
%T OpenDIEL: A Parallel Workflow Engine and Data Analytics Framework
%A Frank Betancourt
%A Kwai Wong
%A Efosa Asemota
%A Quindell Marshall
%A Daniel Nichols
%A Stanimire Tomov
%B Practice and Experience in Advanced Research Computing (PEARC ’19)
%I ACM
%C Chicago, IL
%8 2019-07
%G eng

%0 Generic
%D 2018
%T Accelerating 2D FFT: Exploit GPU Tensor Cores through Mixed-Precision
%A Xaiohe Cheng
%A Anumeena Soma
%A Eduardo D'Azevedo
%A Kwai Wong
%A Stanimire Tomov
%I The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), ACM Student Research Poster
%C Dallas, TX
%8 2018-11
%G eng

%0 Generic
%D 2017
%T MagmaDNN – High-Performance Data Analytics for Manycore GPUs and CPUs
%A Lucien Ng
%A Kwai Wong
%A Azzam Haidar
%A Stanimire Tomov
%A Jack Dongarra
%I 2017 Summer Research Experiences for Undergraduate (REU), Presentation
%C Knoxville, TN
%8 2017-12
%G eng