HPCToolkit
Publications
Low Overhead and Context Sensitive Profiling of GPU-Accelerated Applications
As we near the end of Moore’s law scaling, the next-generation computing platforms are increasingly exploring heterogeneous …
Measurement and Analysis of GPU-Accelerated OpenCL Computations on Intel GPUs
Graphics Processing Units (GPUs) have become a key technology for accelerating node performance in supercomputers, including the US …
Measurement and analysis of GPU-accelerated applications with HPCToolkit
To address the challenge of performance analysis on the US DOE’s forthcoming exascale supercomputers, Rice University has been …
Tools for Top-down Performance Analysis of GPU-Accelerated Applications
This paper describes extensions to Rice University’s HPCToolkit performance tools to support measurement and analysis of …
Talks
Practical Performance Optimization for Deep Learning Applications
Presented the Triton programming language and a deep learning profiler
Performance Measurement, Analysis, and Optimization of GPU-accelerated Applications
Presented a poster and a talk about my PhD research at SC’21
Tools for Top-down Performance Analysis of GPU-Accelerated Applications
Presented our ICS’20 work
A Tool for Top-down Performance Analysis of GPU-accelerated Applications
Presented a poster and a short talk about HPCToolkit’s GPU support at PPoPP’20
Optimizing GPU-accelerated Applications with HPCToolkit
Presented the prototype of HPCToolkit’s GPU support at PETASCALE’19
A Tool for Performance Analysis of GPU-accelerated Applications
Presented the prototype of HPCToolkit’s GPU support at CGO’19