HPC

ValueExpert: exploring value patterns in GPU-accelerated applications

General-purpose GPUs have become common in modern computing systems to accelerate applications in many domains, including machine learning, high-performance computing, and autonomous driving. However, inefficiencies abound in GPU-accelerated …

Measurement and Analysis of GPU-Accelerated OpenCL Computations on Intel GPUs

Graphics Processing Units (GPUs) have become a key technology for accelerating node performance in supercomputers, including the US Department of Energy’s forthcoming exascale systems. Since the execution model for GPUs differs from that for …

Measurement and analysis of GPU-accelerated applications with HPCToolkit

To address the challenge of performance analysis on the US DOE’s forthcoming exascale supercomputers, Rice University has been extending its HPCToolkit performance tools to support measurement and analysis of GPU-accelerated applications. To help …

Accelerating High-Order Stencils on GPUs

Finite-difference methods based on high-order stencils are commonly used for modeling of seismic wave propagation, weather forecasting, computational fluid dynamics, convolutional neural networks, and others. Nowadays, the community commonly employs …

An Automated Tool for Analysis and Tuning of GPU-accelerated Code in HPC Applications

The US Department of Energy’s fastest supercomputers and forthcoming exascale systems employ Graphics Processing Units (GPUs) to increase the computational performance of compute nodes. However, the complexity of GPU architectures makes tailoring …

Analyzing GPU-accelerated Applications Using HPCToolkit

Using HPCToolkit to Measure and Analyze the Performance of GPU-accelerated Applications (tutorial, Mar-Apr 2021)

GPA: A GPU Performance Advisor Based on Instruction Sampling

Developing efficient GPU kernels can be difficult because of the complexity of GPU architectures and programming models. Existing performance tools only provide coarse-grained tuning advice at the kernel level, if any. In this paper, we describe GPA, …

GVProf: A Value Profiler for GPU-Based Clusters

GPGPUs are widely used in high-performance computing systems to accelerate scientific and machine learning workloads. Developing efficient GPU kernels is critically important to obtain bare-metal performance on GPU-based clusters. In this paper, we …

Tools for Top-down Performance Analysis of GPU-Accelerated Applications

This paper describes extensions to Rice University's HPCToolkit performance tools to support measurement and analysis of GPU-accelerated applications. To help developers understand the performance of accelerated applications as a whole, HPCToolkit's …