Keren Zhou
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing
Advancements in deep learning are often associated with increasing model sizes. Training and deploying large models require …
Aditya Desai, Keren Zhou, Anshumali Shrivastava
DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications
GPUs are widely used in today’s computing platforms to accelerate applications in various domains. However, scarce GPU memory resources …
Mao Lin, Keren Zhou, Pengfei Su
Low Overhead and Context Sensitive Profiling of GPU-Accelerated Applications
As we near the end of Moore’s law scaling, the next-generation computing platforms are increasingly exploring heterogeneous …
Keren Zhou, Jonathon Anderson, Xiaozhu Meng, John Mellor-Crummey
ValueExpert: Exploring Value Patterns in GPU-Accelerated Applications
General-purpose GPUs have become common in modern computing systems to accelerate applications in many domains, including machine …
Keren Zhou, Yueming Hao, John Mellor-Crummey, Xiaozhu Meng, Xu Liu
GPA: A GPU Performance Advisor Based on Instruction Sampling
Developing efficient GPU kernels can be difficult because of the complexity of GPU architectures and programming models. Existing …
Keren Zhou, Xiaozhu Meng, Ryuichi Sai, John Mellor-Crummey
Measurement and Analysis of GPU-Accelerated OpenCL Computations on Intel GPUs
Graphics Processing Units (GPUs) have become a key technology for accelerating node performance in supercomputers, including the US …
Aaron Thomas Cherian, Keren Zhou, Dejan Grubisic, Xiaozhu Meng, John Mellor-Crummey
Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model
This paper reports on experiences gained and practices adopted when using the latest features of OpenMP to port a variety of HPC …
Barbara Chapman, Buu Pham, Charlene Yang, Christopher Daley, Colleen Bertoni, Dhruva Kulkarni, Dossay Oryspayev, Ed D'Azevedo, Johannes Doerfert, Keren Zhou, Kiran Ravikumar, Mark Gordon, Mauro Del Ben, Meifeng Lin, Melisa Alkan, Michael Kruse, Oscar Hernandez, P. K. Yeung, Paul Lin, Peng Xu, Swaroop Pophale, Tosaporn Sattasathuchana, Vivek Kale, William Huhn, Yun (Helen) He
A Tool for Top-down Performance Analysis of GPU-Accelerated Applications
To support performance measurement and analysis of GPU-accelerated applications, we extended the HPCToolkit performance tools with …
Keren Zhou, Mark Krentel, John Mellor-Crummey
GVPROF: A Value Profiler for GPU-Based Clusters
GPGPUs are widely used in high-performance computing systems to accelerate scientific and machine learning workloads. Developing …
Keren Zhou, Yueming Hao, John Mellor-Crummey, Xiaozhu Meng, Xu Liu
Tools for Top-down Performance Analysis of GPU-Accelerated Applications
This paper describes extensions to Rice University’s HPCToolkit performance tools to support measurement and analysis of …
Keren Zhou, Mark W. Krentel, John Mellor-Crummey