Keren Zhou
GPU
Triton
Triton is a language and compiler for writing highly efficient custom deep learning primitives. Its aim is to provide an open-source environment for expressing tensor math workloads that offers high flexibility, developer productivity, and end-to-end performance.
Code
DOC
GVProf
We implemented GVProf, the first value profiler that locates value redundancy problems in applications running on GPU-based clusters. Our experiments show that GVProf incurs acceptable overhead and scales to large executions. GVProf provides useful insights to guide performance optimization. Under its guidance, we optimized several HPC and machine learning workloads, obtaining speedups of up to 1.93x.
Code
DOC
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing
Advancements in deep learning are often associated with increasing model sizes. Training and deploying large models require …
Aditya Desai, Keren Zhou, Anshumali Shrivastava
Cite
Project
URL
Towards Agile Development of Efficient Deep Learning Operators (Hardware Insights)
Presented a talk about Triton and requested feedback from Intel engineers
Jun 29, 2023 10:56 PM — 10:56 PM
Virtual
Keren Zhou
Project
Slides
Towards Agile Development of Efficient Deep Learning Operators (Call for Contributions)
Presented a talk about Triton and called for contributions to improving the language
Jun 19, 2023 10:56 PM — 10:56 PM
Lake Tahoe, California
Keren Zhou
Project
Slides
DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications
GPUs are widely used in today’s computing platforms to accelerate applications in various domains. However, scarce GPU memory resources …
Mao Lin, Keren Zhou, Pengfei Su
Cite
Project
DOI
URL
Towards Agile Development of Efficient Deep Learning Operators (Pre-MLIR)
Presented the Triton programming language and its next steps
Dec 2, 2022 10:03 PM — 10:03 PM
Virtual
Keren Zhou
Project
Slides
Practical Performance Optimization for Deep Learning Applications
Presented the Triton programming language and a deep learning profiler
May 18, 2022 10:02 PM — 10:02 PM
Virtual
Keren Zhou
Project
Project
Slides
ValueExpert: Exploring Value Patterns in GPU-accelerated Applications
Presented a talk about our value profiling tool at ASPLOS'22
Mar 2, 2022 12:00 AM — 12:00 AM
Virtual
Keren Zhou
Project
Slides
Accelerating High-order Stencils on GPUs
Finite-difference methods based on high-order stencils are commonly used for modeling of seismic wave propagation, weather forecasting, …
Ryuichi Sai, John Mellor-Crummey, Xiaozhu Meng, Keren Zhou, Mauricio Araya-Polo, Jie Meng
Cite
Project
DOI
URL