Keren Zhou
Publications
Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using F_2
Keren Zhou, Mario Lezcano, Adam Goucher, Akhmed Rakhmati, Jeff Niu, Justin Lebar, Pawel Szczerbuk, Peter Bell, Phil Tillet, Thomas Raoux, Zahi Moudallal
Triton-Sanitizer: A Fast and Device-Agnostic Memory Sanitizer for Triton with Rich Diagnostic Context
Hao Wu, Qidong Zhao, Songqing Chen, Yang Chen, Yueming Hao, Tony C. W. Liu, Sijia Chen, Adnan Aziz, Keren Zhou
PASTA: A Modular Program Analysis Tool Framework for Accelerators
Mao Lin, Hyeran Jeon, Keren Zhou
Proton: Towards Multi-level, Adaptive Profiling for Triton
Keren Zhou, Tianle Zhong, Hao Wu, Jihyeong Lee, Yue Guan, Yufei Ding, Corbin Robeck, Yuanwei Fang, Jeff Niu, Philippe Tillet
Mercury: Unlocking Multi-GPU Operator Optimization for LLMs via Remote Memory Scheduling
Remote memory scheduling framework that optimizes LLM operators across multi-GPU deployments.
Yue Guan, Xinwei Qiang, Zaifeng Pan, Daniels Johnson, Yuanwei Fang, Keren Zhou, Yuke Wang, Wanlu Li, Yufei Ding, Adnan Aziz
Comprehensive Evaluation of LLMs in HPC Code Performance Optimization
Benchmarks and evaluates large language models for optimizing high-performance computing code.
Bowen Cui, Tejas Ramesh, Oscar Hernandez, Keren Zhou
KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads
An open, compiler-focused infrastructure for profiling and optimizing GPU kernels on AI workloads.
Yue Guan, Yuanwei Fang, Keren Zhou, Corbin Robeck, Manman Ren, Zhongkai Yu, Yufei Ding, Adnan Aziz
DeepContext: A Context-aware, Cross-platform, and Cross-framework Tool for Performance Profiling and Analysis of Deep Learning Workloads
Performance profiling toolkit that unifies deep learning workload analysis across platforms and frameworks.
Qidong Zhao, Hao Wu, Yuming Hao, Zilingfeng Ye, Jiajia Li, Xu Liu, Keren Zhou
Triton-Viz: Visualizing GPU Programming in AI Courses
GPU programming is a critical component in AI system courses, which is notoriously difficult to learn and teach, given its unique …
Tejas Ramesh, Alexander Rush, Xu Liu, Binqian Yin, Keren Zhou, Shuyin Jiao
SS1: Accelerating Inference with Fast and Expressive Sketch Structured Transform
Tensor multiplication with learned weight matrices is the fundamental building block in deep learning models. These matrices can often …
Aditya Desai, Kimia Saedi, Apoorv Walia, Jihyeong Lee, Keren Zhou, Anshumali Shrivastava