Keren Zhou
Deep Learning
PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation
This paper introduces two extensions to the popular PyTorch machine learning framework, TorchDynamo and TorchInductor, which implement …
Jason Ansel, Edward Yang, Horace He, Natalia Gimelshein, Animesh Jain, Michael Voznesensky, Bin Bao, Peter Bell, David Berard, Evgeni Burovski, Geeta Chauhan, Anjali Chourdia, Will Constable, Alban Desmaison, Zachary DeVito, Elias Ellison, Will Feng, Jiong Gong, Michael Gschwind, Brian Hirsh, Sherlock Huang, Kshiteej Kalambarkar, Laurent Kirsch, Michael Lazos, Mario Lezcano, Yanbo Liang, Jason Liang, Yinghai Lu, C. K. Luk, Bert Maher, Yunjie Pan, Christian Puhrsch, Matthias Reso, Mark Saroufim, Marcos Yukio Siraichi, Helen Suk, Shunting Zhang, Michael Suo, Phil Tillet, Xu Zhao, Eikan Wang, Keren Zhou, Richard Zou, Xiaodong Wang, Ajit Mathews, William Wen, Gregory Chanan, Peng Wu, Soumith Chintala
Technical Review on PyTorch 2.0 and Triton
High-level overview of PyTorch 2.0 and Triton integration
Aug 7, 2023 10:03 PM — 10:03 PM
Virtual
Keren Zhou
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing
Advancements in deep learning are often associated with increasing model sizes. Training and deploying large models require …
Aditya Desai, Keren Zhou, Anshumali Shrivastava
Towards Agile Development of Efficient Deep Learning Operators (Hardware Insights)
Presented a talk about Triton and requested feedback from Intel engineers
Jun 29, 2023 10:56 PM — 10:56 PM
Virtual
Keren Zhou
Towards Agile Development of Efficient Deep Learning Operators (Call for Contributions)
Presented a talk about Triton and called for contributions to improving the language
Jun 19, 2023 10:56 PM — 10:56 PM
Lake Tahoe, California
Keren Zhou
Paw-Net: Stacking Ensemble Deep Learning for Segmenting Scanning Electron Microscopy Images of Fine-grained Shale Samples
Segmentation of scanning electron microscopy (SEM) images is critical yet time-consuming for geological analyses, as it needs to …
Binqian Yin, Qinhong Hu, Yingying Zhu, Chen Zhao, Keren Zhou
A Performance Analysis Framework for Exploiting GPU Microarchitectural Capability
Presented our work on static performance analysis for GPUs at ICS17
Jul 20, 2017 9:36 PM — 9:36 PM
Chicago, IL, USA
Keren Zhou
Deep Learning on Modern Architectures
Discussed how state-of-the-art deep learning libraries optimize computations by utilizing architectural features.
Apr 1, 2017 10:00 PM — 10:00 PM
Institute of Computing Technology, Chinese Academy of Sciences
Keren Zhou
Convolution Methods
Introduced several convolution methods and analyzed their computational complexity, memory consumption, and data access patterns.
Jun 1, 2016 9:58 PM — 9:58 PM
Institute of Computing Technology, Chinese Academy of Sciences
Keren Zhou