Understanding the GPU microarchitecture to achieve bare-metal performance tuning

Publication
ACM SIGPLAN Notices (PPoPP’17)