On the Challenges in Programming Mixed-Precision Deep Neural Networks (MAPL 2020)

Mon 15 - Fri 19 June 2020

Who

Ruizhe Zhao, Wayne Luk, Chao Xiong, Xinyu Niu, Kuen Hung Tsoi

Track

MAPL 2020

Time Zone

The program is currently displayed in (GMT-07:00) Pacific Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 16 Jun 2020 11:30 - 12:00 at MAPL live stream - Compilers for Deep Learning Frameworks Chair(s): Charles Sutton

Abstract

Recent work shows that Deep Neural Networks (DNNs) are resilient to reduced data precision, which encourages the use of low-precision data formats for more efficient computation, especially on custom hardware accelerators. Multiple low-precision data types can be mixed to fit the dynamic range of different DNN layers. Unfortunately, these formats are not normally supported by commonly used processors and Deep Learning (DL) frameworks. Once a novel data type is devised, it must be manually implemented, optimized, and integrated with several components of a DL framework, which is tedious and error-prone.

This paper first reviews three major challenges in integrating novel data types into DL frameworks: generating high-performance arithmetic and typecast functions, reducing the recompilation time and bloated binary size caused by excessive template specialization, and optimizing mixed-precision DNN computational graphs. We present Lowgen, a framework that tentatively addresses each of these challenges. For each challenge, we present a solution implemented and tested on our in-house, TensorFlow-like DL framework. Empirical evaluation shows that Lowgen can automatically generate efficient data type implementations that enable significant speed-up, which greatly lowers development effort and improves research efficiency on mixed-precision DNNs.
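To make the idea of mixing low-precision types to fit each layer's dynamic range concrete, here is a minimal sketch. It is not Lowgen and not the authors' method: it assumes a simple signed 8-bit fixed-point format, and the names `choose_frac_bits`, `quantize`, `dequantize`, and the example layer values are hypothetical, chosen only for illustration.

```python
import math

def choose_frac_bits(values, total_bits=8):
    """Pick fractional bits for a signed fixed-point format (assumed layout:
    1 sign bit, some integer bits, the rest fractional) so that the largest
    magnitude in `values` still fits."""
    max_abs = max(abs(v) for v in values)
    if max_abs > 0:
        # Integer bits needed to represent the largest magnitude.
        int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    else:
        int_bits = 0
    return total_bits - 1 - int_bits

def quantize(x, frac_bits, total_bits=8):
    """Typecast float -> fixed-point integer code, saturating at the range limits."""
    q = round(x * 2 ** frac_bits)  # Python rounds halves to the nearest even integer
    lo = -(2 ** (total_bits - 1))
    hi = 2 ** (total_bits - 1) - 1
    return max(lo, min(hi, q))

def dequantize(q, frac_bits):
    """Typecast fixed-point integer code -> float."""
    return q / 2 ** frac_bits

# Hypothetical per-layer values with different dynamic ranges.
layers = {
    "conv1": [0.5, -0.9, 0.25],
    "fc1": [5.0, -3.0, 1.5],
}

# Each layer gets its own format: more fractional bits where the range is small.
formats = {name: choose_frac_bits(vals) for name, vals in layers.items()}

for name, vals in layers.items():
    codes = [quantize(v, formats[name]) for v in vals]
    restored = [dequantize(c, formats[name]) for c in codes]
    print(name, "frac_bits =", formats[name], codes, restored)
```

In this sketch the layer with small values keeps more fractional bits and therefore finer resolution, while the layer with larger values trades resolution for range. A real framework would additionally need fast arithmetic kernels and graph-level placement of typecasts, which are the challenges the abstract describes.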

Ruizhe Zhao

Imperial College London

United Kingdom

Wayne Luk

Imperial College London

Chao Xiong

Corerain Technologies

Xinyu Niu

Corerain Technologies

Kuen Hung Tsoi

Corerain Technologies

Session Program

Tue 16 Jun
Displayed time zone: Pacific Time (US & Canada)

11:30 - 12:30	Compilers for Deep Learning Frameworks (MAPL) at MAPL live stream, Chair(s): Charles Sutton, Google Research

11:30 30m Talk		On the Challenges in Programming Mixed-Precision Deep Neural Networks MAPL Ruizhe Zhao Imperial College London, Wayne Luk Imperial College London, Chao Xiong Corerain Technologies, Xinyu Niu Corerain Technologies, Kuen Hung Tsoi Corerain Technologies
12:00 30m Talk		Semi-static Type, Shape and Symbolic Shape Inference for Dynamic Computation Graphs MAPL Momoko Hattori The University of Tokyo, Shimpei Sawada Preferred Networks, Shinichiro Hamaji Preferred Networks, Masahiro Sakai Preferred Networks, Shunsuke Shimizu Preferred Networks