On the Challenges in Programming Mixed-Precision Deep Neural Networks
Recent work shows that Deep Neural Networks (DNNs) are resilient to reduced data precision, which encourages the use of low-precision data formats for more efficient computation, especially on custom hardware accelerators. Multiple low-precision data types can be mixed to fit the dynamic ranges of different DNN layers. Unfortunately, these formats are not normally supported by commonly used processors and Deep Learning (DL) frameworks. Each novel data type must therefore be manually implemented, optimized, and integrated with several components of a DL framework, which is tedious and error-prone.
This paper first reviews three major challenges in integrating novel data types into DL frameworks: generating high-performance arithmetic and typecast functions, reducing the recompilation time and bloated binary size caused by excessive template specialization, and optimizing DNN computational graphs with mixed precision. We then present Lowgen, a framework that tentatively addresses each of these challenges. For each challenge, we present a solution implemented and tested on our in-house, TensorFlow-like DL framework. Empirical evaluation shows that Lowgen can automatically generate efficient data type implementations that enable significant speed-up, which greatly lowers development effort and improves research efficiency on mixed-precision DNNs.
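To give a sense of what the "arithmetic and typecast functions" for a reduced-precision type involve, the following is a minimal Python sketch that uses bfloat16 as a stand-in format. It is an illustration only and is not Lowgen's generated code: the function names are hypothetical, and the addition is emulated by widening to float32 and rounding back, which is an assumption of this sketch rather than something the abstract describes.

```python
import math
import struct


def _float32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned int."""
    try:
        packed = struct.pack("<f", x)
    except OverflowError:
        # Magnitude too large for float32: saturate to a signed infinity.
        packed = struct.pack("<f", math.copysign(math.inf, x))
    return struct.unpack("<I", packed)[0]


def float32_to_bf16_bits(x: float) -> int:
    """Typecast: float32 -> bfloat16 bits, rounding to nearest even."""
    bits = _float32_bits(x)
    # NaN: truncate and force a mantissa bit so the result stays a NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float32(b: int) -> float:
    """Typecast: bfloat16 bits -> float32 (exact, pads the mantissa with zeros)."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def bf16_add(a: int, b: int) -> int:
    """Arithmetic: add two bfloat16 values by widening, adding, and rounding back."""
    return float32_to_bf16_bits(bf16_bits_to_float32(a) + bf16_bits_to_float32(b))


if __name__ == "__main__":
    one = float32_to_bf16_bits(1.0)
    print(hex(one), bf16_bits_to_float32(bf16_add(one, one)))
```

Even for this single, well-known format, the cast needs explicit handling of rounding, NaN, and overflow, which suggests why hand-writing such functions for every new data type quickly becomes tedious.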
Tue 16 Jun, 11:30 - 12:30 (Pacific Time, US & Canada)