ONNC Runtime Incorporated with Intel MKL Library

Summary: In this article, we describe how we leverage the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN) to significantly reduce ONNC runtime execution time.

Motivation

Intel has launched the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN), which makes good use of the Intel CPU instruction set to optimize deep neural network operators. In this article, we incorporate the core math functions from the MKL-DNN library into the ONNC runtime to speed up the calibrator.

Experiment

  1. Incorporate the MKL-DNN library into the ONNC Runtime source tree
  2. Call MKL-DNN primitives from ONNC Runtime operators (see the sketch after this list)
  3. Link ONNC Runtime against the MKL-DNN library
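The sketch below illustrates steps 2 and 3 under a few assumptions: it uses the DNNL 1.x C++ API (the library formerly distributed as MKL-DNN), a hypothetical helper named run_relu_mkldnn, and a ReLU operator as the example. The actual ONNC Runtime integration may wrap different primitives or use the C API instead.

    // Hypothetical sketch: replacing a naive ReLU loop in an ONNC Runtime
    // operator with an MKL-DNN eltwise primitive (DNNL 1.x C++ API; older
    // MKL-DNN releases ship <mkldnn.hpp> and the mkldnn namespace instead).
    // Step 3 amounts to linking with -ldnnl (or -lmkldnn for older releases).
    #include <dnnl.hpp>

    void run_relu_mkldnn(float *data, int N, int C, int H, int W) {
      using namespace dnnl;
      engine eng(engine::kind::cpu, 0);
      stream strm(eng);

      // Describe the tensor in NCHW layout, matching the runtime's buffers.
      memory::desc md({N, C, H, W}, memory::data_type::f32,
                      memory::format_tag::nchw);
      memory mem(md, eng, data);   // wrap the existing buffer, no copy

      // Build and execute an in-place ReLU primitive.
      auto relu_d  = eltwise_forward::desc(prop_kind::forward_inference,
                                           algorithm::eltwise_relu, md, 0.f);
      auto relu_pd = eltwise_forward::primitive_desc(relu_d, eng);
      eltwise_forward(relu_pd).execute(
          strm, {{DNNL_ARG_SRC, mem}, {DNNL_ARG_DST, mem}});
      strm.wait();
    }

Wrapping the runtime's existing buffer in a dnnl::memory object avoids an extra copy, which is why the primitive can run in place on the operator's output tensor.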

Results

Figure 1 compares the ONNC Runtime with MKL-DNN against the native ONNC Runtime. It shows the execution time of ONNC Runtime with and without the MKL-DNN library on a machine with a 3.4 GHz Intel Core i7 and 64 GB of DDR3 memory for the 12 ONNX models. Each experiment uses the ONNC runtime library to calibrate the scaling factors for INT8 inference with 100 images. The results show that on vgg19, MKL-DNN shortens the inference time by a factor of 20 compared with the original ONNC Runtime; in other words, calibration drops from two hours to six minutes.

Figure 1 ONNC runtime vs. ONNC runtime with MKL-DNN (x-axis: execution time in seconds; y-axis: model)

Table 1 lists the detailed execution times of ONNC Runtime with and without the MKL-DNN library for the same experiment.

Table 1 ONNC runtime vs. ONNC runtime with MKL-DNN (m: minutes, s: seconds)
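For context on what the calibration step computes, the following is a minimal, hypothetical sketch of max-abs INT8 scale calibration over the activations observed on the 100 calibration images; the statistic and the Int8Calibrator helper are assumptions for illustration, and the actual ONNC calibrator may work differently.

    // Hypothetical sketch of INT8 scale calibration: track the maximum
    // absolute activation value seen across the calibration images and
    // derive a per-tensor scaling factor for the INT8 range [-127, 127].
    #include <algorithm>
    #include <cmath>
    #include <cstddef>

    struct Int8Calibrator {
      float max_abs = 0.f;

      // Called once per calibration image with a layer's output tensor.
      void observe(const float *data, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i)
          max_abs = std::max(max_abs, std::fabs(data[i]));
      }

      // Scaling factor that maps float activations onto the INT8 range.
      float scale() const { return max_abs > 0.f ? 127.f / max_abs : 1.f; }
    };

The expensive part of calibration is running float inference over all 100 images to collect these statistics, which is exactly the work that MKL-DNN accelerates.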

Conclusion

The Open Neural Network Compiler (ONNC) is a compiler that connects the Open Neural Network Exchange Format (ONNX) to every deep learning accelerator (DLA). By incorporating the core math functions of Intel MKL-DNN into the ONNC runtime, we make better use of the Intel CPU instruction set and dramatically shorten INT8 calibration; on vgg19, calibration time drops from about two hours to six minutes.