ONNC Quantization to INT8 Experiment

ONNX Runtime (FP32) vs. quantized FPGA (INT8): TOP1/TOP5 accuracy difference

We run the un-quantized models on ONNX Runtime as the golden reference and run our quantized models on the FPGA as the test group. Figure 1 shows that six of the seven models lose less than 2% precision. Only VGG19 shows a higher precision loss, and even there the difference stays within 3%.
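The article does not spell out ONNC's calibration scheme, but the FP32-to-INT8 conversion it measures can be illustrated with a generic symmetric per-tensor quantization sketch (function names and the error bound shown are illustrative, not ONNC's actual implementation):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: map FP32 values
    into [-127, 127] using a single scale, then round."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 values from INT8 codes."""
    return q.astype(np.float32) * scale

# Toy tensor standing in for a layer's weights or activations.
x = np.array([0.5, -1.0, 0.25, 0.9], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# For in-range values the round-trip error is at most scale / 2,
# which is why small accuracy differences (1-3%) are expected.
error = np.abs(x - x_hat).max()
```

The small round-trip error per tensor is the mechanism behind the modest TOP1/TOP5 differences the experiment reports.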

Results

ONNX Experiment Environment

Dataset: All experiments are run on 12,500 images from the ILSVRC2012 dataset.

FPGA Experiment Environment

Dataset: All experiments are run on 12,500 images from the ILSVRC2012 dataset.

Experiment Method

TOP-N accuracy = (number of images whose ground-truth label appears among the model's N highest-scoring predictions) / (total number of images)
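The TOP-N metric above can be sketched directly in NumPy (the scores and labels below are a made-up toy example, not the experiment's data):

```python
import numpy as np

def topk_accuracy(logits, labels, k=1):
    """TOP-k = images whose true label is among the k
    highest-scoring predictions / total number of images."""
    # indices of the k largest scores per image
    topk = np.argsort(logits, axis=1)[:, -k:]
    hits = sum(label in row for label, row in zip(labels, topk))
    return hits / len(labels)

# Toy example: 3 images, 4 classes (hypothetical scores).
logits = np.array([[0.1, 0.6, 0.2, 0.1],
                   [0.5, 0.2, 0.2, 0.1],
                   [0.1, 0.1, 0.3, 0.5]])
labels = np.array([1, 2, 3])

top1 = topk_accuracy(logits, labels, k=1)  # image 2's label is ranked second, so 2/3
top2 = topk_accuracy(logits, labels, k=2)  # all labels inside the top 2, so 3/3
```

The experiment applies the same counting with k=1 and k=5 over the 12,500 ILSVRC2012 images, once per runtime, and reports the difference.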


The Open Neural Network Compiler (ONNC) is a compiler that connects the Open Neural Network Exchange Format (ONNX) to every deep learning accelerator (DLA).
