
ONNX for Model Interoperability & Faster Inference


Introduction

Chances are you have heard of ONNX but aren't sure what it does or how to use it. Don't worry, you are in the right place. In this beginner-friendly article, you will get a clear understanding of ONNX. Let's dive in.

Assume that you built a deep learning model for face detection using the TensorFlow framework. But unfortunately, you have to deploy this model in an environment that uses PyTorch. How can you handle this scenario? There are two ways to handle it:

  • Convert the existing model to a standard format that works in any environment.
  • Convert the model from one framework (TensorFlow) to the desired framework (PyTorch).

This is exactly what ONNX does. Using ONNX, you can convert the TensorFlow model to the ONNX format (an open standard format for interoperability) and then use it directly for inference/prediction. Alternatively, you can convert the ONNX model further to PyTorch and use the PyTorch model for inference/prediction.

Thanks to the coordinated efforts of partners such as Microsoft, Facebook, and AWS, among others, we can now build machine learning models without worrying about the underlying framework.

ONNX

ONNX stands for Open Neural Network Exchange. It is mainly used for three different tasks:

  • Converting a model from any framework to the ONNX format
  • Converting an ONNX model to any desired framework
  • Faster inference using an ONNX model on supported runtimes

Though the name says neural network, ONNX can be used for both deep learning and traditional machine learning models. So, don't be confused by the name.

As you can see below, ONNX supports most of the popular frameworks. So, if you are looking to convert PyTorch, MXNet, MATLAB, XGBoost, or CatBoost models to ONNX, refer to the official tutorials.

[Figure: frameworks supported by ONNX]

ONNX Runtime

The diagram below shows all the runtimes that support ONNX. These runtime engines enable high-performance deep learning inference, as they come with built-in optimizations. ONNX Runtime is one of these supported engines, and it is the one we will use for inference in this article.

[Figure: runtimes supported by ONNX]

Now that you have a general idea of ONNX and ONNX Runtime, let's jump to the main topic of this article. We will work through Scikit-learn and TensorFlow examples, converting the models to the ONNX format and making inferences using ONNX Runtime.

In future articles, we will explore how to convert ONNX models to any desired framework, optimizations, transformer models, and more.

Scikit-learn to ONNX

Installation
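
The converter and the runtime are both available on PyPI. A minimal setup, assuming a standard Python environment:

```
pip install scikit-learn skl2onnx onnxruntime
```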

Code

The code below is straightforward. We first build the Scikit-learn model (Step 1), convert it to the ONNX format (Step 2), and finally make predictions/inferences using ONNX Runtime (Step 3). As you can see, inference using the ONNX format is 6-7 times faster than with the original Scikit-learn model, and the speed-up becomes even more impressive on bigger datasets.
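
Here is a minimal sketch of the three steps, assuming an Iris dataset and a RandomForestClassifier for illustration; the exact model, dataset, and timing code are assumptions, not the article's original listing:

```python
import time
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime as rt

# Step 1: build and train the Scikit-learn model
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = RandomForestClassifier().fit(X_train, y_train)

# Step 2: convert the trained model to the ONNX format
initial_type = [("float_input", FloatTensorType([None, X_train.shape[1]]))]
onnx_model = convert_sklearn(model, initial_types=initial_type)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Step 3: make predictions with ONNX Runtime and compare timings
sess = rt.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

start = time.perf_counter()
model.predict(X_test)
print(f"Scikit-learn: {time.perf_counter() - start:.6f}s")

start = time.perf_counter()
sess.run(None, {input_name: X_test.astype(np.float32)})
print(f"ONNX Runtime: {time.perf_counter() - start:.6f}s")
```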

For more details on skl2onnx refer to this documentation.

TensorFlow to ONNX

Installation
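
As with the Scikit-learn example, the converter and runtime come from PyPI. A minimal setup, assuming TensorFlow 2.x is already installed:

```
pip install tf2onnx onnxruntime
```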

Code

The code below is also self-explanatory. Here too, inference using ONNX Runtime is 6-7 times faster than with the original TensorFlow model. As mentioned earlier, the speed-up becomes even more impressive on bigger datasets.
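
Here is a minimal sketch, assuming a small Keras model on the MNIST digits for illustration; the exact architecture and timing code are assumptions, not the article's original listing:

```python
import time
import numpy as np
import tensorflow as tf
import tf2onnx
import onnxruntime as rt

# Step 1: build and train a simple Keras model
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
X_train = X_train.reshape(-1, 784).astype(np.float32) / 255.0
X_test = X_test.reshape(-1, 784).astype(np.float32) / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X_train, y_train, epochs=1, verbose=0)

# Step 2: convert the Keras model to the ONNX format
spec = (tf.TensorSpec((None, 784), tf.float32, name="input"),)
tf2onnx.convert.from_keras(model, input_signature=spec,
                           output_path="tf_model.onnx")

# Step 3: make predictions with ONNX Runtime and compare timings
sess = rt.InferenceSession("tf_model.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

start = time.perf_counter()
model.predict(X_test, verbose=0)
print(f"TensorFlow: {time.perf_counter() - start:.6f}s")

start = time.perf_counter()
sess.run(None, {input_name: X_test})
print(f"ONNX Runtime: {time.perf_counter() - start:.6f}s")
```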

For more details on tf2onnx refer to this documentation.

Other tools

ONNX also provides additional tools for optimization and visualization. Optimization will be covered in future articles. The two visualization tools, NETRON and VisualDL, can be used to visualize any machine learning or deep learning model.
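
A quick way to try NETRON from Python, assuming the model was saved as tf_model.onnx as in the sketch above (NETRON is also available as a desktop app and at netron.app):

```python
# Install with: pip install netron
import netron

# Starts a local server and opens the model graph in the browser
netron.start("tf_model.onnx")
```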

The screenshot below shows the NETRON visualization of the TensorFlow deep learning model we created earlier.

[Screenshot: NETRON visualization of the TensorFlow model]

Conclusion

In this article, you learned what ONNX is and how it benefits developers. We then worked through examples of ONNX conversion and saw that inference using ONNX Runtime is much faster than with the original frameworks. Finally, we visualized the machine learning model using the NETRON visualizer.


Chetan Ambi

A Software Engineer & Team Lead with over 10 years of IT experience and a technical blogger with a passion for cutting-edge technology. Currently working in the fields of Python, Machine Learning & Data Science. Chetan Ambi holds a Bachelor of Engineering degree in Computer Science.