Recently, artificial intelligence (AI) has found numerous applications in several fields, from healthcare imaging and robotics to autonomous vehicles. These applications commonly require a large amount of computational capability, which is usually provided by cloud-based solutions relying on High Performance Computing (HPC) clusters. However, the fast growth of the Internet of Things (IoT) has concurrently increased the number of devices and applications generating, gathering, and analyzing data at the edge. This has generated interest in running AI computations at the edge: intelligence moves closer to the source of data creation.
What does “AI computing at the edge” mean?
Using AI at the edge means deploying AI algorithms directly on devices or edge servers, eliminating the need for continuous communication with remote cloud servers.
Pros: This decentralized approach improves efficiency, security, and privacy while reducing latency. For example, AI computations on self-driving cars must be instantaneous and cannot rely on cloud computing, which is subject to network latency and poor connectivity 🚖
Cons: Despite the clear advantages of Edge AI, devices running at the edge have resource, power, and computational constraints. These hardware constraints often make it difficult to deploy large AI models (such as deep neural networks) on edge devices: the size of the model is too large, or its computational requirements are too high.
Fortunately, there are techniques and tools that ease the deployment of AI models on edge devices.
Figure 1: IoT and edge devices.
Image from: https://www.focus.it/tecnologia/digital-life/internet-of-things-internet-delle-cose
Tools to deploy AI at the edge
To implement AI on edge systems, several tools and frameworks are available: some are completely open-source, while others have proprietary components. Some of the most popular are listed below:
TensorFlow Lite: TensorFlow Lite is a lightweight version of the popular TensorFlow framework specifically designed for mobile and edge devices. It enables efficient deployment of machine learning models on edge devices with limited resources.
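As an illustrative sketch (not an official recipe), the snippet below converts a toy Keras model to the .tflite format and runs one inference with the TFLite interpreter; the model architecture and file names are placeholders.

```python
# Hedged sketch, assuming a toy Keras model as a stand-in for a real,
# trained network; file names are placeholders.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# Convert the model to the compact .tflite flat-buffer format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # default optimizations (e.g. quantization)
with open("model.tflite", "wb") as f:
    f.write(converter.convert())

# On the edge device: load the model and run a single inference.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.random.rand(1, 4).astype(np.float32))
interpreter.invoke()
print(interpreter.get_tensor(out["index"]))
```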
Edge TPU Compiler: Developed by Google, the Edge TPU Compiler is part of the Coral project. It converts TensorFlow Lite models into a format compatible with Google's Edge TPU (Tensor Processing Unit), designed for edge inference. This tool helps bring AI applications from prototype to production. As an example product, Coral offers a USB accelerator featuring the Edge TPU that brings ML inferencing to existing systems.
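A hedged sketch of the runtime side, assuming a model already compiled with `edgetpu_compiler model.tflite` (which emits model_edgetpu.tflite) and a Coral accelerator attached; the file name is a placeholder and the code requires the pycoral package.

```python
# Hedged sketch: running an Edge TPU-compiled model via pycoral.
# Assumes a Coral accelerator is attached; the path is a placeholder.
import numpy as np
from pycoral.utils.edgetpu import make_interpreter

interpreter = make_interpreter("model_edgetpu.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Edge TPU models are typically fully int8-quantized, hence the uint8 input.
data = np.random.randint(0, 256, size=inp["shape"], dtype=np.uint8)
interpreter.set_tensor(inp["index"], data)
interpreter.invoke()
print(interpreter.get_tensor(out["index"]))
```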
PyTorch Mobile: an end-to-end workflow for taking ML models from training to deployment on iOS and Android mobile devices; it is also available for Linux. It provides APIs that cover common preprocessing and integration tasks needed for incorporating ML into mobile applications. Note that PyTorch Mobile targets only devices running Android, iOS, and Linux, whereas TensorFlow Lite can also target embedded systems.
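Below is a minimal sketch of the export side of that workflow, using a toy model as a stand-in: the model is traced to TorchScript, optimized for mobile, and saved for the lite interpreter; the resulting .ptl file would then be loaded by the iOS/Android runtime.

```python
# Minimal sketch of the export side, using a toy model as a stand-in;
# the resulting .ptl file would be loaded by the mobile runtime.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return torch.softmax(self.fc(x), dim=1)

model = TinyNet().eval()
# Trace the model into TorchScript, then apply mobile-specific optimizations.
scripted = torch.jit.trace(model, torch.rand(1, 4))
optimized = optimize_for_mobile(scripted)
# Save in the format expected by PyTorch Mobile's lite interpreter.
optimized._save_for_lite_interpreter("tinynet.ptl")
```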
ONNX (Open Neural Network Exchange): ONNX is an open format for representing deep learning models, providing interoperability between different frameworks; it seeks to offer a common language that any machine learning framework can use to describe its models. Its main use case is to facilitate the deployment of machine learning models in production: an ONNX interpreter (or runtime, such as ONNX Runtime) can be implemented and optimized specifically for the environment in which the model is deployed. With ONNX, a single deployment process can be built, regardless of the framework used to train the model; ONNX Runtime, in particular, allows you to run ONNX models on edge devices.
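To make the interoperability concrete, here is a small sketch, assuming PyTorch as the training framework and a toy linear model: the model is exported to ONNX and then executed with ONNX Runtime, which is all the target device needs.

```python
# Illustrative sketch, assuming PyTorch as the training framework and a
# toy linear model; names such as "model.onnx" are placeholders.
import numpy as np
import torch
import onnxruntime as ort

model = torch.nn.Linear(4, 2).eval()
torch.onnx.export(model, torch.rand(1, 4), "model.onnx",
                  input_names=["input"], output_names=["output"])

# On the target device, only ONNX Runtime is needed, not PyTorch.
session = ort.InferenceSession("model.onnx",
                               providers=["CPUExecutionProvider"])
result = session.run(["output"],
                     {"input": np.random.rand(1, 4).astype(np.float32)})
print(result[0])
```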
OpenVINO (Open Visual Inference and Neural Network Optimization): OpenVINO, developed by Intel, provides a toolkit for deploying high-performance computer vision and deep learning inference on edge devices. It supports various neural network frameworks.
A tutorial on how to run different models using the Intel® Distribution of OpenVINO™ toolkit is available here.
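As a complement to the tutorial, here is a minimal inference sketch with OpenVINO's Python API (2022+ releases); it reuses the placeholder "model.onnx" file from the ONNX example above.

```python
# Minimal sketch with OpenVINO's Python API (2022+ releases), reusing
# the placeholder "model.onnx" from the ONNX example above.
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.onnx")        # reads IR or ONNX models
compiled = core.compile_model(model, "CPU")  # target device: CPU, GPU, ...
output = compiled.output(0)

result = compiled([np.random.rand(1, 4).astype(np.float32)])[output]
print(result)
```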
Balena: Balena is a platform for deploying and managing edge applications on IoT devices. It simplifies the process of deploying AI models on a fleet of edge devices. It supports a variety of hardware platforms commonly used in edge computing. However, it is not fully open-source: it has both proprietary and open-source components.
The choice of tool depends on the specific requirements of the project, the type of targeted edge device, and the machine learning framework adopted.
References:
T. Sipola, J. Alatalo, T. Kokkonen and M. Rantonen, "Artificial Intelligence in the IoT Era: A Review of Edge AI Hardware and Software," 2022 31st Conference of Open Innovations Association (FRUCT), Helsinki, Finland, 2022, pp. 320-331, doi: 10.23919/FRUCT54823.2022.9770931.
Q. Liang, P. Shenoy and D. Irwin, "AI on the Edge: Characterizing AI-based IoT Applications Using Specialized Edge Architectures," 2020 IEEE International Symposium on Workload Characterization (IISWC), Beijing, China, 2020, pp. 145-156, doi: 10.1109/IISWC50251.2020.00023.
R. Riggio et al., "AI@EDGE: A Secure and Reusable Artificial Intelligence Platform for Edge Computing," 2021 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), Porto, Portugal, 2021, pp. 610-615, doi: 10.1109/EuCNC/6GSummit51104.2021.9482440.
Contributor:
Annachiara Ruospo
Assistant Professor (RTD-A)
Department of Control and Computer Engineering (DAUIN)
University: Politecnico di Torino, Italy
⏰ That's all for us, see you insAIde, next Tuesday, at 08:00.
Rocco Panetta, Federico Sartore, Vincenzo Tiani, LL.M., Davide Montanaro, Gabriele Franco