The conventional style of using network connectivity in bringing artificial intelligence models to improve performance and efficiency needs some modification to meet the demands from the embedded systems to the automobile industry. Before directly jumping to the role of AI inference at the edge, let us understand the difference between training and inference. Machine learning training refers to the process of building an algorithm with frameworks and datasets, while in the case of inference, it takes the trained machine learning algorithms to make a prediction.
By getting AI inference at the edge, there is a significant improvement in the performance along with the reduced time (inference time) and reducing the dependency on the network connectivity.
Machine learning or artificial intelligence inference can run in on the cloud as well as on a device (hardware). However, when there is a requirement for fast data processing and predictions of the outcome, AI inference at the cloud can increase the inference time creating delays in the system. For non-time critical applications, AI inference at the cloud can always do the job, but in a world full of IoT devices and applications that require fast processing, AI inference at the edge solves the problem. In AI inference at the edge, specialized models are made to run at the point of data capture, which is an electronic embedded device in this case.
Google Edge TPU is Google’s custom-built ASIC that is designed to run AI at the edge with a target for a specific kind of application. When we talk about TPUs, CPUs and GPUs, it is important to note that only TPU is an ASIC while the other two are not. Also, in TPUs, the ALUs are directly connected to each other without using memory. This means that there is a low latency in transferring information.
With the need and increasing requirements to deploy high-quality AI inference at the edge, there have been several prototyping and production products from Coral that come with integrated Google Edge TPU. This small ASIC is built for low-power devices that can execute state-of-the-art mobile vision models such as MobileNet V2 at almost 400 FPS, in a power-efficient manner. According to the manufacturer, an individual Edge TPU can perform 4 trillion operations per second (4 TOPS), while utilizing only 2 watts of power. More information on ASIC and the production products can be found on the manufacturer’s page.
Read more: WHAT IS AI INFERENCE AT THE EDGE?