Enabling machine learning on the edge using SRAM conserving efficient neural networks execution approach
Breslin, John G.
Ali, Muhammad Intizar
Sudharsan, Bharath, Patel, Pankesh, Breslin, John G., & Ali, Muhammad Intizar. (2021). Enabling machine learning on the edge using SRAM conserving efficient neural networks execution approach. Paper presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Bilbao, Spain, Virtual, 13-17 September. DOI: 10.13025/azew-5w09
Edge analytics refers to the application of data analytics and Machine Learning (ML) algorithms on Internet of Things (IoT) devices. The concept of edge analytics is gaining popularity due to its ability to perform AI-based analytics at the device level, enabling autonomous decision-making without depending on the cloud. However, the majority of IoT devices are embedded systems with a low-cost microcontroller unit (MCU) or a small CPU as their brain, which are often incapable of handling complex ML algorithms. In this paper, we propose an approach for the efficient execution of already deeply compressed, large neural networks (NNs) on tiny IoT devices. After NNs are optimized using state-of-the-art deep model compression methods, executing the resultant models on MCUs or small CPUs with the model execution sequence produced by our approach conserves more SRAM. In an evaluation of nine popular models, comparing the default NN execution sequence with the sequence produced by our approach, we found that 1.61-38.06% less SRAM was used to produce inference results, the inference time was reduced by 0.28-4.9 ms, and energy consumption was reduced by 4-84 mJ. Despite achieving such high levels of conserved SRAM, our meth