Embedded Learning
Emerging edge intelligence applications motivate us to deploy deep learning models on resource-constrained embedded systems. However, state-of-the-art deep neural networks (DNNs) often demand substantial computation, memory, and energy. Great progress has been made in reducing the resource requirements for embedded inference through compression techniques such as quantization and pruning. As a result, these methods have found many applications, for example in natural language processing, image recognition, and automatic control.
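To make the quantization idea concrete, the C sketch below shows symmetric per-tensor int8 quantization of a float weight array: the int8 range [-127, 127] is mapped onto [-max|w|, max|w|], and the resulting scale is kept so that weights can later be dequantized as w ≈ scale * q. This is a minimal illustration of the general technique, not code from our release; the helper name quantize_int8 is our own.

#include <math.h>
#include <stdint.h>
#include <stddef.h>

/* Symmetric per-tensor int8 quantization (illustrative sketch).
 * Returns the scale that maps int8 values back to float. */
static float quantize_int8(const float *w, int8_t *q, size_t n)
{
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = fabsf(w[i]);
        if (a > max_abs) max_abs = a;
    }
    float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    for (size_t i = 0; i < n; i++) {
        long r = lroundf(w[i] / scale);
        if (r > 127)  r = 127;   /* clamp to the symmetric int8 range */
        if (r < -127) r = -127;
        q[i] = (int8_t)r;
    }
    return scale; /* dequantize later as w[i] ≈ scale * q[i] */
}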
Equally important, but much less developed, is the area of embedded learning and adaptation. In particular, when facing data privacy concerns and/or limited communication bandwidth, models must be retrained or updated directly on these resource-constrained embedded devices. We are exploring novel algorithms and implementations that account for the stringent computation, memory, and communication constraints.
Among network compression techniques, int8 quantization is a popular choice for off-the-shelf microcontroller platforms, since it strikes a good trade-off among accuracy, compression ratio, and hardware efficiency. We thus provide example code for on-device training on ultra-low-power 32-bit ARM Cortex-M microcontrollers, which learns the last layer of an int8-quantized DNN and re-quantizes the trained layer back into int8 format.
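The sketch below illustrates one common way such last-layer training can be organized; it is our own simplified rendering under stated assumptions, not the released code itself. The frozen int8 backbone produces features feat with scale s_x; the last layer is kept as float shadow weights during training and updated by single-example SGD on a softmax cross-entropy loss. The dimensions N_FEAT and N_CLASS and all function names are hypothetical.

#include <math.h>
#include <stdint.h>

#define N_FEAT  64   /* hypothetical feature dimension  */
#define N_CLASS 10   /* hypothetical number of classes  */

static float W[N_CLASS][N_FEAT];   /* float shadow weights */
static float B[N_CLASS];           /* float shadow biases  */

/* One SGD step on a single example (illustrative sketch).
 * feat: int8 features from the frozen backbone, with scale s_x. */
static void last_layer_sgd_step(const int8_t *feat, float s_x,
                                int label, float lr)
{
    float x[N_FEAT], logits[N_CLASS], p[N_CLASS];

    /* Dequantize the backbone features once. */
    for (int i = 0; i < N_FEAT; i++)
        x[i] = s_x * (float)feat[i];

    /* Forward pass: logits, then numerically stable softmax. */
    float max = -1e30f, sum = 0.0f;
    for (int c = 0; c < N_CLASS; c++) {
        logits[c] = B[c];
        for (int i = 0; i < N_FEAT; i++)
            logits[c] += W[c][i] * x[i];
        if (logits[c] > max) max = logits[c];
    }
    for (int c = 0; c < N_CLASS; c++) {
        p[c] = expf(logits[c] - max);
        sum += p[c];
    }

    /* Backward pass: dL/dlogit_c = softmax_c - one_hot(label). */
    for (int c = 0; c < N_CLASS; c++) {
        float g = p[c] / sum - (c == label ? 1.0f : 0.0f);
        for (int i = 0; i < N_FEAT; i++)
            W[c][i] -= lr * g * x[i];
        B[c] -= lr * g;
    }
}

After a pass over the local data, the float shadow weights and biases can be re-quantized with a per-tensor helper like quantize_int8 above, so that subsequent inference continues to run entirely in int8.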