Run-Time Adaptive Hardware Accelerator for Convolutional Neural Networks

Conference: SMACD / PRIME 2021 - International Conference on SMACD and 16th Conference on PRIME
07/19/2021 - 07/22/2021, held online

Proceedings: SMACD / PRIME 2021

Pages: 4
Language: English
Type: PDF

Authors:
Sestito, Cristian; Spagnolo, Fanny; Corsonello, Pasquale (Department of Informatics, Modeling, Electronics and System Engineering, University of Calabria, Arcavacata di Rende, Italy)
Perri, Stefania (Department of Mechanical, Energy and Management Engineering, University of Calabria, Arcavacata di Rende, Italy)

Abstract:
State-of-the-art Convolutional Neural Networks are characterized by heterogeneous convolutional layers to properly balance accuracy and computational complexity. Run-time adaptive convolution architectures able to process feature maps with kernels of various sizes and strides are highly desirable to achieve a favorable speed/power dissipation trade-off. This paper presents the design of an adaptive architecture able to efficiently manage convolutional layers with different run-time parameters. In order to guarantee high resource utilization for all the supported kernel sizes and strides, in contrast with existing competitors, the proposed design combines non-uniform basic blocks, each customized differently from the others. As a further advantage, the hardware architecture presented here efficiently manages both odd and even kernel sizes, which is useful in models that also require transposed convolutional layers. When implemented on a Xilinx XC7Z045 FPGA SoC device, the proposed engine reaches a peak throughput of 217.2 GOPS and dissipates about 2.75 W at a 150 MHz clock frequency.
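
To illustrate the run-time parameters the abstract refers to, the sketch below gives a minimal Python reference for the single-channel, valid-padding convolution that a layer with kernel size K and stride S computes. It is only a functional model of the operation the accelerator must support across different K and S values; it does not describe the paper's hardware organization, and all function and variable names are hypothetical.

    # Functional sketch only: reference convolution with run-time kernel size
    # and stride. The accelerator's basic blocks and dataflow are not modeled.
    import numpy as np

    def conv2d(ifm: np.ndarray, kernel: np.ndarray, stride: int) -> np.ndarray:
        """Valid-padding 2D convolution of one input feature map.

        ifm    : (H, W) input feature map
        kernel : (K, K) weights; K may be odd (e.g. 3, 5) or even (e.g. 2, 4)
        stride : S >= 1
        """
        H, W = ifm.shape
        K = kernel.shape[0]
        # Standard output-size formula: floor((H - K) / S) + 1
        Ho = (H - K) // stride + 1
        Wo = (W - K) // stride + 1
        ofm = np.zeros((Ho, Wo), dtype=ifm.dtype)
        for i in range(Ho):
            for j in range(Wo):
                window = ifm[i * stride:i * stride + K, j * stride:j * stride + K]
                ofm[i, j] = np.sum(window * kernel)  # multiply-accumulate per output pixel
        return ofm

    # The same routine covers, e.g., a K=3, S=1 layer and a K=5, S=2 layer,
    # which is the kind of run-time flexibility the proposed engine targets.
    fm = np.arange(64, dtype=np.float32).reshape(8, 8)
    print(conv2d(fm, np.ones((3, 3), dtype=np.float32), stride=1).shape)  # (6, 6)
    print(conv2d(fm, np.ones((5, 5), dtype=np.float32), stride=2).shape)  # (2, 2)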