Heidelberg 2022 – Scientific Programme
AKPIK: Working Group on Physics, Modern Information Technology and Artificial Intelligence
AKPIK 1: Data Integration & Processing
AKPIK 1.8: Talk
Monday, March 21, 2022, 18:00–18:15, AKPIK-H13
Structured Sparsity for CNNs on Reconfigurable Hardware — •Hendrik Borras, Günther Schindler, and Holger Fröning — Institute of Computer Engineering, Heidelberg University, Heidelberg, Germany
While Convolutional Neural Networks (CNNs) are gaining crucial importance for various applications, including modern analysis and trigger systems, their memory and compute requirements are steadily increasing. For many CNNs, these requirements pose serious challenges to achieving high inference throughput and low latency on edge devices situated close to an experiment. To improve the performance of CNNs on such resource-constrained devices, model compression through quantization and pruning has been proposed and evaluated in the past. Field Programmable Gate Arrays (FPGAs) are a prime example of low-power devices suitable for pervasive deployment, and FINN is one of the most widely used frameworks for deploying highly quantized CNN models on them. In this work, we extend FINN with two methods for column pruning, enabling further compression of CNN-based models. The two techniques differ in granularity and implementation complexity: the coarse-grained method prunes only blocks of columns, while the fine-grained method can prune individual columns. Both approaches are evaluated on the CIFAR-10 image classification task. At 50% sparsity, we demonstrate throughput improvements of 83% on average while keeping the accuracy degradation within reasonable bounds (4.2%).
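The abstract contrasts coarse-grained (block-of-columns) and fine-grained (single-column) pruning. The sketch below illustrates the general idea on a plain weight matrix using a magnitude (L2-norm) heuristic; it is a minimal, hypothetical illustration of structured column pruning, not the actual FINN implementation, and the function names and the choice of L2 scoring are assumptions.

```python
import numpy as np

def column_l2_scores(W):
    # Per-column L2 norm as an importance score (a common magnitude heuristic;
    # the heuristic used in the actual work is not stated in the abstract).
    return np.linalg.norm(W, axis=0)

def prune_columns_fine(W, sparsity):
    """Fine-grained: zero out the individual columns with the lowest scores."""
    scores = column_l2_scores(W)
    n_prune = int(round(sparsity * W.shape[1]))
    drop = np.argsort(scores)[:n_prune]          # least important columns
    mask = np.ones(W.shape[1], dtype=bool)
    mask[drop] = False
    return W * mask                               # mask broadcasts over rows

def prune_columns_coarse(W, sparsity, block):
    """Coarse-grained: zero out whole blocks of `block` adjacent columns."""
    assert W.shape[1] % block == 0, "column count must be divisible by block"
    n_blocks = W.shape[1] // block
    # Score each block by the summed L2 norms of its columns.
    block_scores = column_l2_scores(W).reshape(n_blocks, block).sum(axis=1)
    n_prune = int(round(sparsity * n_blocks))
    drop = np.argsort(block_scores)[:n_prune]
    mask = np.ones(n_blocks, dtype=bool)
    mask[drop] = False
    return W * np.repeat(mask, block)             # expand block mask to columns
```

The coarse variant trades pruning flexibility for simpler hardware mapping: removing whole blocks of columns keeps the remaining matrix structure regular, which is generally easier to exploit on an FPGA datapath than arbitrary single-column sparsity.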