An Architecture Framework for Porting Applications to FPGAs

Conference: ARCS 2014 - 27th International Conference on Architecture of Computing Systems
02/25/2014 - 02/28/2014 at Luebeck, Deutschland

Proceedings: ARCS 2014

Pages: 7Language: englishTyp: PDF

Personal VDE Members are entitled to a 10% discount on this title

Nowak, Fabian; Bromberger, Michael; Karl, Wolfgang (Karlsruhe Institute of Technology, Chair for Computer Architecture and Parallel Processing, 76128 Karlsruhe, Germany)

High-level language converters help creating FPGAbased accelerators and allow to rapidly come up with a working prototype. But the generated state machines do often not perform as optimal as hand-designed control units, and they require much area. Also, the created deep pipelines are not very efficient for small amounts of data. Our approach is an architecture framework of hand-coded building blocks (BBs). A microprogrammable control unit allows programming the BBs to perform computations in a data-flow style. We accelerate applications further by executing independent tasks in parallel on different BBs. Our microprogram implementation for the Conjugate-Gradient method on our data-driven, microprogrammable, task-parallel architecture framework on the Convey HC-1 is competitive with a 24-thread Intel Westmere system. It is 1:2x faster using only one out of four available FPGAs, thereby proving its potential for accelerating numerical applications. Moreover, we show that hardware developers can change the BBs and thereby reduce iteration count of a numerical algorithm like the Conjugate-Gradient method to less than 0:5x due to more precise operations inside the BBs, speeding up execution time 2:47x.