Accelerating Design Convergence of Automata Processing Designs with a Tiled Hierarchy
Conference: FSP Workshop 2019 - Sixth International Workshop on FPGAs for Software Programmers
09/12/2019 in Barcelona, Spain
Pages: 8
Language: English
Type: PDF
Tracy II, Tommy; Skadron, Kevin; Stan, Mircea (University of Virginia, Charlottesville, VA USA)
Wadden, Jack (University of Michigan, Ann Arbor, MI USA)
Xie, Ted (University of Virginia, Charlottesville, VA USA; now at Google, Mountain View, CA USA)
Automata processing is a parallel processing technique for evaluating massive pattern-matching queries against an input data stream; among other applications, it is a popular approach to regular expression matching. Automata processing on FPGAs achieves high performance by representing the automata spatially and distributing the input stream to all state machines, which run in parallel. Existing automata-to-FPGA tools emit a single large design with no hierarchy, in which all automata sub-components make up one flat RTL file. We found that FPGA toolchains handle such large, flat designs poorly, resulting in long synthesis times and poor Quality of Results, which limits the practical usefulness of current automata-on-FPGA approaches. We propose a technique that improves automata-to-FPGA mapping by automatically imposing a design hierarchy without changing RTL functionality, preventing the over-optimization in synthesis that leads to excessively long compiler runtimes. We evaluate the technique experimentally by implementing an accelerated Learn to Rank machine learning automata application on FPGAs in Amazon’s F1 cloud. We demonstrate successful place-and-route at full frequency of previously unsynthesizable designs and greatly reduced synthesis time (up to 50%), at a cost of a few percentage points more logic resources.