How to Speed up Embedded Multi-core Systems Using Locality Conscious Array Distribution for Loop Parallelization

Conference: ARCS 2016 - 29th International Conference on Architecture of Computing Systems
04/04/2016 - 04/07/2016 in Nürnberg, Germany

Proceedings: ARCS 2016

Pages: 5
Language: English
Type: PDF

Authors:
Hehenkamp, Niklas; Wagensveld, Remko van; Facchi, Christian; Margull, Ulrich (Technische Hochschule Ingolstadt, Esplanade 10, 85049 Ingolstadt, Germany)
Schoenwetter, Dominik; Fey, Dietmar (Chair of Computer Science 3 (Computer Architecture), Friedrich-Alexander-University, Erlangen-Nürnberg (FAU), Martensstr. 3, 91058 Erlangen, Germany)
Mader, Ralph (Continental Automotive GmbH, Siemensstr. 12, 93055 Regensburg, Germany)

Abstract:
Safety-critical embedded systems in the automotive domain will increasingly depend on parallel processing architectures. Since concurrency has so far been exploited largely at the task level, future applications will feature a growing proportion of data-level parallel implementations such as loop parallelism. Efficient memory utilization becomes indispensable for coping with the memory wall in modern computing. This work suggests an approach for data-locality-conscious memory access and data distribution in embedded multi-core systems. The results have been validated in a case study on a modern Engine Control Unit, focusing on the parallelization of widespread array-intensive loops.