How to Speed up Embedded Multi-core Systems Using Locality Conscious Array Distribution for Loop Parallelization

Conference: ARCS 2016 - 29th International Conference on Architecture of Computing Systems
4 April 2016 - 7 April 2016 in Nuremberg, Germany

Proceedings: ARCS 2016

Pages: 5
Language: English
Type: PDF

Authors:
Hehenkamp, Niklas; Wagensveld, Remko van; Facchi, Christian; Margull, Ulrich (Technische Hochschule Ingolstadt, Esplanade 10, 85049 Ingolstadt, Germany)
Schoenwetter, Dominik; Fey, Dietmar (Chair of Computer Science 3 (Computer Architecture), Friedrich-Alexander-University Erlangen-Nürnberg (FAU), Martensstr. 3, 91058 Erlangen, Germany)
Mader, Ralph (Continental Automotive GmbH, Siemensstr. 12, 93055 Regensburg, Germany)

Abstract:
Safety-critical embedded systems in the automotive domain will increasingly depend on parallel processing architectures. Because concurrency has so far been exploited largely at the task level, future applications will feature a growing proportion of data-level parallel implementations such as loop parallelism. Efficient memory utilization becomes essential for coping with the memory wall in modern computing. This work proposes an approach to data-locality-conscious memory access and data distribution in embedded multi-core systems. The results have been validated in a case study on a modern Engine Control Unit, focusing on the parallelization of widely used array-intensive loops.