Floating Point Units Efficiency in Multi-Core Processors

Conference: ARCS 2015 - 28th International Conference on Architecture of Computing Systems
03/24/2015 - 03/27/2015 at Porto, Portugal

Proceedings: ARCS 2015

Pages: 8
Language: English
Type: PDF

Authors:
Aminot, Alexandre; Lhuiller, Yves; Castagnetti, Andrea (CEA, LIST, F-91191 Gif-sur-Yvette, France)
Charles, Henri-Pierre (Univ. Grenoble Alpes, F-38000 Grenoble, France)
Charles, Henri-Pierre (CEA, LIST, Minatec campus, F-38054 Grenoble, France)

Abstract:
Symmetric multi-core processors (SMP) offer a fair level of performance/energy by exploiting instruction- and thread-level parallelism. Further speed-up can be obtained by using extensions (e.g. SIMD, FPU). However, extensions consume power and area, and need to be replicated in order to maintain ISA homogeneity. In this paper, we question how to efficiently exploit floating point units and whether every core needs to contain one. We investigate the power and performance impact of power gating and of a functionally asymmetric multi-core processor (FAMP) design, where the floating point units are implemented in only some of the cores. Without modifying the workload code, and in order to optimize energy consumption according to extension requirements, we explore three energy management systems acting at different granularities: application level (iterative compilation), scheduler event level (core-switching) and hardware level (power-gating). The experimental results on an ARM-based model show that, by acting at the application and scheduler levels, we obtain an area and energy reduction (0.1% to 20%, avg 5.4%) compared to a fully extended SMP, with negligible performance losses. For some applications with well-defined phases, a scheduler-level solution on a FAMP avoids a complex and area-costly power-gating solution applied to the extensions.