Albrecht, C.; Koch, R.; Pionteck, T. (Institute of Computer Engineering, University of Lübeck)
Glösekötter, P. (Fachbereich Elektrotechnik und Informatik, Fachhochschule Münster)
Technology advances continue to deliver higher integration densities and clock rates. However, these advances are increasingly accompanied by drawbacks such as intra-die variation, temperature dependencies, and device degradation mechanisms. Alongside the classic MIPS-per-watt metric, overall system reliability is steadily becoming more important, and device reliability has become a critical aspect of system-on-chip (SoC) design. To cope with this challenge, we divide the SoC into different architectural layers, each tailored individually to the specific needs of the SoC in terms of fault tolerance. We also derive a comprehensive method that accounts for all layer dependencies efficiently while still enabling error detection and correction mechanisms at system level. In particular, error detection is predominantly established at the lower levels, whereas the required error correction mechanisms are applied at the higher system levels.
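The split between low-level detection and high-level correction can be illustrated with a minimal sketch. It is not the method derived in this work. It assumes a single parity bit as the low-level detector and restoration from a checkpoint as the system-level corrector. The names `LowLayerRegister`, `inject_fault`, and `system_level_read` are hypothetical and chosen only for this example.

```python
def parity(word: int) -> int:
    """Return the even-parity bit of a non-negative integer word."""
    return bin(word).count("1") % 2


class LowLayerRegister:
    """Hypothetical low-level storage element: it detects errors but never corrects them."""

    def __init__(self) -> None:
        self._word = 0
        self._parity = 0

    def write(self, word: int) -> None:
        # Store the word together with its parity bit.
        self._word = word
        self._parity = parity(word)

    def inject_fault(self, bit: int) -> None:
        # Model a single bit flip in the stored word (illustration only).
        self._word ^= 1 << bit

    def read(self) -> tuple[int, bool]:
        # Return the stored word and a flag that is False on a parity mismatch.
        return self._word, parity(self._word) == self._parity


def system_level_read(reg: LowLayerRegister, checkpoint: int) -> tuple[int, bool]:
    """Hypothetical system-level handler: correct a detected error from a checkpoint.

    Returns the value and a flag telling whether a correction took place.
    """
    word, ok = reg.read()
    if ok:
        return word, False
    # The lower layer only signalled the error; the correction happens here.
    reg.write(checkpoint)
    return checkpoint, True


if __name__ == "__main__":
    reg = LowLayerRegister()
    reg.write(0b1011)
    reg.inject_fault(2)  # single bit flip, visible to the parity check
    value, corrected = system_level_read(reg, checkpoint=0b1011)
    print(f"value={value:#06b} corrected={corrected}")
```

A single parity bit detects only an odd number of bit flips, so a real design would choose the detector per layer according to its fault model. The sketch shows only the division of responsibilities: the register stays cheap because it merely flags the error, while the costlier recovery logic lives once at system level.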