A survey of fault tolerance approaches on different architecture levels

Conference: ARCS 2017 - 30th International Conference on Architecture of Computing Systems
04/03/2017 - 04/06/2017 in Vienna, Austria

Proceedings: ARCS 2017

Pages: 9
Language: English
Type: PDF

Authors:
Osinski, Lukas; Langer, Tobias; Mottok, Juergen (Laboratory of Safe and Secure Systems - LaS3, University of Applied Sciences Regensburg, Germany)

Abstract:
In recent years, development trends for computing platforms have moved toward multicore systems. Associated with this trend, feature sizes have decreased with each new hardware generation, which has led to a rise in the frequency of transient and permanent errors in memory and CPUs. In this context, researchers have presented several approaches that exploit the inherent redundancy of multicore platforms to provide fault tolerance. We discuss fault tolerance approaches based on redundancy at different architecture levels with regard to their sphere of replication, their performance, and their error detection and recovery capabilities.