Failure-Rate Analysis based on Microprocessor Trace Data

Conference: Zuverlässigkeit und Entwurf - 9. ITG/GMM/GI-Fachtagung
09/18/2017 - 09/20/2017 in Cottbus, Germany

Proceedings: Zuverlässigkeit und Entwurf

Pages: 6
Language: English
Type: PDF

Authors:
Zabel, Martin; Brinker, Matthias; Koehler, Steffen; Spallek, Rainer G. (Technische Universität Dresden, Dresden, Germany)

Abstract:
This paper evaluates how well failures can be observed in trace data from a microprocessor-based system, with the aim of assessing the failure rate due to single-event upsets. The trace data is collected during fault-injection campaigns at several granularities, such as the program flow trace alone or the program flow trace together with the load/store trace. First, the failure probability is computed from the number of traces from faulty runs that differ from the trace of the golden run. Because these counts may contain false positives and false negatives, the failure probability is then recalculated from the final result of the application. One of the three scenarios yields the real failure probability, whereas the other two give only an estimate. In particular, one scenario shows that the program counter trace alone is not sufficient for a failure-rate analysis because many failures go undetected. A good compromise between required trace bandwidth and measurement quality is to compare the program flow and load/store trace on-chip.
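
The two ways of computing the failure probability described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Run` record, the `analyse` function, the trace encoding as a tuple of integers, and the sample values are all hypothetical. The sketch further assumes that a false positive is a faulty run whose trace differs from the golden run although the final result is correct, and that a false negative is a run whose trace matches although the final result is wrong.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class Run:
    """One program execution (hypothetical record layout)."""
    trace: Tuple[int, ...]  # recorded trace entries, e.g. program counter values
    result: int             # final result of the application


def analyse(golden: Run, faulty_runs: Sequence[Run]) -> Dict[str, float]:
    """Compare every fault-injection run against the golden run."""
    n = len(faulty_runs)
    if n == 0:
        raise ValueError("at least one fault-injection run is required")

    trace_differs = [r.trace != golden.trace for r in faulty_runs]
    result_differs = [r.result != golden.result for r in faulty_runs]

    # Assumed definitions: the final result is the reference for "failure".
    false_pos = sum(t and not f for t, f in zip(trace_differs, result_differs))
    false_neg = sum(f and not t for t, f in zip(trace_differs, result_differs))

    return {
        "p_fail_trace": sum(trace_differs) / n,    # estimate from trace comparison
        "p_fail_result": sum(result_differs) / n,  # reference from final result
        "false_positives": false_pos,
        "false_negatives": false_neg,
    }


# Made-up sample data, for illustration only.
golden = Run(trace=(0x100, 0x104, 0x108), result=42)
runs = [
    Run(trace=(0x100, 0x104, 0x108), result=42),  # fault had no visible effect
    Run(trace=(0x100, 0x104, 0x10C), result=42),  # trace differs, result correct
    Run(trace=(0x100, 0x104, 0x108), result=7),   # trace matches, result wrong
    Run(trace=(0x100, 0x200, 0x204), result=7),   # trace differs, result wrong
    Run(trace=(0x100, 0x104, 0x110), result=42),  # trace differs, result correct
]
stats = analyse(golden, runs)
print(stats)
```

In this made-up sample the trace comparison flags three of five runs while only two runs end with a wrong result, which shows how the trace-based figure can deviate from the result-based one in either direction.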