Online Transient Error Detection and Recovery in Re-order Buffers of Superscalar Processors

Conference: Zuverlässigkeit und Entwurf - 4. GMM/GI/ITG-Fachtagung
09/13/2010 - 09/15/2010 at Wildbad Kreuth, Germany

Proceedings: Zuverlässigkeit und Entwurf

Pages: 8
Language: English
Type: PDF

Shazli, Syed (Northeastern University, Boston, MA, USA)
Tahoori, Mehdi (Karlsruhe Institute of Technology, Karlsruhe, Germany)

Transient errors are a major reliability barrier for modern processors. The vulnerability of processor cores to such errors grows exponentially with technology scaling. To meet reliability constraints cost-effectively, it is critical to localize the effects of these errors and prevent them from propagating to other parts of the system. In this work, we consider state-of-the-art superscalar processors that use deep pipelines and speculative execution and can issue multiple instructions per cycle. We present a methodology for low-cost detection of and recovery from transient errors occurring in the reorder buffers (ROBs) used in these processors. The technique has been implemented on a cycle-accurate architectural simulator. Using this approach, we are able to detect and recover from all single-bit upsets with less than 5% increase in cycles per instruction (CPI) and 3% area overhead for the ROB.