Three researchers earn MICRO Test of Time for groundbreaking timing speculation work

Todd Austin, David Blaauw, Trevor Mudge, and a group of alumni were recognized by ACM MICRO for their landmark 2003 paper.

A group of CSE and ECE researchers has been recognized with a MICRO Test of Time Award for their landmark 2003 paper, “Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation.” Co-authored by Profs. Todd Austin, David Blaauw, Trevor Mudge, their students, and ARM researchers, the paper introduced the concept of timing speculation, which allows a system to reduce power by lowering its supply voltage until its circuits start to fail. This award recognizes the most influential papers published in prior sessions of the ACM International Symposium on Microarchitecture.

At the time of the project, increasing clock frequencies and silicon integration were making power-aware computing a critical concern in the design of embedded processors and systems-on-chip. Designing a system that could run effectively at multiple voltage levels was a challenge: it required identifying the critical voltage thresholds below which the processor could no longer execute programs correctly. The problem was compounded by environmental and process-related variations that affect circuit performance.

Voltage scaling techniques of the time handled this variability poorly, and proved to be extremely conservative as a result. Common options operated under worst-case scenario assumptions and reduced power usage by far less than was required.

With Razor, the authors proposed an alternative scaling technique based on dynamic detection and correction of circuit timing errors. Razor’s key idea was to tune the supply voltage by monitoring the error rate during circuit operation. Since this error detection provided a way to monitor the actual limit of voltage reduction, it automatically accounted for fabrication variability among different chips and for differences in their operating environments, such as temperature. It therefore eliminated the voltage margins that earlier designs needed to guarantee “always-correct” circuit operation.
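
As a rough illustration of that feedback idea, the Python sketch below nudges a supply voltage down while errors are rare and back up when they are not. It is a toy model under stated assumptions, not the controller described in the paper: the linear error model, the 1% target error rate, the step size, and the names toy_error_rate and tune_voltage are all hypothetical.

```python
def toy_error_rate(voltage, critical_voltage):
    """Hypothetical error model: no timing errors at or above the chip's
    critical voltage, and a rate that grows linearly below it."""
    shortfall = max(0.0, critical_voltage - voltage)
    return min(1.0, 0.5 * shortfall)


def tune_voltage(critical_voltage, start_voltage=1.8,
                 target_error_rate=0.01, step=0.01, iterations=500):
    """Lower the supply while errors are rare; raise it when they are not.

    All parameter values are illustrative assumptions.
    """
    voltage = start_voltage
    for _ in range(iterations):
        observed = toy_error_rate(voltage, critical_voltage)
        if observed > target_error_rate:
            voltage += step  # too many errors: back off
        else:
            voltage -= step  # errors are rare: keep saving power
    return voltage


if __name__ == "__main__":
    # Two simulated chips with different (made-up) critical voltages.
    for critical in (1.5, 1.4):
        settled = tune_voltage(critical)
        print(f"critical {critical:.2f} V -> settles near {settled:.2f} V")
```

Because the loop reacts only to the observed error rate, each simulated chip settles just below its own critical voltage without that value being supplied as a fixed design margin.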

In addition, a key feature of Razor was that operation at sub-critical supply voltages did not constitute a catastrophic failure. Instead, these voltages represented a trade-off between the power penalty incurred from error correction and the additional power savings obtained from operating at a lower supply voltage, and the microarchitecture itself could be used to recover from the timing failures.
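
That trade-off can be illustrated with a small Python sketch. It assumes that switching energy scales with the square of the supply voltage and that each timing error costs a fixed number of extra cycles to correct. The critical voltage, the recovery penalty, the cubic error curve, and the sweep range are hypothetical values chosen for illustration, not figures from the paper.

```python
CRITICAL_VOLTAGE = 1.5   # hypothetical: errors begin below this supply voltage
RECOVERY_PENALTY = 10.0  # hypothetical: extra cycles of work per timing error


def error_rate(voltage):
    """Made-up error curve: zero at or above the critical voltage,
    rising steeply (cubically) as the supply drops below it."""
    shortfall = max(0.0, CRITICAL_VOLTAGE - voltage)
    return (shortfall / 0.3) ** 3


def relative_energy(voltage):
    """Energy per instruction in arbitrary units: a V^2 switching term
    scaled up by the expected cost of error recovery."""
    return voltage ** 2 * (1.0 + RECOVERY_PENALTY * error_rate(voltage))


def best_voltage(low=1.20, high=1.80, step=0.01):
    """Sweep a limited, illustrative voltage range and return the
    voltage with the lowest modeled energy."""
    count = int(round((high - low) / step)) + 1
    candidates = [low + i * step for i in range(count)]
    return min(candidates, key=relative_energy)


if __name__ == "__main__":
    v = best_voltage()
    print(f"lowest modeled energy at {v:.2f} V")
    print(f"energy there: {relative_energy(v):.3f}")
    print(f"energy at the critical voltage: {relative_energy(CRITICAL_VOLTAGE):.3f}")
```

In this toy model the lowest-energy point falls slightly below the critical voltage: a small error rate costs less in recovery than the lower supply saves, until errors become frequent enough to erase the gain.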

The techniques proposed in their paper have allowed processors to operate at a voltage lower than the worst-case design and combat the increasing design guard bands.

“A key factor in the success of the Razor approach is that it made the idea to scale down the voltage till you start to fail a practical, feasible concept by performing the error detection and correction in a highly efficient manner that adds far less overhead than what Razor gains from removing margins,” Blaauw says.

“This project really solidified my ‘rule-breaking’ approach to research,” Austin says. “Step one, find a rule no one breaks; step two, break that rule; step three, find out why no one ever breaks that rule – or, discover something amazing!”

The MICRO selection panel cites the paper’s impact on academic and industrial research, both “spurred by the dramatic energy efficiency made possible by Razor.” A number of companies launched detailed research and development efforts into dynamic error detection and correction for resilient systems and also published numerous follow-up product papers based on Razor.

“This foundational work is exemplary of the high impact possible from architecture and circuit cross-cutting research,” the committee wrote.

The paper was co-authored by: alumni Dan Ernst, Nam Sung Kim, Shidhartha Das, Sanjay Pant, Rajeev Rao, Toan Pham, and Conrad Ziesler; Profs. David Blaauw, Todd Austin, and Trevor Mudge; and ARM researcher and alum Krisztian Flautner.