We tested the error-detection capabilities of DVMC
by injecting errors into all components related to the
memory system: the load/store queue (LSQ), write
buffer, caches, interconnect switches and links, and
memory and cache controllers. The injected errors
included data and address bit flips; dropped, reordered,
mis-routed, and duplicated messages; and reorderings
and incorrect forwarding in the LSQ and write buffer.
For each test, an error time, error type, and error location were chosen at random for injection into a running
benchmark. After the error was injected, the simulation continued until the error was detected. Since errors become
non-recoverable once the last checkpoint taken before
the error expires, we also checked that a valid checkpoint was still available at the time of detection. We conducted these experiments for all four supported
consistency models with both the directory and snooping systems. DVMC detected all injected errors well within the SafetyNet recovery time frame of about 100k
processor cycles.
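
As an illustration only, the injection procedure described above can be sketched as a small Python harness. This is not the simulator code used in the experiments: the names `Injection`, `pick_injection`, `is_recoverable`, `run_campaign`, and the `simulate` hook are hypothetical, and the recoverability check is simplified to a comparison of detection latency against a fixed window of 100,000 cycles rather than a model of actual checkpoint lifetimes. The component and error lists follow the text; allowing bit flips and message errors at every location is a simplifying assumption.

```python
import random
from dataclasses import dataclass

# Locations and error types named in the text (identifiers are illustrative).
LOCATIONS = ["lsq", "write_buffer", "cache", "switch", "link",
             "memory_controller", "cache_controller"]
ERROR_TYPES = ["data_bit_flip", "address_bit_flip", "dropped_message",
               "reordered_message", "misrouted_message", "duplicated_message",
               "reordering", "incorrect_forwarding"]

# The text places reorderings and incorrect forwarding in the LSQ and write buffer.
LSQ_ONLY_TYPES = {"reordering", "incorrect_forwarding"}
LSQ_LIKE = {"lsq", "write_buffer"}

RECOVERY_WINDOW = 100_000  # cycles; approximate figure quoted in the text


@dataclass(frozen=True)
class Injection:
    cycle: int
    error_type: str
    location: str


def applicable_types(location):
    """Error types allowed at a location (assumption: others apply anywhere)."""
    if location in LSQ_LIKE:
        return list(ERROR_TYPES)
    return [t for t in ERROR_TYPES if t not in LSQ_ONLY_TYPES]


def pick_injection(rng, max_cycle):
    """Choose an error time, location, and type at random."""
    location = rng.choice(LOCATIONS)
    return Injection(
        cycle=rng.randrange(max_cycle),
        error_type=rng.choice(applicable_types(location)),
        location=location,
    )


def is_recoverable(injection_cycle, detection_cycle, window=RECOVERY_WINDOW):
    """Simplified check: detection must fall within the window after injection."""
    latency = detection_cycle - injection_cycle
    return 0 <= latency <= window


def run_campaign(simulate, trials, max_cycle, seed=0):
    """Run `trials` injections.

    `simulate(injection)` is a hypothetical hook standing in for the
    simulator; it returns the detection cycle, or None if undetected.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        inj = pick_injection(rng, max_cycle)
        detected_at = simulate(inj)
        ok = detected_at is not None and is_recoverable(inj.cycle, detected_at)
        results.append((inj, detected_at, ok))
    return results
```

A test passes only if the error is both detected and still recoverable, which mirrors the two conditions checked in the experiments. In a real setup the `simulate` hook would drive the full-system simulator and the recoverability check would query the checkpoint state directly.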