Figure 5 shows CWV integrated into write-back L1 data and L2 caches, both of which are protected by word-granularity ECC. Since the comparison that identifies unmodified bytes requires reading the existing data from the L1 data cache, we leverage an atomic cache operation called read-modify-write (RMW) [14]. In ECC-protected caches, every write whose data is smaller than the ECC word must be preceded by a read so that the ECC bits can be generated. For example, on a byte write, the containing word is read first and merged with the byte, and ECC bits are then generated for the merged word. For such sub-word writes, we reuse this read at no additional cost. For a full-word write, which does not otherwise require RMW, we assume that the read incurs an energy overhead. However, the read adds no extra latency, because it can be overlapped with the tag matching of the write operation, after which the matching data way is written on a hit.
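The RMW sequence described above can be illustrated with a small software model. This is a minimal sketch, not the hardware design: the 4-byte word size, the names `rmw_write` and `toy_check_bits`, and the XOR-based check bits (a stand-in for a real ECC encoder) are all assumptions made for illustration. It shows how the old word read for ECC generation can also be compared against the merged word to find unmodified bytes.

```python
WORD_BYTES = 4  # assumed ECC word size; the text only says "word granularity"


def toy_check_bits(word):
    """Placeholder for a real ECC encoder: XOR of all bytes in the word."""
    check = 0
    for b in word:
        check ^= b
    return check


def rmw_write(line, ecc, offset, data):
    """Model a sub-word write into an ECC-protected cache line.

    line:   bytearray holding the cache line data
    ecc:    list with one check value per word of the line
    offset: byte offset of the write within the line
    data:   bytes to write (must fit inside a single word in this sketch)

    Returns one flag per byte of the touched word: True if the byte changed.
    """
    word_idx = offset // WORD_BYTES
    base = word_idx * WORD_BYTES
    if offset + len(data) > base + WORD_BYTES:
        raise ValueError("sketch only handles writes within one word")

    # Read: fetch the old word (needed anyway to regenerate ECC).
    old_word = bytes(line[base:base + WORD_BYTES])

    # Modify: merge the write data into a copy of the old word.
    new_word = bytearray(old_word)
    start = offset - base
    new_word[start:start + len(data)] = data

    # Compare old and merged words to identify unmodified bytes.
    modified = [o != n for o, n in zip(old_word, new_word)]

    # Write: store the merged word and its regenerated check bits.
    line[base:base + WORD_BYTES] = new_word
    ecc[word_idx] = toy_check_bits(new_word)
    return modified
```

In this model, writing a byte whose value equals the stored byte yields all-False flags, which is the case where the comparison detects that nothing in the word was actually modified.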