6.1.1 Maintaining the High Water Mark
To maintain the HWM, we designate one site the
timestamp authority (TA) with the responsibility of
allocating timestamps to other sites. The idea is to divide
time into a number of epochs; we define the epoch number
to be the number of epochs that have elapsed since the
beginning of time. We anticipate epochs being relatively
long – e.g., many seconds each, but the exact duration
may vary from deployment to deployment. We define the
initial HWM to be epoch 0 and start current epoch at 1.
Periodically, the TA decides to move the system to the
next epoch; it sends an end of epoch message to each site,
each of which increments current epoch from e to e+1,
thus causing new transactions that arrive to be run with a
timestamp e+1. Each site waits for all the transactions
that began in epoch e (or an earlier epoch) to complete and
then sends an epoch complete message to the TA. Once
the TA has received epoch complete messages from all
sites for epoch e, it sets the HWM to be e, and sends this
value to each site. Figure 3 illustrates this process.
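
The epoch handshake described above can be sketched as a small in-memory simulation. This is only an illustrative sketch, not the system's implementation: the class names `Site` and `TimestampAuthority`, their methods, and the use of direct method calls in place of network messages are all assumptions made for the example.

```python
class Site:
    """Hypothetical site; tracks its current epoch and running transactions."""

    def __init__(self, name):
        self.name = name
        self.current_epoch = 1   # current epoch starts at 1
        self.hwm = 0             # initial HWM is epoch 0
        self.active = {}         # transaction id -> epoch in which it began

    def begin(self, txn_id):
        # New transactions run with the site's current epoch as timestamp.
        self.active[txn_id] = self.current_epoch
        return self.current_epoch

    def finish(self, txn_id):
        del self.active[txn_id]

    def end_of_epoch(self):
        # Handle the "end of epoch" message: move from e to e+1, report e.
        e = self.current_epoch
        self.current_epoch = e + 1
        return e

    def epoch_complete(self, e):
        # True once every transaction begun in epoch e or earlier is done.
        return all(started > e for started in self.active.values())


class TimestampAuthority:
    """Hypothetical TA; direct calls stand in for messages (an assumption)."""

    def __init__(self, sites):
        self.sites = sites
        self.hwm = 0
        self.closing = None      # epoch currently being closed, if any

    def start_epoch_change(self):
        closed = {site.end_of_epoch() for site in self.sites}
        assert len(closed) == 1  # in this sketch all sites agree on e
        self.closing = closed.pop()
        return self.closing

    def try_raise_hwm(self):
        # Raise the HWM to e only after all sites report epoch e complete.
        if self.closing is not None and all(
            site.epoch_complete(self.closing) for site in self.sites
        ):
            self.hwm = self.closing
            for site in self.sites:
                site.hwm = self.hwm   # broadcast the new HWM
            self.closing = None
        return self.hwm
```

In this sketch, a transaction that began in epoch 1 and is still running keeps `try_raise_hwm()` at 0 even after the sites have moved to epoch 2; once it finishes, the next call raises the HWM to 1.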
After the TA has broadcast the new HWM with value
e, read-only transactions can begin reading data from
epoch e or earlier and be assured that this data has been
committed. To allow users to refer to a particular real-world
time when their query should start, we maintain a
table mapping epoch numbers to times, and start the query
as of the epoch nearest to the user-specified time.
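
A minimal sketch of the epoch lookup follows. The table layout (a list of `(epoch, time)` pairs with increasing times, times given as seconds), the function name `nearest_epoch`, the restriction to epochs at or below the HWM, and the tie-break toward the earlier epoch are all assumptions made for illustration; the text specifies only that the nearest epoch is chosen.

```python
from bisect import bisect_left


def nearest_epoch(table, when, hwm):
    """Return the epoch whose recorded time is closest to `when`.

    table: list of (epoch, time) pairs, sorted by epoch, times increasing.
    hwm:   only epochs <= hwm are considered (assumption: a read-only
           query should see only data known to be committed).
    """
    readable = [(t, e) for e, t in table if e <= hwm]
    if not readable:
        raise ValueError("no committed epoch available")
    times = [t for t, _ in readable]
    i = bisect_left(times, when)
    # Only the entries just before and at the insertion point can be nearest.
    candidates = readable[max(i - 1, 0):i + 1]
    # min() keeps the first of equally distant entries, i.e. the earlier epoch.
    return min(candidates, key=lambda te: abs(te[0] - when))[1]
```

For example, with epochs 1 through 4 recorded at times 0, 10, 20, and 30 and an HWM of 3, a request for time 16 resolves to epoch 3, and a request for time 100 also resolves to epoch 3 because epoch 4 is not yet below the HWM.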
To prevent epoch numbers from growing without bound
and consuming extra space, we plan to “reclaim” epochs
that are no longer needed. We will do this by “wrapping”
timestamps, allowing us to reuse old epoch numbers as in