Recovery in a distributed DBMS is more complicated than in a centralized DBMS for the following reasons:
New kinds of failure can arise, namely, failure of communication links and failure of a remote site at which a subtransaction is executing. Either all subtransactions of a given transaction must commit, or none must commit, and this property must be guaranteed despite any combination of site and link failures. This guarantee is achieved using a commit protocol.
As in a centralized DBMS, certain actions are carried out as part of normal execution in order to provide the necessary information to recover from failures. A log is maintained at each site, and in addition to the kinds of information maintained in a centralized DBMS, actions taken as part of the commit protocol are also logged. The most widely used commit protocol is called Two-Phase Commit (2PC).
Normal Execution and Commit Protocols
During normal execution, each
subtransaction are logged at the site where it executes.The transaction manager at the site where the transaction originated is called the coordinator for the transaction; transaction managers at sites where its subtransactions execute are called subordinates (with respect to the coordination of this transaction).
The Two-Phase Commit (2PC) protocol, in terms of the messages exchanged and the log records written. When the user decides to commit a transaction, the commit command is sent to the coordinator for the transaction. This initiates the 2PC protocol:
1. The coordinator sends a prepare message to each subordinate.
2. When a subordinate receives a prepare message, it decides whether to abort or commit its subtransaction. It force-writes an abort or prepare log record, and then sends a no or yes message to the coordinator.
3. If the coordinator receives yes messages from all subordinates, it force-writes a commit log record and then sends a commit message to all subordinates. If it receives even one no message, or does not receive any response from some subordinate for a specified time-out interval, it force-writes an abort log record, and then sends an abort message to all subordinates.
4. When a subordinate receives an abort message, it force-writes an abort log record, sends an ack message to the coordinator, and aborts the subtransaction. When a subordinate receives a commit message, it force-writes a commit log record, sends an ack
subtransaction.
5. After the coordinator has received ack messages from all subordinates, it writes an end log record for the transaction.
The name Two-Phase Commit reflects the fact that two rounds of messages are exchanged:
First a voting phase, then a termination phase, both initiated by the coordinator. The basic principle is that any of the transaction managers involved (including the coordinator) can unilaterally abort a transaction, whereas there must be unanimity to commit a transaction. When a message is sent in 2PC, it signals a decision by the sender. In order to ensure that this decision survives a crash at the sender's site, the log record describing the decision is always forced to stable storage before the message is sent.
A transaction is officially committed at the time the coordinator's commit log record reaches stable storage. Subsequent failures cannot aect the outcome of the transaction; it is irrevocably committed. Log records written to record the commit protocol actions contain the type of the record, the transaction id, and the identity of the coordinator. A coordinator's commit or abort log record also contains the identities of the subordinates.
Restart after a Failure
When a site comes back up after a crash, we invoke a recovery process that reads the log and processes all transactions that were executing the commit protocol at the time of the crash. The transaction manager at this site could have been the coordinator for some of these transactions
Following is the recovery process:
If we have a commit or abort log record for transaction T, its status is clear; we redo or undo T, respectively. If this site is the coordinator, which can be determined from the commit or abort log record, we must periodically resend| because there may be other link or site failures in the system|a commit or abort message to each subordinate until we receive an ack. After we have received acks from all subordinates, we write an end log record for T.
If we have a prepare log record for T but no commit or abort log record, this site is a subordinate, and the coordinator can be determined from the prepare record. We
determine the status of T. Once the coordinator responds with either commit or abort, we write a corresponding log record, redo or undo the transaction, and then write an end log record for T.
If we have no prepare, commit, or abort log record for transaction T, T certainly could not have voted to commit before the crash; so we can unilaterally abort and undo T and write an end log record. In this case we have no way to determine whether the current site is the coordinator or a subordinate for T. However, if this site is the coordinator, it might have sent a prepare message prior to the crash, and if so, other sites may have voted yes. If such a subordinate site contacts the recovery process at the current site, we now know that the current site is the coordinator for T, and given that there is no commit or abort log record, the response to the subordinate should be to abort T.