Você está na página 1de 28

Oracle & Distributed Databases

Unit 8

Unit 8
Structure: 8.1

Distributed Transaction Management

Introduction. Objectives

8.2

Frame work for transaction management 8.2.1 8.2.2 8.2.3 Transactions properties Transaction Management Goals Distributed Transactions

8.3

Atomicity of Distributed Transactions 8.3.1 8.3.2 Recovery in Centralized Systems Problems with respect to Communication in Distributed databases 8.3.3 8.3.4 Recovery of Distributed Transactions The Transaction control

Self Assessment Questions 8.4 Summary 8.5 Terminal Questions 8.6 Answers to Self Assessment Questions

8.1 Introduction
The management of distributed transaction means dealing with interrelated problem like reliability, concurrency control and the efficient utilization of the resources of the complete system. In this unit we have considered the wellknown protocols like 2-Phase commit protocol for recovery and 2-Phase locking for concurrency control. All the aspects are discussed under different sections as given in the above structure.

Sikkim Manipal University

Page No. 147

Oracle & Distributed Databases

Unit 8

Objectives: At the end of Unit 8 learners are able to: Describe the different problems in managing the distributed transactions. Discuss about recovery of distributed transactions. Explain a popular algorithm called 2 Phase Commit Protocol

8.2 Framework for Transaction Management


In this case we define the properties of transactions, state the goals of distributed transaction management and describe architecture of distributed transaction. 8.2.1 Transaction's Properties The Transaction is an application or part of application that is characterized by the following properties. Atomicity: Either all or none of the transactions operations are performed. It requires that if a transaction is interrupted by a failure its partial results are not at all taken into consideration and the whole operation has to be repeated. The two types of problems that do not allow the transaction to complete are: o Transaction aborts: This may be requested by the transaction itself as some of its inputs are wrong or it has been estimated that the results produced may become useless. It also may be forced by the system for its own reason. The activity of ensuring atomicity in the presence of Transaction aborts is called Transaction recovery. o System Crashes: It is because of some catastrophic effects that crash the system without any prior knowledge. The activity of ensuring atomicity in the presence of system crashes is called crash recovery.

Sikkim Manipal University

Page No. 148

Oracle & Distributed Databases

Unit 8

The completion of transaction is called Commit. The primitives that can be used for carrying out the transaction are: Begin _Transaction Begin _Transaction Begin _Transaction

Commit

Abort

System Forces Abort

Durability: Once a transaction is committed, the system must guarantee that the results of operations will never be lost, independent of subsequent failures. The activity of providing Durability of the transaction is called Database recovery. Serializability: If many transactions execute concurrently, the result must be same as if they were executed serially in the same order. The activity of providing Serializability of the transaction is called Concurrency control. Isolation: This property states that an incomplete transaction cannot disclose its result to other transactions until it is committed. This property has to be strictly followed to avoid a problem called Cascading Aborts (Domino Effect). According to this all the transactions that have observed the partial results have to be aborted. 8.2.2 Transaction Management Goals The goal of the transaction management in a Distributed database is to control the execution of transactions so that: 1. Transactions have atomicity, durability, serializability, and isolation properties. 2. Their cost in terms of main memory, CPU, and number of transmitted control messages and their response time are minimized. 3. The availability of the system is maximized.

Sikkim Manipal University

Page No. 149

Oracle & Distributed Databases

Unit 8

CPU and main memory utilization: It is a common aspect in both centralized and distributed database. In case of concurrent transactions the CPU and main memory should be properly scheduled and managed by the operating system. Otherwise it becomes a bottleneck when the number concurrent transactions are more. Control messages and their Response time: As the control message does not carry any fruitful data and only they are used to control the execution of transactions, there should be very less exchange of such messages between the sites. The obvious reason is the communication cost will be increased unnecessarily. Another important aspect is the response time of each individual transaction. This should be as small as possible for the better performance of the system. Definitely it will be very crucial as in distributed system an additional time is required for communication between different sites. After studying the goals of the transaction management in detail, in the coming section we will suggest an appropriate model for Distributed transaction. 8.2.3 Distributed Transactions A transaction is a part of the application. One application issues a begin_transaction primitive; from this point onwards, all actions which are performed by the application, until a commit or abort primitive is issued are to be considered as one compete transaction. Let us study some related terminologies of this model. Agents: An agent is a local process, which performs several functions on behalf of an application. In order to cooperate in the execution of global operation required by the application the agents have to communicate. As they are resident at different sites, the communication between the agents is performed through messages.
Sikkim Manipal University Page No. 150

Oracle & Distributed Databases

Unit 8

Root agent: There exists a root agent, which starts the whole transaction, so that when the user requests the execution of an application, the root agent is started; the site of the root agent is called Site of Origin of the transaction.

The root agent has the responsibility of issuing the begin_transaction, commit and abort primitives.

Only the root agent can request the creation of a new agent.

Self Assessment Questions 8.2 1. The activity of ensuring atomicity in the presence of Transaction aborts is called . 2. The completion of transaction is called . 3. The activity of providing Durability of the transaction is called .

8.3 Atomicity of Distributed Transactions


In this section we are trying to experience the concept of atomicity, which is a required property for distributed transactions. In the first section we discuss recovery techniques in centralized system. Next we shall deal the possible communication failures in distributed transactions. After this let us concentrate on recovery procedures followed in distributed transactions. Finally the section talks about a distributed transaction algorithm called 2Phase commit Protocol, which actually take care all the properties expected by distributed transactions. 8.3.1 Recovery in Centralized Systems The issue of recovery mechanism is important in the case of revoking the system to perform normal database operations after a failure. Thus, before discussing about recovery, let us first analyze about the different kinds of failure that can occur in a centralized database.

Sikkim Manipal University

Page No. 151

Oracle & Distributed Databases

Unit 8

Failures in Centralized Systems: Failures are classified as follows: Failures without loss of information: In these failures, all the information stored in memory, is available for the recovery. The example for this type of failures is the abort of transactions because an error condition is discovered, like overflow. Failures with loss of volatile storage: Here the content of memory is lost. Of course the information recorded in the disks is not affected. Examples are system crashes. Failures with loss of Non volatile storage: such failures are called as Media Failures Here the contents of disks are also lost. Typical failures are Head crashes. The probabilities of such failures are very less compare to the other two types. Failures with loss of stable storage: In this type, some information stored in stable storage is lost because of many, simultaneous failure of the storage disks. Any way this probability cannot be reduced to zero. Logs: This is a basic technique to handle transactions in presence of failures. A Log contains information for undoing or redoing all the actions performed by the transactions. i.e To undo the action means, to cancel the performed operation and restore back the result of just previous operation. The necessity of undoing the actions of a transaction which fails before the commitment is that, if the commitment is not possible the database must remain the same as if the transaction were not executed at all; hence partial actions must be undone. To redo the action means, to perform again the action. To know the necessity of this we have to take the case of failure with the loss of volatile storage such that, the complete backup in stable storage of

Sikkim Manipal University

Page No. 152

Oracle & Distributed Databases

Unit 8

already committed operation has not been taken. In this case redoing of the actions of committed transactions is a must to record the database. A log record contains the required information for redoing or undoing actions. Whenever a transaction is performed on the database, a log record is written in the log file. The log record includes: The identifier of the transaction The identifier of the record The type of action The old record value The new record value Auxiliary information for the recovery procedure like pointer to the previous log record of the same information. Also when a transaction is started, committed, or aborted, a

begin_transaction, commit, or abort record is written in the log. The Log Write- Ahead Protocol: The writing of a database and writing the corresponding log record are two separate operations. In case of a failure, if the database update were performed before writing the log record, the recovery procedure would be unable of undoing the update, as the corresponding log record would not be available. In order to overcome this, the log write-ahead protocol is used. This consists of two rules: At least the undo portion of the log record must have already been recorded on stable storage before performing a database update of the corresponding log record. All log records of the transaction must have already been recorded on stable storage before committing a transaction. Recovery Procedures: We have already seen the various possibilities of failures in centralized systems and the importance of log record in the recovery procedure. Now let us see how recovery of database is done and
Sikkim Manipal University Page No. 153

Oracle & Distributed Databases

Unit 8

the different steps to be followed. The recovery procedure reads the log record and performs the following operations if the failure is due to the loss of volatile storage. o Step1: Determine all non-committed transactions that have to be undone. This can be recognized easily as they have a begin_transaction record in the log file, without having a commit or abort record. o Step2: Determine all transactions that have to be redone. i.e all the transactions that have a commit record in the log file. In order to differentiate the transactions that have to be redone from the one which are safely recorded in the stable storage, Checkpoints are used. (Checkpoints are the operations that are periodically performed to simplify the step1 and step2). o Step3: Undo the transactions, which are determined in the step1 and redo the transactions that are determined in step2. The checkpoint requires the following operations to be performed: o o All log records and the database updates which are still in the volatile storage has to be recorded in the stable storage Writing to a stable storage a checkpoint record. A checkpoint record in the log contains the indication of transactions that are active at the time when the checkpoint is done (Active transaction is one begin _transaction belongs to the log but not a commit or abort record). The usage of checkpoints modifies step1 and step2 of the recovery procedure as follows o o Find and read the last checkpoint record. Keep all transactions written in the checkpoint record into the undo set, which contains the transactions to be undone. The redo set, which contains the transactions to be redone, is initially empty.

Sikkim Manipal University

Page No. 154

Oracle & Distributed Databases

Unit 8

Read the log file starting from the checkpoint record until the end. If a begin_transaction is found, put the corresponding transaction in transaction from the undo set to the redo set. the undo set. If a commit record is found, move the corresponding

From the above discussion we can say that only the latest portion of the log must be kept online, where as the remaining part can be kept in the stable storage. So far only we have seen the failure of volatile storage. Let us see the failures with the loss of stable storage. This can be studied by considering two possibilities. Failures in which database information is lost, but logs are safe In this case, performing redo of all committed transactions using the log does the recovery. Taking the database to a Dump, which is an image of previous state, which was stored on tape storage, is done before redoing the transactions (Of course it is a lengthy process). Here the log information itself is lost This is a catastrophic event where complete recovery is any way not possible. In this case, it is reestablished to a recent available state, by resetting the database to the last dump and using the log that is not damaged. The principles that we have seen will be sufficient to understand the recovery procedures of distributed databases. Let us now see in detail the recovery in distributed database. 8.3.2 Problems with respect to Communication in the Distributed databases As usual recovery mechanisms for distributed transactions require knowing the communication failures between the sites in distributed system environment. Let us assume one communication model and according to this let us estimate the different possibilities of errors. When a message is
Sikkim Manipal University Page No. 155

Oracle & Distributed Databases

Unit 8

transferred from A to B, we require the following from a communication network: i. A receives a positive acknowledgement after a delay that is less than some maximum value MAX. ii. The message is delivered at B in proper sequence with respect to other A B messages. iii. The message is error free.

There are possibilities that these specifications are really not met because of different types of errors that may occur. The different possibilities are, missing of acknowledgements, late arrival of acknowledgements etc. These can be very easily eliminated by adding advanced design features so that we can assume that: o o Ones the message is delivered at B, then the message is error free and is in sequence with to other received messages. If A receives an acknowledgement, then the message has been delivered. The two possible communication errors are: Lost messages and Network Partitions. If the acknowledgement for a message has not received within some predefined interval called timeout, then the source has to look only for the above errors and try to take the corresponding steps. Multiple Failures and K-resiliency: Failures do not occur one at a time. A system which can tolerate K failures is called K-resilient. In distributed databases, this concept is applied to site failures and/or partitions. With respect to site failures, an algorithm is said to be K- resilient if it works properly even if K sites are down. An extreme case of failure is called Total Failure, where all sites are down.

Sikkim Manipal University

Page No. 156

Oracle & Distributed Databases

Unit 8

8.3.3 Recovery of Distributed Transactions Now let us consider recovery problems in distributed databases. For this purpose, let us assume that at each site a Local Transaction Manager is available. Each agent can issue begin_transaction, commit, and abort primitives to its LTM. After having issued a begin_transaction to its LTM, an agent will possess the properties of a local transaction. We will call an agent that has issued a begin_transaction primitive to its local transaction manager a Sub-transaction. Also to distinguish the begin_transaction, commit, and abort primitives of the distributed transaction from the local primitives issued by each agent to its LTM, we will call the later as local_begin, local_commit, and local_abort. For building a Distributed Transaction Manager (DTM), the following properties are expected from the LTM: Ensuring the atomicity of sub-transaction Writing some records on stable storage on behalf of the distributed transaction manager We need the second requirement, as some additional information must also be recorded in such away that they can be recovered in case of failure. In order to make sure that either all actions of a distributed transaction are performed or none is performed at all, two conditions are necessary: At each site either all actions are performed or none is performed All sites must take the same decision with respect to the commitment or abort of sub transaction.

Sikkim Manipal University

Page No. 157

Oracle & Distributed Databases

Unit 8

Fig. 8.1 shows a reference model of Distributed transaction recovery.

ROOT AGENT

Messages AGENT

Messages AGENT Distribution Transaction

DTMAGENT

Messages

DTMAGENT

Messages

DTMAGENT

Distribution Transaction Manager (DTM)

LTM at Site i

LTM at Site j

LTM at Site i

Local Transaction Manager (LTM)

Interface 1 : Local_begin, Local_commit, Local_Abort, Local_Create Interface 2 : Begin_Transaction, Commit, Abort, Create

Figure 8.1 : A reference model of distributed transaction recovery

Begin_transaction: When it is issued by the root agent, DTM will have to issue a local_begin primitive to the LTM at the site of origin and all the sites at which there are already active agents of the same application, thus transforming all agents into sub-transactions; from this time on the activation of a new agent by the same distributed transaction requires that the local_begin be issued to the LTM where the agent is activated, so that the new agent is created as a Sub-transaction. The example of FUND TRANSFER is taken for explaining this concept. The fig 8.1 explains this.
Sikkim Manipal University Page No. 158

Oracle & Distributed Databases

Unit 8

Abort: When an abort is issued by the root agent, all existing subtransactions must be aborted. Issuing local_aborts to the LTMs at all sites where there is an active sub transaction performs this. Commit: The implementation of the commit primitive is the most difficult and expensive. The main difficulty originates from the fact that the correct commitment of a distributed transaction requires that al sub-transactions commit locally even if there are failures. In order to implement this primitive for a distributed transaction, the general idea of 2-Phase commit Protocol has been developed. It is discussed in detail in the next section. 8.3.4 The Transaction Control Statements The following list describes transaction control statements supported:
COMMIT ROLLBACK (ABORT) SAVEPOINT

Session Trees for Distributed Transactions As the statements in a distributed transaction are issued, the concept of Distributed Transaction defines a session tree of all nodes participating in the transaction. A session tree is a hierarchical model that describes the relationships among sessions and their roles. Fig 8.2 illustrates a session tree. All nodes participating in the session tree of a distributed transaction assume one or more of the following roles:
Role Client Database server Global coordinator Local coordinator Commit point site Description A node that references information in a database belonging to a different node. A node that receives a request for information from another node. The node that originates the distributed transaction. A node that is forced to reference data on other nodes to complete its part of the transaction. The node that commits or rolls back the transaction as instructed by the global coordinator.

Sikkim Manipal University

Page No. 159

Oracle & Distributed Databases

Unit 8

Fig. 8.2: Example of a Session Tree

The role a node plays in a distributed transaction is determined by:


Whether the transaction is local or remote The commit point strength of the node Whether all requested data is available at a node, or whether other nodes need to be referenced to complete the transaction

Whether the node is read-only

Clients: A node acts as a client when it references information from another node's database. The referenced node is a database server. Database Servers: A database server is a node that hosts a database from which a client requests data. In Fig 8.2 an application at the sales node initiates a distributed transaction that accesses data from the warehouse and finance nodes. Therefore, sales.acme.com has the role of client node, and warehouse and finance are both database servers. In this example, sales be a database server and a client because the application also modifies data in the sales database. Local Coordinators: A node that must reference data on other nodes to complete its part in the distributed transaction is called a local coordinator.
Sikkim Manipal University Page No. 160

Oracle & Distributed Databases

Unit 8

In, Fig 9.2 sales are a local coordinator because it coordinates the nodes it directly references: warehouse and finance. The node sales also happen to be the global coordinator because it coordinates all the nodes involved in the transaction. A local coordinator is responsible for coordinating the transaction among the nodes it communicates directly with by: Receiving and relaying transaction status information to and from those nodes passing queries to those nodes Receiving queries from those nodes and passing them on to other nodes Returning the results of queries to the nodes that initiated them

Global Coordinator: The node where the distributed transaction originates is called the global coordinator. The database application issuing the distributed transaction is directly connected to the node acting as the global coordinator. For example, in, Fig 8.2 the transaction issued at the node sales references information from the database servers warehouse and finance. Therefore, sales.acme.com is the global coordinator of this distributed transaction. The global coordinator becomes the parent or root of the session tree. The global coordinator performs the following operations during a distributed transaction:

Sends all of the distributed transaction's SQL statements, remote procedure calls, and so forth to the directly referenced nodes, thus forming the session tree

Instructs all directly referenced nodes other than the commit point site to prepare the transaction

Instructs the commit point site to initiate the global commit of the transaction if all nodes prepare successfully

Sikkim Manipal University

Page No. 161

Oracle & Distributed Databases

Unit 8

Instructs all nodes to initiate a global abort of the transaction if there is an abort response

Commit Point Site: The job of the commit point site is to initiate a commit or roll back (abort) operation as instructed by the global coordinator. The system administrator always designates one node to be the commit point site in the session tree by assigning all nodes commits point strength. The node selected as commit point site should be the node that stores the most critical data. Fig. 8.3 illustrates an example of distributed system, with sales serving as the commit point site: The commit point site is distinct from all other nodes involved in a distributed transaction in these ways

The commit point site never enters the prepared state. Consequently, if the commit point site stores the most critical data, this data never remains in-doubt, even if a failure occurs. In failure situations, failed nodes remain in a prepared state, holding necessary locks on data until in-doubt transactions are resolved.

The commit point site commits before the other nodes involved in the transaction. In effect, the outcome of a distributed transaction at the commit point site determines whether the transaction at all nodes is committed or rolled back: the other nodes follow the lead of the commit point site. The global coordinator ensures that all nodes complete the transaction in the same manner as the commit point site.

Sikkim Manipal University

Page No. 162

Oracle & Distributed Databases

Unit 8

Figure 8.3: Commit Point Site

How a Distributed Transaction Commits? A distributed transaction is considered committed after all non-commit point sites are prepared, and the transaction has been actually committed at the commit point site. The online redo log at the commit point site is updated as soon as the distributed transaction is committed at this node. Because the commit point log contains a record of the commit, the transaction is considered committed even though some participating nodes may still be only in the prepared state and the transaction not yet actually committed at these nodes. In the same way, a distributed transaction is considered not committed if the commit has not been logged at the commit point site. Commit Point Strength: Every database server must be assigned commit point strength. If a database server is referenced in a distributed transaction, the value of its commit point strength determines which role it plays in the two-phase commit. Specifically, the commit point strength determines whether a given node is the commit point site in the distributed transaction
Sikkim Manipal University Page No. 163

Oracle & Distributed Databases

Unit 8

and thus commits before all of the other nodes. This value is specified using the initialization parameter COMMIT_POINT_STRENGTH. This section explains how the system determines the commit point site. The commit point site, which is determined at the beginning of the prepare phase, is selected only from the nodes participating in the transaction. The following sequence of events occurs:

Of the nodes directly referenced by the global coordinator, the software selects the node with the highest commit point strength as the commit point site.

The initially selected node determines if any of the nodes from which it has to obtain information for this transaction has higher commit point strength.

Either the node with the highest commit point strength directly referenced in the transaction or one of its servers with higher commit point strength becomes the commit point site.

After the final commit point site has been determined, the global coordinator sends prepare responses to all nodes participating in the transaction.

Fig 8.2 shows in a sample session tree the commit point strengths of each node (in parentheses) and show the node chosen as the commit point site. The following conditions apply when determining the commit point site:

A read-only node cannot be the commit point site. If multiple nodes directly referenced by the global coordinator have the same commit point strength, then the software designates one of these as the commit point site.

If a distributed transaction ends with an abort, then prepare and commit phases are not needed. Consequently, the software never determines a commit point site. Instead, the global coordinator sends a ABORT

Sikkim Manipal University

Page No. 164

Oracle & Distributed Databases

Unit 8

statement to all nodes and ends the processing of the distributed transaction. As Fig 8.4 illustrates, the commit point site and the global coordinator can be different nodes of the session tree. The commit point strength of each node is communicated to the coordinators when the initial connections are made. The coordinators retain the commit point strengths of each node they are in direct communication with so those commit point sites can be efficiently selected during two-phase commits. Therefore, it is not

necessary for the commit point strength to be exchanged between a coordinator and a node each time a commit occurs.

Fig. 8.4: The commit point site and the global coordinator

Two-Phase Commit Mechanism: Unlike a transaction on a local database, a distributed transaction involves altering data on multiple databases. Consequently, distributed transaction processing is more complicated, because the system must coordinate the committing or rolling back of the changes in a transaction as a self-contained section. In other words, the entire transaction commits, or the entire transaction rolls back (aborts).
Sikkim Manipal University Page No. 165

Oracle & Distributed Databases

Unit 8

The software ensures the integrity of data in a distributed transaction using the two-phase commit mechanism. In the prepare phase, the initiating node in the transaction asks the other participating nodes to promise to commit or roll back the transaction. During the commit phase, the initiating node asks all participating nodes to commit the transaction. If this outcome is not possible, then all nodes are asked to roll back. All participating nodes in a distributed transaction should perform the same action: they should either all commit or all perform a abort of the transaction. The software automatically controls and monitors the commit or abort of a distributed transaction and maintains the integrity of the global database (the collection of databases participating in the transaction) using the twophase commit mechanism. This mechanism is completely transparent, requiring no programming on the part of the user or application developer. The commit mechanism has the following distinct phases, which the software performs automatically whenever a user commits a distributed transaction:
Prepare phase The initiating node, called the global coordinator, asks participating nodes other than the commit point site to promise to commit or roll back the transaction, even if there is a failure. If any node cannot prepare, the transaction is rolled back. If all participants respond to the coordinator that they are prepared, then the coordinator asks the commit point site to commit. After it commits, the coordinator asks all other nodes to commit the transaction. The global coordinator forgets about the transaction.

Commit phase

Forget phase

This section contains the following topics: Prepare Phase The first phase in committing a distributed transaction is the prepare phase. In this phase, the system does not actually commit or roll back the transaction. Instead, all nodes referenced in a distributed transaction

Sikkim Manipal University

Page No. 166

Oracle & Distributed Databases

Unit 8

(except the commit point site), are told to prepare to commit. By preparing, a node:

Records information in the online redo logs so that it can subsequently either commit or roll back the transaction, regardless of intervening failures

Places a distributed lock on modified tables, which prevents reads

When a node responds to the global coordinator that it is prepared to commit, the prepared node promises to either commit or roll back the transaction later but does not make a unilateral decision on whether to commit or roll back the transaction. The promise means that if an instance failure occurs at this point, the node can use the redo records in the online log to recover the database back to the prepare phase. Types of Responses in the Prepare Phase: When a node is told to prepare, it can respond in the following ways:
Prepared Read-only Abort Data on the node has been modified by a statement in the distributed transaction, and the node has successfully prepared. No data on the node has been, or can be, modified (only queried), so no preparation is necessary. The node cannot successfully prepare.

Prepared Response: When a node has successfully prepared, it issues a prepared message. The message indicates that the node has records of the changes in the online log, so it is prepared either to commit or perform a abort. The message also guarantees that locks held for the transaction can survive a failure. Read-Only Response: When a node is asked to prepare, and the SQL statements affecting the database do not change the node's data, the node responds with a read-only message. The message indicates that the node will not participate in the commit phase.

Sikkim Manipal University

Page No. 167

Oracle & Distributed Databases

Unit 8

Note that if a distributed transaction is set to read-only, then it does not use abort segments. If many users connect to the database and their transactions are not set to READ ONLY, then they allocate abort space even if they are only performing queries. Abort Response: When a node cannot successfully prepare, it performs the following actions:

Releases resources currently held by the transaction and roll back the local portion of the transaction.

Responds to the node that referenced it in the distributed transaction with an abort message.

These actions then propagate to the other nodes involved in the distributed transaction so that they can roll back the transaction and guarantee the integrity of the data in the global database. This response enforces the primary rule of a distributed transaction: all nodes involved in the transaction either all commit or all roll back the transaction at the same logical time. Steps in the Prepare Phase: To complete the prepare phase, each node excluding the commit point site performs the following steps:

The node requests that its descendants, that is, the nodes subsequently referenced, prepare to commit.

The node checks to see whether the transaction changes data on itself or its descendants. If there is no change to the data, then the node skips the remaining steps and returns a read-only response

The node allocates the resources it needs to commit the transaction if data is changed.

The node saves redo records corresponding to changes made by the transaction to its online redo log.

The node guarantees that locks held for the transaction are able to survive a failure.

Sikkim Manipal University

Page No. 168

Oracle & Distributed Databases

Unit 8

The node responds to the initiating node with a prepared response or, if its attempt or the attempt of one of its descendents to prepare was unsuccessful, with an abort response.

These actions guarantee that the node can subsequently commit or roll back the transaction on the node. The prepared nodes then wait until a COMMIT or ABORT request is received from the global coordinator.

After the nodes are prepared, the distributed transaction is said to be indoubt .It retains in-doubt status until all changes are either committed or aborted. Commit Phase: The second phase in committing a distributed transaction is the commit phase. Before this phase occurs, all nodes other than the commit point site referenced in the distributed transaction have guaranteed that they are prepared, that is, they have the necessary resources to commit the transaction. Steps in the Commit Phase: The commit phase consists of the following steps:

The global coordinator instructs the commit point site to commit. The commit point site commits. The commit point site informs the global coordinator that it has committed.

The global and local coordinators send a message to all nodes instructing them to commit the transaction.

At each node, the system commits the local portion of the distributed transaction and releases locks.

At each node, the system records an additional redo entry in the local redo log, indicating that the transaction has committed.

Sikkim Manipal University

Page No. 169

Oracle & Distributed Databases

Unit 8

The participating nodes notify the global coordinator that they have committed.

Guaranteeing Global Database Consistency: Each committed transaction has an associated system change number (SCN) to uniquely identify the changes made by the SQL statements within that transaction. The SCN functions as an internal .The system timestamp that uniquely identifies a committed version of the database. In a distributed system, the SCNs of communicating nodes are coordinated when all of the following actions occur:

A connection occurs using the path described by one or more database links

A distributed SQL statement executes A distributed transaction commits

During the prepare phase, the system determines the highest SCN at all nodes involved in the transaction. The transaction then commits with the high SCN at the commit point site. The commit SCN is then sent to all prepared nodes with the commit decision. Forget Phase: After the participating nodes notify the commit point site that they have committed, the commit point site can forget about the transaction. The following steps occur:

After receiving notice from the global coordinator that all nodes have committed, the commit point site erases status information about this transaction.

The commit point site informs the global coordinator that it has erased the status information.

The global coordinator erases its own information about the transaction.

Sikkim Manipal University

Page No. 170

Oracle & Distributed Databases

Unit 8

Response of 2-Phase Commit protocol for Failures: It is tough to all failures in which no log information is lost. The response of the protocol in the presence of failure is now discussed.
1. Site Failure:

A participant fails before having written the ready record in the log. In this case, the coordinators timeout expires, and it takes the abort decision.

A participant fails after written the ready record in the log. In this case, the operational sites correctly terminate the transaction (abort or commit). When the failed site recovers, the restart procedure has to ask the coordinator or some other participant about the result of the transaction.

The coordinator fails after having written the prepare record in the log before having written the log, but before having written a global_commit or global_abort record in the log. In this case all participants who have already answered READY message must wait for the recovery of the coordinator. The restart procedure of the coordinator resumes the commit protocol from the beginning, reading the identity of the participants from the prepare record in the log, and sending again prepare message to them. Each ready participant must recognize that the new PREPARE message is a repetition of the previous one.

The coordinator fails after having written a global_commit or global_abort record in the log, but before having written the complete record in the log. In this case, the coordinator must send the decision again to all participants; participants who have not received the command have to wait until the coordinator recovers.

Sikkim Manipal University

Page No. 171

Oracle & Distributed Databases

Unit 8

The coordinator fails before having written the complete record in the log. In this case, the transaction was already concluded, and no action is required at start.

2. Lost Messages An answer message (READY or ABORT) from a participant is lost. In this case the coordinators time out expires, and the whole transaction is aborted. A PREPARE message is lost. In this case, the participant remains in wait. The global result is same as the previous one as the coordinator does not receive an answer. A command message is lost i.e either COMMIT or ABORT. In this case, the destination participant remains uncertain about the decision. Having a timeout in the participant; if no command has been received after the timeout interval from the answer can eliminate this problem, a request for the repetition of the command is sent. An ACK message is lost. In this case, the coordinator remains uncertain about the fact that the participant has been received the command message. Introducing a timeout in the coordinator can eliminate this problem; if no ACK message is received after the timeout interval from the transmission of the command, coordinator will send the command again. 3. Network Partitions: Let us suppose that a simple partition occurs, dividing the sites in two groups the group contains the coordinator is called the coordinator group the other the participant group. From the view of the coordinator the partition is equivalent to the multiple failure of as set of participants, and the solution is already discussed. From the view of the participant the failure is equivalent to a coordinator failure and the situation is similar to the case already discussed.
Sikkim Manipal University Page No. 172

the

Oracle & Distributed Databases

Unit 8

It has been observed that the recovery procedure for a site that is involved in processing a distributed transaction is more complex than that for a centralized database. Self Assessment Questions 8.3

1. A contains information for undoing or redoing all the actions


performed by the transactions.

2. The two possible communication errors are, Lost messages and . 3. The node where the distributed transaction originates is called the . 4. If a database server is referenced in a distributed transaction, the value
of its commit point strength determines which role it plays in the .

5. The global coordinator forgets about the transaction phase is called 6. After the nodes are prepared, the distributed transaction is said to
be .

8.4 Summary
In this unit we define the properties of transactions, state the goals of distributed transaction management and describe architecture of distributed transaction. In this unit we are trying to experience the concept of atomicity, which is a required property for distributed transactions. In the first section we discuss recovery techniques in centralized system. Next we shall deal the possible communication failures in distributed transactions. After this let us concentrate on recovery procedures followed in distributed transactions. Finally the section talks about a distributed transaction algorithm called 2Phase commit Protocol, which actually take care all the properties expected by distributed transactions.

8.5

Terminal Questions

1. List the ACID properties. Explain the usefulness of each 2. What are the different goals of transaction management? Explain
Sikkim Manipal University Page No. 173

Oracle & Distributed Databases

Unit 8

3. Suppose that there is a database system that never fails. Is a recovery manager required for this system? 4. List out the Problems with respect to Communication in the Distributed databases 5. Explain the steps involved in recovery of distributed transactions 6. How the commit point site is distinct from all other nodes involved in a distributed transaction? 7. Consider a failure that occurs during 2 Phase Commit(2PC) for a transaction. a. List the possible types of failure in a distributed transaction b. Explain how 2PC ensures transaction atomicity despite the failure

8.6 Answers to Self Assessment Questions


Answers to Self Assessment Questions 8.2 1. Transaction recovery 2. Commit 3. Database recovery Answers to Self Assessment Questions 8.3 1. Log 2. Network Partitions 3. global coordinator 4. two-phase commit 5. forget phase 6. in-doubt

Sikkim Manipal University

Page No. 174

Você também pode gostar