DB2 Universal Database V8.1 Certification Exam 700 Study Guide
In Chapter 5, "Working with DB2 UDB Data," we saw that a transaction ( otherwise known as a unit of work) is a recoverable sequence of one or more SQL operations grouped together as a single unit, usually within an application process. The initiation and termination of a single transaction defines points of data consistency within a database ”either the effects of all SQL operations performed within a transaction are applied to the database and made permanent (committed) or the effects of all SQL operations performed are completely "undone" and thrown away (rolled back). In single- user , single-application environments, each transaction runs serially and does not have to contend with interference from other transactions. However in multiuser environments, transactions can execute simultaneously , and each transaction has the potential to interfere with any other transaction that has been initiated but not yet terminated . Transactions that have the potential of interfering with one another are said to be interleaved, or parallel, while transactions that run isolated from each other are said to be serializable, which means that the results of running them simultaneously will be no different from the results of running them one right after another (serially). Ideally, every transaction should be serializable. Why is it important that transactions be serializable? Consider the following: Suppose a travel agent is entering hotel reservation information into a database system at the same time a hotel manager is checking room availability for a conference planning committee. Now, suppose the travel agent blocks off two hundred rooms for a large tour group (to check availability get a price quote) but does not commit the entry. While the travel agent is relaying the price quote information to the tour group coordinator, the hotel manager queries the database to see how many rooms are available, sees that all but twenty rooms have been reserved, and tells the conference planning committee that he cannot accommodate their needs. Now, suppose the tour coordinator decides not to reserve the rooms because the quoted price is higher than anticipated. The travel agent rolls back the transaction because no reservations were made, and the two hundred rooms that had been marked as reserved are now shown as being available. Unfortunately, the damage has already been done. The hotel missed the opportunity to host a conference, and they have two hundred vacant rooms they need to fill. If the travel agent's transaction and the hotel manager's transaction had been isolated from each other (serialized), this problem would not have occurred. Either the travel agent's transaction would have finished before the hotel manager's transaction started, or the hotel manager's transaction would have finished before the travel agent's transaction started; in either case, the hotel would not have missed out on the opportunity to host the conference. When transactions are not isolated from each other in multiuser environments, the following types of events (or phenomena) can occur: Lost Updates. This event occurs when two transactions read the same data, both attempt to update that data, and one of the updates is lost. For example: Transaction A and Transaction B both read the same row of data and calculate new values for that row based upon the original values read. If Transaction A updates the row with its new value and Transaction B then updates the same row, the update operation performed by Transaction A is lost. Dirty Reads. This event occurs when a transaction reads data that has not yet been committed. For example: Transaction A changes a row of data and Transaction B reads the changed row before Transaction A commits the change. If Transaction A rolls back the change, Transaction B will have read data that theoretically never existed. Nonrepeatable Reads. This event occurs when a transaction reads the same row of data twice, but gets different results each time. For example: Transaction A reads a row of data, and then Transaction B modifies or deletes that row and commits the change. When Transaction A attempts to reread the row, it will retrieve different data values (if the row was updated) or discover that the row no longer exists (if the row was deleted). Phantoms. This event occurs when a row of data matches some search criteria but initially is not seen. For example: Transaction A retrieves a set of rows that satisfy some search criteria, and then Transaction B inserts a new row that contains matching search criteria for Transaction A's query. If Transaction A re-executes the query that produced the original set of rows, a different set of rows will be retrieved ”the new row added by Transaction B will now be included in the set of rows returned. Because several different users can access and modify data stored in a DB2 UDB database at the same time, the DB2 Database Manager must be able to allow users to make necessary changes while ensuring data integrity is never compromised. The sharing of resources by multiple interactive users or application programs at the same time is known as concurrency. One of the ways DB2 UDB enforces concurrency is through the use of isolation levels, which determine how data used in one transaction is "isolated from" other transactions. DB2 Universal Database recognizes and supports the following isolation levels:
Table 7-1 shows the various phenomena that can occur when each of these isolation levels are used. Table 7-1. DB2 Universal Database's Isolation Levels and the Phenomena That Can Occur When Each Is Used
The Repeatable Read Isolation Level
The Repeatable Read isolation level completely isolates one transaction from the effects of other concurrent transactions. When this isolation level is used, every row that is referenced in any manner by the isolated transaction is "locked" for the duration of that transaction. As a result, if the same SELECT SQL statement is issued two or more times within the same transaction, the result data set produced will always be the same. (Lost updates, dirty reads, nonrepeatable reads, and phantoms cannot occur.) In addition, transactions using the Repeatable Read isolation level will not see changes made to other rows by other transactions until those changes have been committed. Transactions using the Repeatable Read isolation level can retrieve the same set of rows multiple times and perform any number of operations on them until terminated by performing either a commit or a rollback operation. However, no other transaction is allowed to perform any insert, update, or delete operation that would affect the set of rows being accessed by the isolating transaction ”as long as that transaction remains active. To ensure that the data being accessed by a transaction running under the Repeatable Read isolation level is not adversely affected by other transactions, each row referenced by the isolating transaction is locked ”not just the rows that are actually retrieved and/or modified. Thus, if a transaction scans 1,000 rows in order to retrieve 10, locks are acquired and held on all 1,000 rows scanned ”not just on the 10 rows retrieved.
So how does this isolation level work in a real-world situation? Suppose you own a large hotel and you have a Web site that allows individuals to reserve rooms on a first-come, first- served basis. If your hotel reservation application runs under the Repeatable Read isolation level, whenever a customer retrieves a list of all rooms available for a given range of dates, you will not be able to change the room rate for those rooms during the date range specified, nor will other customers be able to make or cancel reservations that would cause the list to change if it were generated again ”as long as the transaction that produced the list is active. (However, you can change room rates for any room that was not scanned in response to the first customer's query. Likewise, other customers can make or cancel room reservations for any room that was not scanned in response to the first customer's query.) The Read Stability Isolation Level
Unlike the Repeatable Read isolation level, the Read Stability isolation level does not completely isolate one transaction from the effects of other concurrent transactions. That is because when the Read Stability isolation level is used, only rows that are actually retrieved by a single transaction are locked for the duration of that transaction. Thus when this isolation level is used, if the same SELECT SQL statement is issued two or more times within the same transaction, the result data set produced may not always be the same. (Lost updates, dirty reads, and nonrepeatable reads cannot occur; phantoms, however, can and may be seen.) In addition, transactions using the Read Stability isolation level will not see changes made to other rows by other transactions until those changes have been committed. Transactions using the Read Stability isolation level can retrieve a set of rows and perform any number of operations on them until terminated by performing either a commit or a rollback operation. However, no other transaction is allowed to perform any update or delete operation that would affect the set of rows that were retrieved by the isolating transaction ”as long as that transaction exists. (However, other transactions can perform insert operations, and if the transaction running under the Read Stability isolation level executes the same query multiple times, rows inserted between each query by other concurrent transactions may appear in subsequent result data sets produced. As mentioned earlier, such rows are called "phantoms.") Unlike the Repeatable Read isolation level, where every row that is referenced in any way by the isolating transaction is locked, when the Read Stability isolation level is used, only the rows that are actually retrieved and/or modified by the isolating transaction are locked. Thus, if a transaction scans 1,000 rows in order to retrieve 10, locks are only acquired and held on the 10 rows retrieved ”not on all 1,000 rows scanned. (And because fewer locks are acquired, more transactions can run concurrently.) So how does this isolation level change the way our hotel reservation application works? Now when a customer retrieves a list of rooms available for a given range of dates, you will be able to change the room rate for any room in the hotel that does not appear on the list, and other customers will be able to cancel room reservations for rooms that had been reserved for the date range specified by the first customer's query. Therefore, if the customer generates the list of available rooms again (before the transaction that submitted the query terminates), the list produced may contain new room rates and/or rooms that were not available the first time the list was generated. The Cursor Stability Isolation Level
The Cursor Stability isolation level is even more relaxed than the Read Stability isolation level in the way it isolates one transaction from the effects of other concurrent transactions. When the Cursor Stability isolation level is used, only the row that is currently being referenced by a cursor is locked. The lock acquired remains in effect until the cursor is repositioned ”more often than not by executing the FETCH SQL statement ”or until the isolating transaction terminates. (If the cursor is repositioned, the lock being held on the last row read is released and a new lock is acquired for the row the cursor is now positioned on.). When a transaction using the Cursor Stability isolation level retrieves a row from a table via an updatable cursor, no other transaction can update or delete that row while the cursor is positioned on it. However, other transactions can add new rows to the table, as well as perform update and/or delete operations on rows positioned on either side of the locked row, provided the locked row itself was not accessed using an index. Furthermore, if the isolating transaction modifies any row it retrieves, no other transaction can update or delete that row until the isolating transaction is terminated, even when the cursor is no longer positioned on the modified row. As you might imagine, when the Cursor Stability isolation level is used, if the same SELECT SQL statement is issued two or more times within the same transaction, the results returned may not always be the same. (Lost updates and dirty reads cannot occur; nonrepeatable reads and phantoms, on the other hand, can and may be seen.) In addition, transactions using the Cursor Stability isolation level will not see changes made to other rows by other transactions until those changes have been committed. Once again, let us see how this isolation level affects our hotel reservation application. Now when a customer retrieves a list of rooms available for a given range of dates, then views information about each room on the list produced (one room at a time), you will be able to change the room rate over any date range for any room in the hotel with the exception of the room the customer is currently looking at. Likewise, other customers will be able to make or cancel reservations over any date range for any room in the hotel; however, they will not be able to do anything with the room the first customer is currently looking at. When the first customer views information about another room in the list, the same holds true for the new room the customer is looking at; you will now be able to change the room rate for the room the first customer was just looking at and other customers will be able to reserve that particular room ”provided the first customer did not reserve the room for themselves . The Uncommitted Read Isolation Level
While the Repeatable Read isolation level is the most restrictive of the isolation levels available, the Uncommitted Read isolation level is the least intrusive isolation level provided. In fact, when the Uncommitted Read isolation level is used, rows that are retrieved by a single transaction are only locked if another transaction attempts to drop or alter the table from which the rows were retrieved. Because rows often remain unlocked when this isolation level is used, dirty reads, nonrepeatable reads, and phantoms can occur. Therefore, the Uncommitted Read isolation level is commonly used for transactions that access read-only tables/views or transactions that execute queries on which uncommitted data from other transactions will have no adverse affect. In most cases, transactions using the Uncommitted Read isolation level can read changes made to rows by other transactions before those changes have been committed. However, such transactions can neither see nor access tables, views, or indexes created by other concurrent transactions until those transactions themselves have been terminated. The same applies to existing tables, views, or indexes that have been dropped ”transactions using the Uncommitted Read isolation level will only learn that these objects no longer exist when the transaction that dropped them is terminated. There is one exception to this behavior: When a transaction running under the Uncommitted Read isolation level uses an updatable cursor, the transaction will behave as if it is running under the Cursor Stability isolation level, and the constraints of the Cursor Stability isolation level will apply. So how does the Uncommitted Read isolation level affect our hotel reservation application? Now when a customer retrieves a list of rooms available for a given range of dates, you will be able to change the room rate for any room in the hotel, and other customers will be able to make or cancel reservations over any date range for any room. Furthermore, the list produced for the first customer may contain rooms that other customers have chosen to cancel reservations for, but whose cancellations have not yet been committed to the database. Choosing the Proper Isolation Level
In addition to controlling how well the DB2 Database Manager provides concurrency, the isolation level used also determines how well applications running concurrently will perform. As a result, using the wrong isolation level for a given situation can have a significant negative impact on both concurrency and performance. So how do you determine which isolation level to use for a given situation? You start by identifying the concurrency phenomena that can arise and determining what phenomena are tolerable and what phenomena are not. Then you select an isolation level that will prevent any unacceptable phenomena from occurring. Typically, you should:
Specifying the Isolation Level to Use
Although isolation levels control concurrency at the transaction level, they are actually set at the application level. Therefore in most cases, the isolation level specified for a particular application is applicable to every transaction initiated by that application. (It is important to note that an application can be constructed in several different parts , and each part can be assigned a different isolation level, in which case the isolation level specified for a particular part is applicable to every transaction that is created within that part.) For embedded SQL applications, the isolation level to be used is specified at precompile time or when the application is bound to a database (if deferred binding is used). The isolation level for embedded SQL applications written in a supported compiled language (such as C and C++) is set through the ISOLATION option of the PRECOMPILE PROGRAM and BIND commands. The isolation level for Open Database Connectivity (ODBC) and Call Level Interface (CLI) applications is set at application runtime by calling the SQLSetConnectAttr() function with the SQL_ATTR_TXN_ISOLATION connection attribute specified. Alternatively, the isolation level for ODBC/CLI applications can be set by assigning a value to the TXNISOLATION keyword in the db2cli.ini configuration file; however, this approach does not provide the flexibility of changing isolation levels for different transactions within the application that the first approach does. The isolation level for JDBC and SQLJ applications is set at application runtime by calling the setTransactionIsolation() method that resides within DB2 Universal Database's java.sql connection interface. When the isolation level for an application is not explicitly set using one of these methods , the Cursor Stability isolation level is used as the default. This holds true for commands, SQL statements, and scripts executed from the Command Line Processor, as well as for embedded SQL, ODBC/CLI, JDBC, and SQLJ applications. Therefore, it is also possible to specify the isolation level to be used for any transaction that is to be executed by the Command Line Processor. In this case, the isolation level is set by executing the CHANGE ISOLATION command before a connection to a database is established. |