Understanding SQL Transactions and Transaction Logs

Understanding SQL Transaction Processing

A database transaction is a sequence of one or more statements that must all be executed successfully in order to complete a specific task. SQL-92 specifies that the DBMS must provide a BEGIN TRANSACTION and an END TRANSACTION statement you can use to group commands that the DBMS is to execute atomically-either execute all of the statements in the group (bracketed by the BEGIN TRANSACTION and END TRANSACTION statements) or execute none of them.

If the DBMS has executed some of the statements in a transaction and the application program aborts or there is some other catastrophic hardware or software failure, the DBMS is responsible for undoing the tasks (statements) it has already executed and restoring the database tables back to their original, unmodified condition. In Tip 130, "Using the ROLLBACK Statement to Undo Changes Made to Database Objects," you will learn how use the ROLLBACK statement to take advantage of the ability of the DBMS to "undo" modifications if you change your mind and want to "back out" updates and deletes that executed successfully during a transaction.

SQL statements grouped together (between BEGIN TRANSACTION and END TRANSACTION statements) to form a transaction are typically interdependent and must all be executed in order to maintain database consistency. Suppose, for example, that you were changing the item number of a product in your inventory. To do so, you would have to:

If any of the updates in the transaction are not done, the database will become inconsistent. For example, if the DBMS fails to change the PRODUCT_CODE value in the INVENTORY table, the SELECT statement

SELECT product_code, description, cost, vendor, serial_number FROM inventory, products WHERE inventory.product_code = products.product_code

will no longer produce correct results because the PRODUCT_CODE in the INVENTORY table would reference a nonexistent PRODUCT_CODE in the PRODUCTS table (which holds the DESCRIPTION, COST, and VENDOR for every product in the inventory).

From the current example, you can see that each statement in a transaction performs a part of a task, such as changing the PRODUCT_CODE code in one of several tables. Moreover, all of tasks must be completed in order to successfully finish the overall job at hand, such as changing the PRODUCT_CODE from one value to another throughout the entire database.

The important thing to understand about transaction processing is that the DBMS uses it to maintain the consistency of the database. Since the DBMS treats the statements in a transaction as an atomic unit of work, it will either execute all of the statements successfully or make it appear as if none of the statements was executed by restoring all data and database objects modified by statements in the transaction back to their original, unmodified states.

Tip 128, "Understanding the ANSI/ISO Transaction Model," will explain how most commercial database products automatically detect the start and end of an SQL transaction, and Tip 131, "Understanding the MS-SQL Server Transaction Model," will teach you how to group statements into transactions for MS-SQL Server using the BEGIN TRANSACTION and END TRANSACTION statements.

Understanding the ANSI ISO Transaction Model

In Tip 127, "Understanding SQL Transaction Processing," you learned that a transaction is a sequence of SQL statements that the DBMS views as a single unit of work. Said another way, the statements in a transaction are considered an atomic unit of work-either all are executed successfully or any work performed is undone so that the database appears as if none of the statements in the transaction was executed.

The ANSI/ISO transaction model defines the roles of the COMMIT and ROLLBACK statements (which we will discuss in Tips 129 and 130, "Understanding When to Use the COMMIT Statement") and specifies that a transaction begins automatically with the first statement submitted to the DBMS by the user or application program. As such, a user working with a DBMS that follows the ANSI/ISO transaction model (DB2, for example) is always operating within a transaction.

An ANSI/ISO transaction begins automatically with the execution of an SQL statement and continues through subsequent statements until terminated in one of the four ways shown in Figure 128.1.

Figure 128.1: Database states after successful and unsuccessful transaction processing

Note that the defining characteristic of the ANSI/ISO Transaction model is that no explicit action is required to start a transaction. The user or application program is always in a transaction because a new transaction begins as soon as the current transaction ends with one of the following events:

The ANSI/ISO transaction model does not lend itself to interactive multi-statement transaction processing. In fact, most DBMS products default their interactive sessions to auto-commit mode, in which the DBMS automatically starts a transaction when the user submits a Data Definition Language (DDL) or Data Manipulation Language (DML) statement, such as

UPDATE products SET sales_price = sales_price * 1.2

and then ends the transaction with a COMMIT as soon as the DBMS successfully performs its task. Because the DBMS issues a COMMIT as soon as the SQL statement finishes its execution, you cannot use a ROLLBACK to undo the work performed by a successfully executed statement during an interactive session in auto-commit mode. Moreover, as soon as you submit the next statement for processing, the DBMS starts a new transaction. As such, although you may want to undo both statements if the second statement fails to execute, the DBMS will roll back only the work from the failed second statement. (By the time the second statement executes, the DBMS already made the first statement's work permanent with a COMMIT.)

Tip 129 and Tip 130, "Using the ROLLBACK Statement to Undo Changes Made to Database Objects," will show you how to take an interactive session out of auto-commit mode. After you turn off the auto-commit, you will be able to execute multi-statement transactions and roll back the work performed by one or more statements you submit during an interactive session.

The important thing to know now is that the ANSI/ISO transaction model (which defines the roles of COMMIT and ROLLBACK statements) specifies that an SQL transaction begins automatically with the first statement executed by a user or application program. In Tip 131, "Understanding the MS-SQL Server Transaction Model," you will learn that the non-ANSI/ISO transaction model still defaults to auto-commit mode for an interactive session but requires programs to execute an explicit BEGIN TRANSACTION statement to start a transaction in programmatic SQL.

Understanding When to Use the COMMIT Statement

During SQL transaction processing, work performed by statements can be undone (with a ROLLBACK) until it is permanently written to the database. The COMMIT statement tells the DBMS to make database changes permanent.

As you learned in Tip 128, "Understanding the ANSI/ISO Transaction Model," most DBMS products default to auto-commit mode when accepting interactive queries and updates. As such, when you type the UPDATE statement

UPDATE products SET sales_price = sales_price * 1.2

into the MS-SQL Server Query Analyzer's (upper) input pane and then click on the Execute Query button, as shown in Figure 129.1, the DBMS will increase the SALES_PRICE of all products in the PRODUCTS table by 20 percent.

Figure 129.1: An MS-SQL Server's Query Analyzer interactive session

Since the Query Analyzer provides interactive connections to an MS-SQL Server database (SQL Tips, in the current example), the DBMS automatically issues a COMMIT statement after the UPDATE statement makes its last change and returns control to the Query Analyzer. Thus, when working in interactive auto-commit mode, you need not execute a COMMIT statement to have work permanently written to the database.

As with any transaction processing, if the transaction started by submitting an SQL statement to the DBMS during an interactive session does not run to a successful completion, the DBMS will roll back (undo) any work performed. Thus, in the current example, if the server loses power or locks up after updating the SALES_PRICE for half of the products in the PRODUCTS table, the DBMS will automatically set the SALES_PRICE of all products back to what it was prior to the 20 percent increase.

  Note 

Obviously, you'll have to bring the server back online before it can roll back the partial update. After the server restarts the DBMS, the DBMS will check its transaction log for any partially executed and uncommitted transactions. Prior to letting any users or application programs connect to the database, the DBMS will reverse (ROLLBACK) any work done by both aborted and uncommitted transactions.

The important thing to know is that when the DBMS executes a ROLLBACK, it undoes all work performed on the database subsequent to the last COMMIT. When executing SQL statements in interactive mode, the ROLLBACK reverses only the changes made by the most recent partially executed statement. (Remember, a fully executed statement is automatically committed during an interactive session in the default auto-commit mode.)

Executing an explicit COMMIT statement becomes important in programmatic SQL, in which you use an application program to send SQL statements to the DBMS. During a programmatic SQL session, you start a transaction with a BEGIN TRANSACTION statement.

  Note 

As you learned in Tip 128, a DBMS that follows the ANSI/ISO transaction model will start a transaction automatically (without an explicit BEGIN TRANSACTION) as soon as the application program submits its first SQL statement.

All statements submitted by the application program become part the transaction-which does not end until the application either submits a ROLLBACK or COMMIT statement, or ends (either normally or abnormally). Thus, if you have an application program that has the statement flow

BEGIN TRANSACTION MOD 1... MOD 2... MOD 3... MOD 4...

the four modifications are not committed (made permanent) until the application program ends normally. As such, an abnormal ending will cause the DBMS to roll back (undo) all modifications made by all four of the SQL statements in the body of the application. If you want the first two modifications to remain in the database without regard to the successful execution of modifications 3 and 4, have the application program execute a COMMIT statement after submitting MOD 2:

BEGIN TRANSACTION MOD 1... MOD 2... COMMIT TRANSACTION BEGIN TRANSACTION MOD 3... MOD 4...

  Note 

If you are using a DBMS that uses the ANSI/ISO transaction model, the program need not execute an explicit BEGIN TRANSACTION statement before MOD 1. As you learned in Tip 128, the ANSI/ISO transaction model DBMS automatically starts a new transaction with the first SQL statement following a COMMIT or a ROLLBACK.

The important thing to know now is that the COMMIT statement signals the successful completion of an SQL transaction and causes the DBMS to make permanent the work performed by statements in the transaction. Committed work cannot be reversed with a ROLLBACK and is not undone when the DBMS is restarted after either a normal or an abnormal shutdown.

Using the ROLLBACK Statement to Undo Changes Made to Database Objects

In Tip 129, "Understanding When to Use the COMMIT Statement," you learned how to make work performed as part of an SQL transaction permanent by executing a COMMIT statement. Prior to being written permanently to the database, you can use the ROLLBACK statement to reverse changes made by almost all SQL statements-including, in MS-SQL Server, bringing back a table and data removed from the database with a DROP TABLE statement!

During an interactive session, the DBMS defaults to auto-commit mode, which means that the DBMS automatically commits (makes permanent) the action of every SQL statement that it successfully executes. Therefore, if you execute the statement

DELETE FROM employees

which removes all rows from the EMPLOYEES table, you cannot bring the deleted data back with the statement:

ROLLBACK TRANSACTION

Remember, the DBMS starts the transaction when it receives the DELETE statement and then automatically executes a COMMIT TRANSACTION as soon is it has removed the last row from the EMPLOYEES table. As such, there is no uncommitted transaction to roll back, even if you execute the ROLLBACK TRANSACTION statement immediately after executing the DELETE statement.

If you want to be able to undo changes made by SQL statements you execute during an interactive session, you must first take the interactive session out of auto-commit mode.

MS-SQL Server, for example, lets you turn off auto-commit mode by executing the statement:

BEGIN TRANSACTION

  Note 

You will need to check your system documentation for the specific statement your DBMS uses to turn off auto-commit mode. While MS-SQL Server uses BEGIN TRANSACTION, DB2 uses UPDATE COMMAND OPTIONS USING c OFF. Look in the index under "transaction," "commit," or "auto-commit."

After you execute the BEGIN TRANSACTION statement, the DBMS treats subsequent SQL statements as being part of a multi-statement transaction. As such, if you execute the statement sequence

MOD 1 BEGIN TRANSACTION MOD 2 MOD 3 MOD 4 ROLLBACK TRANSACTION

the DBMS will undo the work perform by the three SQL statements (MOD 2, MOD 3, MOD 4) that follow the BEGIN TRANSACTION statement. Therefore, after executing the ROLLBACK TRANSACTION statement, the database will look exactly like it did when it finished executing the SQL statement denoted as MOD 1 in the current example.

  Note 

The ROLLBACK TRANSACTION did not undo the work performed by MOD 1 because MOD 1 was executed in auto-commit mode. As a result, changes made by MOD 1 were automatically committed (made permanent) upon the successful completion of the SQL statement MOD 1 represents.

What may become a bit confusing is that if the DBMS follows the ANSI/ISO transaction model (which MS-SQL Server and Sybase do not), executing the example's SQL statements in an interactive session will have a different effect than using programmatic SQL to submit the same sequence of statements to the DBMS through an application program.

During an interactive session, the ROLLBACK TRANSACTION will have the same effect whether executed using interactive or programmatic SQL-the database will keep only changes performed by the SQL statement represented by MOD 1.

If the example statements are executed in an application program on a DBMS that follows the ANSI/ISO transaction model, the ROLLBACK TRANSACTION will undo changes made by MOD 1 along with those made by MOD 2, MOD 3, and MOD 4. However, if the example statements are executed in an application program on a DBMS that does not follow the ANSI/ISO transaction model (such as MS-SQL Server and Sybase), the ROLLBACK will not undo changes made by MOD 1, along with those made by MOD 2, MOD 3, and MOD 4.

The reason for the difference in behavior is that the ANSI/ISO transaction model automatically starts the transaction with the first statement submitted by the application program. Thus, MOD 1 is part of the transaction affected by the ROLLBACK TRANSACTION statement when submitted by an application program running on a DBMS that follows the ANSI/ISO transaction model.

When using a ROLLBACK to undo database changes, the important thing to understand is that the ROLLBACK statement reverses (undoes) all the work performed after the execution of the most recent COMMIT statement. Therefore, if you are executing SQL statements within an interactive session and want to be able to undo work by executing a ROLLBACK statement or want to execute multi-statement transactions, be sure to turn off the default auto-commit mode.

  Note 

Consider using explicit BEGIN TRANSACTION and END TRANSACTION blocks (where possible) to make your SQL code transportable between DBMS servers that adhere to the ANSI/ISO transaction model (such as DB2) and those that do not (such as Sybase and MS-SQL Server).

Understanding the MS SQL Server Transaction Model

As you learned in Tip 127, "Understanding SQL Transaction Processing," and Tip 128, "Understanding the ANSI/ISO Transaction Model," an SQL transaction is a sequence of statements, all of which must be executed successfully in order to complete a task. All DBMS products perform the work specified by the statements in a transaction on an all-or-nothing basis—there is no partial execution. Either the DBMS successfully executes all statements in a transaction, or the system undoes any work performed and restores the database to its state prior to the start of the transaction. Moreover, all DBMS products support a ROLLBACK statement that will undo work performed in a transaction not yet made permanent by a COMMIT statement.

The MS-SQL Server transaction model is similar to the ANSI/ISO transaction model in that it:

In a departure from the ANSI/ISO transaction model, however, MS-SQL Server:

MS-SQL Server savepoints are especially useful in transactions with a large number of statements in that they allow the application program to undo work and back up to specific points in a transaction. As such, the application program can use a ROLLBACK to a save-point to redo statements in a portion of a transaction without having to start over from the beginning. For example, if you have MS-SQL Server execute the following statements

BEGIN TRANSACTION CREATE TABLE trans_table (row_number SMALLINT, description VARCHAR(35)) INSERT INTO trans_table VALUES (1, 'Inserted Row 1') INSERT INTO trans_table VALUES (2, 'Inserted Row 2') SAVE TRANSACTION savepoint_1 DELETE FROM trans_table WHERE row_number = 2 INSERT INTO trans_table VALUES (3, 'Inserted Row 3') INSERT INTO trans_table VALUES (4, 'Inserted Row 4') SAVE TRANSACTION savepoint_2 DELETE FROM trans_table WHERE row_number = 1 DELETE FROM trans_table WHERE row_number = 3 INSERT INTO trans_table VALUES (5, 'Inserted Row 5') UPDATE trans_table SET description = 'All Rows Updated' ROLLBACK TRANSACTION savepoint_2 UPDATE trans_table SET description = 'Row 1 After ROLLBACK to 2' DELETE trans_table WHERE row_number = 4 COMMIT TRANSACTION

the results table for the SELECT statement

SELECT * FROM trans_table

is:

row_number description ---------- ------------------------- 1 Row 1 After ROLLBACK to 2 3 Inserted Row 3

Obviously, unlike the example, an actual application program will use some sort of decision mechanism such as the Transact-SQL statement IF @@ERROR or IF @@ROWCOUNT to determine whether to roll back (undo) work performed subsequent to a particular save-point. Moreover, the application (or stored procedure) will most likely be more complex and perform useful work. However, the example does demonstrate how a savepoint can be used to undo a portion of the work performed by statements in a transaction, as illustrated in Figure 131.1.

Figure 131.1: MS-SQL Server transaction processing using a savepoint

Notice that the ROLLBACK removes the effects of only the two DELETE statements, the INSERT statement, and the UPDATE statement executed after the statement:

SAVE TRANSACTION savepoint_2

The important thing to know is that the SAVE TRANSACTION statement lets you label savepoints within the sequence of statements that make up a transaction. You can then undo statements executed subsequent to a particular savepoint by using the ROLLBACK statement and naming the savepoint to which you want the database restored. In other words, you can do a ROLLBACK to any of the savepoints within a transaction and then continue execution of SQL statements from that point.

  Note 

Although the SAVE TRANSACTION statement lets you name savepoints in a transaction, it does not COMMIT the work performed up to the savepoint. As such, the DBMS treats the entire sequence of statements following the BEGIN TRANSACTION statement as a single transaction. Therefore, if the program sending the statements aborts, or in the event of some other type of hardware or software failure, the DBMS will back out all statements within the transactions—including those before and after any savepoints.

Using Named and Nested Transactions on MS SQL Server

In Tip 131, "Understanding the MS-SQL Server Transaction Model," you learned how to use the SAVE TRANSACTION statement to create savepoints that let you roll back (undo) a portion of the work performed within a transaction. MS-SQL Server's capability to name and COMMIT named transactions serves an entirely different purpose. Named transactions and named commits should be used only to support transaction processing in stored procedures that may be called either as part of a transaction or outside of a transaction. The distinction of whether the stored procedure is called from within or outside of a transaction is important to the effect of named COMMIT executed statements within the stored procedure.

Suppose, for example, that you have a stored procedure created with

CREATE PROCEDURE SP_Insert_Row @row_number SMALLINT, @description CHAR(35) AS BEGIN TRANSACTION insert_trans INSERT INTO trans_table VALUES (@row_number, @description) COMMIT TRANSACTION insert_trans

that starts a transaction, inserts the values passed into the TRANS_TABLE, and then makes the insertion permanent by ending the transaction by executing a named COMMIT statement.

If you execute the SQL statements

CREATE TABLE trans_table (row_number SMALLINT, description VARCHAR(35)) BEGIN TRANSACTION INSERT INTO trans_table VALUES (1, 'Inserted Row 1') INSERT INTO trans_table VALUES (2, 'Inserted Row 2') EXEC SP_Insert_Row 3, 'Inserted Row 3' EXEC SP_Insert_Row 4, 'Inserted Row 4' ROLLBACK

the TRANS_TABLE will end up with no rows!

You might expect the rows inserted by the stored procedure to remain after the ROLLBACK, since the stored procedure executes the named COMMIT statement

COMMIT TRANSACTION insert_trans

after each insertion. However, in the current example, the stored procedure is called from within another transaction, making the transaction in the stored procedure a nested or inner transaction. MS-SQL Server ignores named COMMIT statements executed on nested (inner) transactions.

Thus, when called from within another transaction, the final disposition of the work performed by the nested (inner) transaction in the procedure is determined by what happens to the outer transaction. In the current example, a ROLLBACK statement terminates the outer transaction. As a result, all work performed within the outer transaction, including the work performed by the executed procedure, is removed by the ROLLBACK.

On the other hand, if you call the same procedure outside a transaction, such as

EXEC SP_Insert_Row 3, 'Inserted Row 3' BEGIN TRANSACTION INSERT INTO trans_table VALUES (1, 'Inserted Row 1') INSERT INTO trans_table VALUES (2, 'Inserted Row 2') EXEC SP_Insert_Row 4, 'Inserted Row 4' ROLLBACK

the TRANS_TABLE will contain:

row_number description ---------- -------------- 3 Inserted Row 3

In this example, the work performed by the first call to the procedure

EXEC SP_Insert_Row 3, 'Inserted Row 3'

is permanently written to the TRANS_TABLE by the COMMIT statement at the end of the transaction. When the procedure is executed outside of a transaction, the transaction within the procedure is not nested, and its COMMIT statement makes permanent changes to the TRANS_TABLE.

Conversely, the work performed by the second execution of the stored procedure in the example is again undone by the ROLLBACK. When the procedure is executed within a transaction, the transaction within the procedure is nested, and the DBMS does not make permanent changes to the TRANS_TABLE when it sees the COMMIT statement (since a COMMIT executed on an inner transaction is ignored by MS-SQL Server).

The important thing to know now is that MS-SQL Server ignores named COMMIT statements executed on inner, or nested, procedures. This is true whether the nested transaction occurs within a stored procedure or within the application program itself.

  Note 

The importance of using the named BEGIN TRANSACTION and named COMMIT TRANSACTION statements in the stored procedure may not, at first, be obvious. If you do not name the transaction in the stored procedure and end it with an unnamed COMMIT statement, the COMMIT TRANSACTION at the end of the procedure will COMMIT (make permanent) not only the work within the procedure but also end the outer transaction. As such, an unnamed COMMIT TRANSACTION statement at the end of the stored procedure SP_INSERT_ROW in either of the examples in this tip would have terminated both the inner and outer transactions and prevented the ROLLBACK at the end of the examples from undoing any work performed by making the INSERTs permanent prior to the ROLLBACK.

Understanding Transaction Logs

The transaction log, created automatically when you create a database, makes it possible for the DBMS to undo work performed by statements in a transaction and to restore the database to a consistent state after recovery from a hardware or software failure. At first, undoing changes or making large amounts of deleted data "reappear" may seem like magic. However, the DBMS does it not with smoke and mirrors, but by maintaining copies of the data before and after changes in its transaction log.

Although the actual specifics of transaction log maintenance vary among the DBMS products, the basic concept is the same for all of them. Whenever the DBMS executes an SQL statement that updates the database, the DBMS adds a record to the transaction log. Each record in the transaction log contains two copies of every row being changed, removed, or inserted. One copy shows the table row before the change, and one copy shows the table row after the change. Maintenance of the transaction log is such an important and integral part of statement execution that the DBMS updates the actual table data only after it has finished updating the log.

In addition to storing before and after pictures of the table rows, the DBMS also notes transaction events (BEGIN TRANSACTION, SAVE TRANSACTION, and COMMIT statement execution) in the transaction log. Using transaction event markers in conjunction with premodification images of rows being modified, the DBMS is able to back out the work of partially executed transactions or satisfy a user's request to do a ROLLBACK (undo) of database updates made after a certain point.

To do a ROLLBACK of an uncommitted transaction, for example, the DBMS simply works its way backward through the transaction log, replacing the current table data with the before update image of the data until it reaches a BEGIN TRANSACTION marker. Restoring the table rows to their state prior to changes made by statements in the transaction effectively backs out (undoes) all work performed by the transaction. On DBMS products that support savepoints, the DBMS can undo a portion of the work performed in a transaction by stopping at a user-specified savepoint marker vs. restoring original data up to the BEGIN TRANSACTION marker.

If the DBMS needs to restore the database to a consistent state due to a failure in executing a statement in a transaction or after the DBMS is restarted due to an abnormal system shutdown, the transaction log makes it possible. In both cases, the DBMS reads backward through the transaction log, undoing the work performed by uncommitted transactions and substituting the before image of each table row for the current table row until the DBMS reaches the BEGIN TRANSACTION marker.

When there are no updates that can be undone, as is the case when all transactions have been committed, the DBMS finds a COMMIT marker at the end of the transaction log. Thus, if the COMMIT marker is the first thing the DBMS finds when reading backward through the transaction log on startup or when it receives a ROLLBACK statement, it makes no changes to the database since there are no uncommitted changes it can (or needs to) undo.

  Note 

All updates noted in the transaction log are a part of a transaction. During an interactive session in auto-commit mode, the DBMS automatically writes a BEGIN TRANSACTION marker to the log at the start of statement execution, stores before and after pictures of each table row changed, and then automatically adds a COMMIT (end transaction) marker to the log after it makes the last change to the database. When it executes an explicit BEGIN TRANSACTION statement, the DBMS does not add the COMMIT marker to the transaction log until the application program interacting with the DBMS ends normally or until the program or user executes a COMMIT statement.

Категории