Understanding DB2: Learning Visually with Examples (2nd Edition)

2017-07-07 02:10:07

7.8. Tables

A table is an unordered set of records, consisting of rows and columns. Each column has a defined data type, and each row represents an entry in the table. Figure 7.9 shows an example of a table with n rows and m columns. The sales_person column with a VARCHAR data type is the first column in the table, followed by the region column with a CHAR data type. The year column is the mth column in the table and has an INTEGER data type.

Figure 7.9. An example of a table

7.8.1. Table Classification

Tables in DB2 can be classified as illustrated in Figure 7.10. You will learn more about each of these tables in the next sections.

Figure 7.10. Classification of tables in DB2

7.8.2. System Catalog Tables

DB2 automatically creates system catalog tables when a database is created. They always reside in the SYSCATSPACE table space. System catalog tables contain information about all the database objects in the database. For example, when you create a table space, its information will be loaded into one or more system catalog tables. When this table space is referenced during a later operation, DB2 checks the corresponding system catalog tables to see whether the table space exists and whether the operation is allowed. Without the system catalog tables, DB2 will not be able to function.

Some of the information contained in system catalog tables includes the following:

Definitions of all database objects

Column data types of tables and views

Defined constraints

Object privileges

Object dependencies

System catalog tables or views use the SYSIBM, SYSCAT, or SYSSTAT schemas.

The SYSIBM schema is used for the base system catalog tables.

The SYSCAT schema is used for views defined on the system catalog tables. DB2 users should normally query the SYSCAT views rather than the SYSIBM tables for information.

The SYSSTAT schema is used for views containing information about database statistics and is also based on the system catalog tables.

Although you cannot update the tables and views residing under the SYSIBM and SYSCAT schemas, you can update the views under the SYSSTAT schema. Updating these views can sometimes influence the DB2 optimizer to choose a specific access path.

Refer to Appendix D, Using the DB2 System Catalog Tables, for details about the system catalog tables.

7.8.3. User Tables

User tables are used to store a user's data. A user can create, alter, drop, and manipulate user tables.

To create a user table, use the CREATE TABLE statement. You can specify the following:

The name of the table

The columns of the table and their data types

The table spaces where you want the table, index, and long objects to be stored within the database

The constraints you want DB2 to build and maintain on the table, such as referential constraints and unique constraints

The following example illustrates the creation of the table myemployees with four columns.

CREATE TABLE myemployees ( empID INT NOT NULL PRIMARY KEY, empname VARCHAR(30) NOT NULL, mngrID INT NOT NULL, history CLOB)

In which table space would the table myemployees be created? In cases where a table space is not specified, as in this example, follow the flow chart shown in Figure 7.11 to determine what table space would be used.

Figure 7.11. Guidelines for determining how the default table space is chosen for a table

This next example uses the same CREATE TABLE situation, but it indicates the table spaces to be used for the table data, index, and long objects.

CREATE TABLE myemployees ( empID INT NOT NULL PRIMARY KEY, empname VARCHAR(30) NOT NULL, mngrID INT NOT NULL, history CLOB) IN datadms INDEX IN indexdms LONG IN largedms

Use the IN clause to specify the table space where the table data will reside. Use the INDEX IN clause to specify where all indexes for the table will reside. Use the LONG IN clause to indicate where the LOB, LONG VARCHAR, or LONG VARGRAPHIC objects will reside.

NOTE

If different table spaces are used for the table, index, and long data, all of these table spaces must be DMS. In addition, the table space where the long data is to be stored must be defined as a large table space.

NOTE

Prior to Version 8, large table spaces were known as long table spaces. Though the syntax of the CREATE TABLESPACE statement uses the LARGE clause, the syntax of the CREATE TABLE statement still uses LONG.

Figure 7.12 shows the command used to create the table myemployees and also the corresponding table space commands to create the required table spaces. Note that the third statement creates a large table space.

Figure 7.12. Creating a table where table, index, and long data are stored in different DMS table spaces

You can also create a table based on the definition of another table, for example:

CREATE TABLE clone LIKE myemployees

The table clone will have the same definition as the table myemployees, however, other objects like constraints, indexes, or triggers associated to the table are not copied. Table data is not copied either.

Another alternative is to create the table structure based on the result of a query, as shown next:

CREATE TABLE clone2 AS (SELECT * FROM myemployees) DEFINITION ONLY

The DEFINITION ONLY clause is required so that only the structure of the table is copied; otherwise, you would be creating a materialized query table (MQT), which is described in section 7.8.10, Materialized Query Tables and Summary Tables.

Once you have created a table, you cannot change the column names or data types; however, you are allowed to increase the length of VARCHAR columns or add new columns to the end of the table. You can do this with the ALTER TABLE statement. For example, to add the column address to the table myemployees, use this statement:

ALTER TABLE myemployees ADD COLUMN address CHAR(45)

You cannot remove a column from a table using the ALTER TABLE statement. If you want to remove a column from a table, you have two choices:

Use a view to hide the column you want removed. (Views are discussed in section 7.11.)

Drop the table and recreate it.

To drop a table and all its contents, use the DROP TABLE statement, for example:

DROP TABLE myemployees

7.8.4. Default Values

In the CREATE TABLE statement, you can use the DEFAULT clause for a given column to provide a default value for the column. This means that when you use an INSERT statement to insert a row that does not provide a value for the column, the default value specified in the DEFAULT clause will be used. For example, let's say you create the table company with this statement:

CREATE TABLE company ( companyID INTEGER, companyName VARCHAR(30), city VARCHAR(20) DEFAULT 'TORONTO' )

Inserting a record with either of the following two statements provides the same result.

(1) INSERT INTO company (companyID, companyName, city)

VALUES ( 111 , 'cityOne' , DEFAULT)

(2) INSERT INTO company (companyID, companyName)

VALUES ( 111, 'cityOne' )

The following row would be inserted.

COMPANYID COMPANYNAME CITY ----------- ------------------------------ -------------------- 111 cityOne TORONTO

In the first INSERT statement, the DEFAULT keyword is used. In the second INSERT statement, the third column (city) is not included in the statement. In both cases, this means that the default value as defined in the table is inserted for that column.

What about the table columns that do not have a DEFAULT clause? What is inserted when test columns are omitted from the INSERT statement? In such scenarios, DB2 will insert a NULL, assuming the column accepts NULL values. If the column does not accept NULL values, you will receive an error. (We describe NULLs in the next section.) For example, the result of this statement:

INSERT INTO company (city) VALUES ('ATLANTA')

is:

COMPANYID COMPANYNAME CITY ----------- ------------------------------ -------------------- - - ATLANTA

The dash (-) represents a NULL value.

The columns of a table can also be defined with the DEFAULT keyword just by itself. In such a scenario, DB2 will use default values depending on the data type of the column. Typically, DB2 chooses a zero for numeric data types and a blank for character strings. For example, let's recreate the table company as follows:

CREATE TABLE company ( companyID INTEGER DEFAULT, companyName VARCHAR(30) DEFAULT, city VARCHAR(20) DEFAULT 'TORONTO' )

Issuing the following statement:

INSERT INTO company (city) VALUES (DEFAULT)

returns:

COMPANYID COMPANYNAME CITY ----------- ------------------------------ ------------ 0 TORONTO

This example shows that because the columns companyID and companyName are both defined with the DEFAULT clause just by itself, DB2 chose a default value of zero for column companyID, which is an INTEGER, and a blank for column companyName, which is a VARCHAR.

7.8.5. Using NULL Values

NULL values represent an unknown state. For example, let's review the contents of the table student, which contains NULL values.

NAME MARK -------------------- ----------- Peter 100 Mary 60 John - Raul 80 Tom -

John and Tom were sick the day of the exam, therefore the teacher put NULL values for their marks. This is different than giving them a mark of zero. If you issue this statement:

SELECT avg(mark) as average FROM student

The result is:

AVERAGE ----------- 80

Note that the average was calculated as follows: (100 + 60 + 80) / 3. The total number of students considered in the calculation was three, not five, because NULL values were not taken into consideration in the calculation.

Your business requirements dictate when NULL values are allowed in your columns. Let's review another example to illustrate when using NOT NULL is appropriate. The following statement creates a table that stores a company phone directory.

CREATE TABLE telephoneDirectory ( empID CHAR(3) NOT NULL PRIMARY KEY, phone_no VARCHAR(15) NOT NULL, deptname VARCHAR(20) NOT NULL DEFAULT 'Marketing', position VARCHAR(30) DEFAULT 'Clerk' )

In the example, let's assume the business requirements indicate that the column empID must uniquely identify a row. Thus, empID should be created as NOT NULL so that NULL values are not accepted; otherwise, several rows may have NULLs, which would not make the rows unique.

Next, the column phone_no is also defined as NOT NULL per the business requirements. If the purpose of this table is to store telephone numbers, it's understandable that this column does not accept NULLs.

The third column, deptname, is defined as NOT NULL with a DEFAULT value of Marketing. This means that a NULL value is not accepted, and when the column is omitted in an INSERT statement, the default value of Marketing is used. For example, if you issue this statement:

INSERT INTO telephoneDirectory (empID, phone_no) VALUES ('111', '905-123-4567')

The result is:

EMPID PHONE_NO DEPTNAME POSITION ----- --------------- -------------------- ------------------------ 111 905-123-4567 Marketing Clerk

The fourth column, position, allows NULL values and has a default value of Clerk. This case was explained in section 7.8.4, Default Values. The NOT NULL DEFAULT value clause works the same as the DEFAULT value clause only that NULL values are not allowed.

7.8.6. Identity Columns

An identity column is a numeric column in a table that automatically generates a unique numeric value in sequence for each row inserted. A unique identifier is often used in applications to identify a specific row. Unlike sequence objects, which we discuss in section 7.16, Sequences, identity columns are bound to the table they are defined on. There can be only one identity column per table. DB2 can generate the identity column values in two ways.

Generated always: The values are always generated by DB2. Applications are not allowed to provide an explicit value.

Generated by default: The values can be explicitly provided by an application; if no value is given, DB2 generates one. In this case, however, DB2 cannot guarantee the uniqueness of the value generated.

To create an identity column, use the CREATE TABLE statement with the GENERATED clause and make sure it contains the IDENTITY keyword because GENERATED can also be used to generate other values automatically that are not identity columns. Here is an example.

CREATE TABLE product ( productno INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 200 INCREMENT BY 1), description VARCHAR(50) )

The column productno is an INTEGER defined as an identity column that is always generated. The value generated will start from 200, and it will be incremented by 1. Let's perform a few INSERT statements and see the results obtained.

INSERT INTO product VALUES (DEFAULT,'banana'); --->inserts 200,bananaINSERT INTO product (description) VALUES ('apple'); --->inserts 201,appleINSERT INTO product VALUES (300,'pear'); --->error SQL0798N

COMMIT;

INSERT INTO product (description) VALUES ('orange'); --->inserts 202,orange

ROLLBACK;

INSERT INTO product (description) VALUES ('plum'); --->inserts 203,plum

COMMIT;

The following query shows the final result.

SELECT * FROM product; PRODUCTNO DESCRIPTION ----------- ------------ 200 banana 201 apple 203 plum

The first two INSERT statements show that two identity column values were generated: 200 and 201. The third INSERT statement returns an error because you cannot explicitly insert a value for an identity column generated as ALWAYS. After the third INSERT statement, we issue a COMMIT to guarantee these rows are stored in the database. The fourth INSERT statement causes another identity column value, 202, to be generated; however, we issue a ROLLBACK statement right after, so this row is not stored in the database. Note that the final INSERT statement, which inserts the product plum, generates a value of 203, not 202. (COMMIT and ROLLBACK statements are explained in more detail in Chapter 13, Developing Database Backup and Recovery Solutions.)

NOTE

An identity column value is generated only once. Once the value has been generated, even if a ROLLBACK statement is performed, it will not be generated again.

Now let's review another example, this time creating the same table product with the GENERATED BY DEFAULT clause.

CREATE TABLE product ( productno INTEGER GENERATED BY DEFAULT AS IDENTITY (START WITH 200 INCREMENT BY 1), description VARCHAR(50) )

Next, we insert a few rows.

INSERT INTO product VALUES (DEFAULT,'banana'); --->inserts 200,bananaINSERT INTO product (description) VALUES ('apple'); --->inserts 201,apple

INSERT INTO product VALUES (300,'pear'); --->inserts 300,pear

INSERT INTO product VALUES (201,'orange'); --->inserts 201,orange

COMMIT;

INSERT INTO product (description) VALUES ('papaya'); --->inserts 202,papayaROLLBACK;

INSERT INTO product (description) VALUES ('plum'); --->inserts 203,plum

COMMIT;

The following query shows the final result.

SELECT * FROM product PRODUCTNO DESCRIPTION ----------- --------------------- 200 banana 201 apple 300 pear 201 orange 203 plum

The first two INSERT statements show that two identity column values were generated: 200 and 201. For the third and fourth INSERT statements, we explicitly provided the values 300 and 201, respectively, for the identity column. Note that DB2 did not return an error as in the previous example because we defined the identity column as GENERATED BY DEFAULT. After the fourth INSERT statement, we issue a COMMIT to guarantee these rows are stored in the database. The fifth INSERT statement causes another identity column value, 202, to be generated; however, we issue a ROLLBACK statement right after, so this row is not stored in the database. Note that the final INSERT statement, which inserts the product plum, generates a value of 203, not 202.

The following final example illustrates a GENERATED value, which is not an identity column. The example uses GENERATED ALWAYS, but you can also use GENERATED BY DEFAULT.

CREATE TABLE income ( empno INTEGER, salary INTEGER, taxRate DECIMAL(5,2), netSalary DECIMAL(7,2) GENERATED ALWAYS AS (salary * (1 - taxRate)) )

If you insert the following row:

INSERT INTO income (empno, salary, taxRate) VALUES (111, 50000, 0.3)

The result is:

EMPNO SALARY TAXRATE NETSALARY ----------- ----------- ------- --------- 111 50000 0.30 35000.00

DB2 generates the value of the last column NETSALARY based on the SALARY and TAXRATE columns.

7.8.7. Constraints

Constraints allow you to create rules for the data in your tables. You can define four types of constraints on a table.

A unique constraint ensures that no duplicate key values can be entered in the table.

A referential constraint ensures that a value in one table must have a corresponding entry in a related table.

A check constraint ensures that the values you enter into the column are within the rules specified when the table was defined.

An informational constraint allows you to enforce or not enforce a constraint.

These constraints are discussed further in the following sections.

7.8.7.1 Unique Constraints

A unique constraint indicates that the values for a given column must all be unique. A unique constraint is defined in the CREATE TABLE or ALTER TABLE statements using the UNIQUE clause or the PRIMARY KEY clause. A primary key, as you will see in the next section, is also a unique constraint.

All the columns that make up a unique constraint must be defined as NOT NULL. For the following example, the column empID must be defined as NOT NULL because it is the primary key. The column deptID must also be defined as NOT NULL because it is a unique constraint.

CREATE TABLE employ ( empID INT NOT NULL PRIMARY KEY, name CHAR(30) , deptID INT NOT NULL UNIQUE )

Now, let's perform a few INSERT statements in sequence.

INSERT INTO employ VALUES (111, 'Peter', 999) ---> inserts 111, Peter, 999

INSERT INTO employ VALUES (111, 'Peter', 123) ---> SQL0803N error, duplicate primary key 111

INSERT INTO employ VALUES (789, 'Peter', 999) ---> SQL0803N error, duplicate unique key 999

This example illustrates that an error (SQL0803N) occurs if the value you attempt to insert for a unique or primary key column is not unique (it already exists in the table).

Unique constraints are implemented using unique indexes. When a CREATE TABLE statement has the UNIQUE or PRIMARY KEY keywords, DB2 automatically creates a corresponding unique index. The name of this system-generated index starts with "SQL" followed by a timestamp. For the example just shown, two unique indexes were generated with these names:

SQL040422135806320 SQL040422135806460

Both indexes were created on April 22, 2004, at 1:58 p.m.

Though you would normally not refer to an index name directly in an application, a good index name may be helpful when analyzing an explain output. An explain output, as you will see in Chapter 16, Database Performance Considerations, displays the access path DB2 chooses to access your data for a given query. Therefore, rather than letting DB2 generate system names for your indexes, we recommend using the ALTER TABLE statement in the case of primary key columns and the CONSTRAINT clause to explicitly give names to the indexes. For example, let's rewrite the CREATE TABLE statement used in the previous example as follows:

CREATE TABLE employ ( empID INT NOT NULL, name CHAR(30) , deptID INT NOT NULL CONSTRAINT unique_dept_const UNIQUE ) ALTER TABLE employ ADD CONSTRAINT employ_pk PRIMARY KEY (empID)

In this example, we removed the PRIMARY KEY clause of the CREATE TABLE statement and added an ALTER TABLE statement. The ALTER TABLE statement allowed us to put in a name for the constraint (employ_pk), which also becomes the name of the corresponding unique index.

Instead of the ALTER TABLE statement, you can also use the following two statements with the same result:

CREATE UNIQUE INDEX employ_pk ON employ (empID) ALTER TABLE employ ADD PRIMARY KEY (empID)

In this case, the CREATE UNIQUE statement explicitly creates the unique index and specifies the desired name for the index. Next, the ALTER TABLE statement indicates that the same column used for the unique index is also used as the primary key. After executing the ALTER TABLE statement, you will receive this warning message:

SQL0598W Existing index "EMPLOY_PK" is used as the index for the primary key or a unique key. SQLSTATE=01550

This warning is acceptable because this is in fact what is desired.

In the previous CREATE TABLE statement, we also added a unique constraint using the clause CONSTRAINT unique_dept_const UNIQUE. With this clause, DB2 generates a corresponding unique index with the name unique_dept_const.

You can also use the ALTER TABLE statement to add a unique constraint, as shown in this example:

ALTER TABLE employ ADD CONSTRAINT unique_dept_const UNIQUE (deptID)

7.8.7.2 Referential Constraints

Referential constraints are used to support referential integrity. Referential integrity allows your database to manage relationships between tables.

7.8.7.2.1 Using Primary, Unique, and Foreign Keys to Establish Referential Integrity

Referential integrity can be better explained with examples. Assume you have two tables, as illustrated in Figure 7.13.

Figure 7.13. Referential integrity between two tables

The figure shows the tables country and city, where country is the parent table containing information about all the countries in the world, and city is the dependent table containing information about a particular city for a given country. Note that the column country_ID and the column country_no are used to establish a relationship between the two tables.

The country_ID column is a primary key column. A primary key consists of one or more columns; it is a special case of a unique constraint. While there can be many unique constraints in a table, there can be only one primary key. A primary key is used to establish a referential integrity relationship with another table.

The country_no column, known as the foreign key column, will reference the primary key column of the parent table. Because of this relationship, the country_no column cannot have a value that does not exist in the country_ID column. The data type for this column must be compatible with the primary key column of the parent table. For the example illustrated in Figure 7.13, if the parent key column is defined as type INTEGER, the foreign key column can be defined as type DECIMAL because it is a numeric data type for which conversion is allowed; however, it cannot be defined as type CHAR. Other than this restriction, the foreign key can be treated like any other column. It can use the NOT NULL, UNIQUE, and even PRIMARY KEY clauses.

To establish the referential integrity relationship between the two tables, let's look at the corresponding CREATE TABLE statements for both tables.

CREATE TABLE country ( country_ID INT NOT NULL PRIMARY KEY, country_Name VARCHAR(30) NOT NULL, continent_Name CHAR(15) ) CREATE TABLE city ( city_ID INT NOT NULL PRIMARY KEY, city_name VARCHAR(30) NOT NULL, country_no INT REFERENCES country, population INT )

Note that the CREATE TABLE statement for the city table includes the REFERENCES clause and that it does not need to specify any column of the parent table country. DB2 will automatically look for the primary key column of the parent table to establish the relationship.

What if there is no primary key column for the parent table but a unique constraint instead? What if the parent table contains more than one unique constraint? In such cases, use the REFERENCES clause followed by the correct column name(s). For example, let's say we actually created the country table in Figure 7.13 with no primary key but two unique constraints, as follows.

CREATE TABLE country ( country_ID INT NOT NULL UNIQUE, country_Name VARCHAR(30) NOT NULL, continent_Name CHAR(15) NOT NULL UNIQUE )

To establish referential integrity using the column country_ID, this column must be specified in the CREATE TABLE statement for the city table, as shown below.

CREATE TABLE city ( city_ID INT NOT NULL PRIMARY KEY, city_name VARCHAR(30) NOT NULL, country_no INT REFERENCES country(country_ID), population INT )

NOTE

A unique constraint on a column that has been defined as NOT NULL can also be referenced by a foreign key clause because a primary key is basically the same as a unique constraint.

You can also use the ALTER TABLE statement to add a foreign key, for example:

ALTER TABLE city ADD FOREIGN KEY (country_no) REFERENCES country (country_ID)

This statement would add to the table city the foreign key using column country_no, which would reference column country_ID in table country.

NOTE

Using primary keys, unique keys, and foreign keys is one method to implement referential integrity. Another method is to use triggers. By using triggers, you can code your own logic that may differ from the rules described in this section. Triggers are discussed in section 7.13.

7.8.7.2.2 Referential Integrity Implications on SQL Operations

The enforcement of referential integrity has implications on INSERT, UPDATE, and DELETE operations, which must follow certain rules. To explain these rules, let's look at the following example using Figure 7.14.

Figure 7.14. An example to illustrate SQL operations under referential integrity

We used the first two tables, country and city, in previous examples. In this particular example, we have inserted a few records in each of these tables. A new table, district, which is dependent on table city, is also illustrated. Here is the CREATE TABLE statement for the table district.

CREATE TABLE district ( district_ID INT NOT NULL PRIMARY KEY, district_name VARCHAR(30) NOT NULL, city_no INT REFERENCES city, registrations INT )

The following cases are examined.

Inserting to a Parent Table

What would happen if the following record were inserted in table country?

INSERT INTO country VALUES (2,'Spain',4)

Because country is the parent table at the top of Figure 7.14, any value can be inserted into this table without a need to worry about the dependent tables.

Inserting to a Dependent Table

What would happen if the following record were inserted in table city?

INSERT INTO city VALUES (44,'Vancouver',3,4000000)

Table city is dependent on table country based on column country_no (the third column in the city table). This INSERT statement is trying to insert a record with a value of 3 for the country_no column. From Figure 7.14 you can see this value is not present in table country; therefore, this record cannot be inserted and an error would be returned.

Deleting a Row from the Parent Table

What would happen if the following record were deleted from table country?

DELETE FROM country WHERE country_name = 'Canada'

This DELETE statement would fail with an error SQL0532N because there are related dependent rows. This is the default behavior, also called the NO ACTION delete rule.

You can specify DELETE rules in the CREATE TABLE statement of the dependent table. In addition to NO ACTION, you can use the following rules.

RESTRICT: The behavior is exactly the same as the NO ACTION rule. The difference is when this constraint is enforced. For more details, refer to the DB2 UDB SQL Reference manual.

CASCADE: All the dependent rows will be deleted when the parent table row is deleted.

SET NULL: All the dependent rows will have the value of the foreign key column set to NULL, if NULLs are allowed; otherwise, an error is returned. All the other columns remain unchanged.

For example, let's say we actually created the table city as follows.

CREATE TABLE city ( city_ID INT NOT NULL PRIMARY KEY, city_name VARCHAR(30) NOT NULL, country_no INT REFERENCES country(country_ID) ON DELETE CASCADE, population INT )

Note that we added the clause ON DELETE CASCADE to the foreign key column country_no.

If we execute the following statement again, will it work this time?

DELETE FROM country WHERE country_name = 'Canada'

The answer is no. Though we defined the CASCADE rule correctly in the city table, we did not define it in the district table. All dependent tables need to be defined using CASCADE if you want all the dependent rows to be deleted. In this example, if we had defined the district table correctly, all the rows of all the tables would have been deleted.

Deleting a Row from a Dependent Table

You can delete a row from a dependent table with no implications unless the dependent table is the parent table of another table.

Updating a Row from the Parent Table

You cannot update the primary key of the parent table. To ensure you don't duplicate an existing value, DB2 does not allow this operation.

Updating a Row from a Dependent Table

You can update the foreign key of a dependent table only if the new value already exists in the parent table and the foreign key is defined as NOT NULL. This is the default behavior, which corresponds to the NO ACTION update rule.

For example, issuing this statement:

UPDATE city SET country_no = 7 WHERE city_name = 'Montreal'

would return error SQL0530N, which indicates the value of 7 does not exist in the parent table.

The other UPDATE rule possible is RESTRICT, which behaves similarly to the NO ACTION rule. The difference is when the rule enforcement takes place. For details about this rule, please review the DB2 UDB SQL Reference manual.

You can specify UPDATE rules on the CREATE TABLE statement of a dependent table. For example, we could have created the city table as follows (on top of the DELETE rules of the previous example).

CREATE TABLE city ( city_ID INT NOT NULL PRIMARY KEY, city_name VARCHAR(30) NOT NULL, country_no INT REFERENCES country(country_ID) ON DELETE CASCADE ON UPDATE RESTRICT, population INT )

7.8.7.3 Check Constraints

Check constraints are used to enforce data integrity at the table level. Once the check constraint is defined, every INSERT or UPDATE operation must satisfy the constraint; otherwise, you will receive an error. For example, let's create the table student.

CREATE TABLE student ( student_ID INT NOT NULL PRIMARY KEY, name VARCHAR(30) NOT NULL, sex CHAR(1) NOT NULL CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F ')) )

This table has the check constraint sex_check_const defined which verifies that the column sex has the values of M or F. Now let's attempt the following statement.

INSERT INTO student VALUES (1, 'Tom', 'Z')

We will receive an error SQL0545N because the value Z does not satisfy the check constraint.

You can also add a check constraint with the ALTER TABLE statement, as shown here.

ALTER TABLE student ADD CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F '))

If you are adding a check constraint with the ALTER TABLE statement to a table that already has data, DB2 will check the entire table to make sure the existing data satisfies the check constraint. If it doesn't, the ALTER TABLE statement will fail with error SQL0544N.

If you do not want DB2 to check the table when a check constraint is added, you can use the SET INTEGRITY statement. This statement turns off check constraint and referential constraint checking. For example, let's say we create the student table without a check constraint and insert some rows that will later be invalid for the check constraint.

CREATE TABLE student ( student_ID INT NOT NULL PRIMARY KEY, name VARCHAR(30) NOT NULL, sex CHAR(1) NOT NULL ) INSERT INTO student VALUES (1, 'Tom', 'Z') INSERT INTO student VALUES (2, 'Mary', 'A')

Now we attempt to add the following check constraint.

ALTER TABLE student ADD CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F '))

You receive error SQL0544N, as indicated earlier. Thus, use the SET INTEGRITY command to turn off constraint checking so that you can add the constraint.

SET INTEGRITY FOR student OFF

At this point, the student table is put in CHECK PENDING state, a state that allows only a few operations on the table, like ALTER TABLE. Other operations such as SELECT, INSERT, UPDATE, and DELETE are disallowed.

After turning off constraint checking, you can repeat the ALTER TABLE statement, which this time should be successful. Use the SET INTEGRITY statement again to turn constraint checking on as follows:

SET INTEGRITY FOR student CHECK IMMEDIATE UNCHECKED

The IMMEDIATE UNCHECKED option turns on check constraints again but does not check the existing table data. Alternatively, you can also issue:

SET INTEGRITY FOR student IMMEDIATE CHECKED

In this case, the IMMEDIATE CHECKED option turns on check constraints again and also checks the existing table data. If a violation is encountered, the table will remain in CHECK PENDING state. The SET INTEGRITY statement has an option to move the violating records to an exception table.

SET INTEGRITY FOR student IMMEDIATE CHECKED FOR EXCEPTION IN student USE my_exception_table

The name of the exception table in this example is my_exception_table. This table must exist with at least the same columns as the original source table, in this case, the student table. After this SET INTEGRITY statement is executed, the violating rows would be moved to the exception table, and the CHECK PENDING status would be removed. For more details about the SET INTEGRITY statement, refer to the DB2 UDB SQL Reference manual.

7.8.7.4 Informational Constraints

Prior to Version 8.1, DB2 always enforced constraints once you defined them. Though you can turn constraint checking off with the SET INTEGRITY statement, this is mainly used to perform table alterations to add new constraints to existing tables, as you saw in the previous section. Using the SET INTEGRITY statement puts your table in CHECK PENDING status, which prevents you from performing many operations on your table.

What if your application already performs constraint checking, and thus there is no need for DB2 to check the data again? For example, large applications such as SAP, PeopleSoft, and Siebel are written to check the constraints before they insert the data into DB2. In this case, defining the constraint in DB2 would cause extra overhead if DB2 is also enforcing the rule and revalidating the constraint. However, if you do not define these constraints, the DB2 optimizer cannot use them to its advantage in choosing the most optimal access plans. (Chapter 16, Database Performance Considerations, explains the DB2 optimizer in more detail.)

With Version 8.1, informational constraints were introduced. Informational constraints allow you to specify whether or not DB2 should enforce the constraint and whether or not it can be used by the optimizer to choose the best access plan for the application statements.

The default operation when you create a constraint is that it is always enforced and can be used by the optimizer. You can change this default behavior by using informational constraints, which are implemented by using the following clauses of the CREATE TABLE statement.

ENFORCED: This is the default option. Use this clause if you want DB2 to check the constraints for every operation on the table.

NOT ENFORCED: Use this clause if you do not want DB2 to check the constraints for every operation on the table.

ENABLE QUERY OPTIMIZATION: Use this clause so that DB2 can use the knowledge of the constraint when building the plan for accessing the table or referenced tables.

DISABLE QUERY OPTIMIZATION: Use this clause if you want the DB2 optimizer to ignore the constraints defined on your table.

The following example illustrates how informational constraints work.

CREATE TABLE student ( student_ID INT NOT NULL PRIMARY KEY, name VARCHAR(30) NOT NULL, sex CHAR(1) NOT NULL CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F ')) NOT ENFORCED ENABLE QUERY OPTIMIZATION )

Note that the constraint for table student will not be enforced, but the constraint is used for query optimization. Now let's perform the following statements.

(1) INSERT INTO student VALUES (5, 'John', 'T')

(2) SELECT * FROM student WHERE sex = 'T'

The first statement executes successfullya T can be inserted for the sex column because the constraint sex_check_const is not enforced.

The second statement returns zero records because query optimization is enabled. Therefore, the optimizer does not scan the table but checks the constraints defined for the sex column in the DB2 catalog tables and assumes it has only values of M or F, quickly returning a result of zero records. Of course, this result is incorrect. If you want to obtain the correct result, disable query optimization. You can do this with the ALTER TABLE statement:

ALTER TABLE student ALTER CHECK sex_check_const DISABLE QUERY OPTIMIZATION

If you perform the second statement again, this time you should get one record.

SELECT * FROM student WHERE sex = 'T' STUDENT_ID NAME SEX ----------- ------------------------------ --- 5 John T

NOTE

After issuing the ALTER TABLE statement to enable or disable query optimization, make sure to issue a terminate command if working from the CLP so the change will take effect.

NOTE

Use informational constraints only if you are certain the data to be inserted or updated has been correctly checked by your application. Normally you want to use the options NOT ENFORCED and ENABLE QUERY OPTIMIZATION together because you want DB2 to reduce overhead by not performing constraint checking, but having the DB2 optimizer take into account the constraint definition.

7.8.8. Not Logged Initially Tables

The NOT LOGGED INITIALLY clause of the CREATE TABLE statement allows you to create a table that will not be logged when an INSERT, UPDATE, DELETE, CREATE INDEX, ALTER TABLE, or DROP INDEX operation is performed in the same unit of work in which the CREATE TABLE statement was issued. For example, let's say you execute the following statements in a script.

CREATE TABLE products ( productID INT, product_Name VARCHAR(30) ) NOT LOGGED INITIALLY; INSERT INTO products VALUES (1,'door'); INSERT INTO products VALUES (2,'window'); ... INSERT INTO products VALUES (999999,'telephone'); COMMIT; INSERT INTO products VALUES (1000000,'television'); UPDATE products SET product_name = 'radio' where productID = 3456; ALTER TABLE products ACTIVATE NOT LOGGED INITIALLY INSERT INTO products VALUES (1000001,'desk'); INSERT INTO products VALUES (1000002,'table'); ... INSERT INTO products VALUES (1999999,'chair'); COMMIT;

Any operation from the CREATE TABLE statement until the first COMMIT is not logged. Once the COMMIT is issued, any subsequent operation is logged. For this example, the INSERT and UPDATE statements after the first COMMIT are logged.

After creating the table as NOT LOGGED INITIALLY, if you would like to turn off logging temporarily again, you can use the ALTER TABLE statement with the ACTIVATE NOT LOGGED INITIALLY clause, as shown in the example. Any operations between the ALTER TABLE and the second COMMIT are not logged.

NOTE

You can use the statement ALTER TABLE table_name ACTIVATE NOT LOGGED INITIALLY only for tables that were originally created with the NOT LOGGED INITALLY clause.

You can also use the WITH EMPTY TABLE clause as part of the ALTER TABLE table_name ACTIVATE NOT LOGGED INITIALLY statement to remove all the data of the table. This method is faster than using a DELETE FROM table_name statement. For example, to remove all the rows of the table products, issue:

ALTER TABLE products ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE

7.8.9. Table Compression

You can compress tables to a certain extent by using the VALUE COMPRESSION clause of the CREATE TABLE statement. This clause tells DB2 that it can use a different internal format for the table rows so they occupy less space. In a sense, this clause turns on compression for the table; however, you need to specify another clause, COMPRESS SYSTEM DEFAULT, for each column that you want to compress. Only the columns whose values are normally NULL or the system default value of 0 can be compressed. Also, the data type must not be DATE, TIME, or TIMESTAMP. If the data type is a varying-length string, this clause is ignored. Here's an example:

CREATE TABLE company ( company_ID INTEGER NOT NULL PRIMARY KEY, name CHAR(10), address VARCHAR(30) COMPRESS SYSTEM DEFAULT, no_employees INTEGER NOT NULL COMPRESS SYSTEM DEFAULT ) VALUE COMPRESSION

The column address would be ignored since it's a VARCHAR column, and the column no_employees would be compressed. Table compression saves space especially for tables used in data warehousing applications where many rows contain NULLs or the system default value of 0. However, UPDATE operations may be impacted when changing to a different value than the default of 0 because the compressed value would first have to be expanded and then updated.

For an existing table containing data, you can enable table compression using the ALTER TABLE statement, as shown in this example.

ALTER TABLE city ACTIVATE VALUE COMPRESSION ALTER TABLE city ALTER COLUMN population COMPRESS SYSTEM DEFAULT

In this example, we enable compression by using the first statement, and then we specify which column to compress by using the second statement. In addition, if the table city were populated, the REORG utility would have to be executed on the table for the compression to take effect on the existing rows. Chapter 12, Maintaining Data, discusses the REORG utility in more detail.

7.8.10. Materialized Query Tables and Summary Tables

Materialized query tables (MQTs) allow users to create tables with data based on the results of a query. The DB2 optimizer can later use these tables to determine whether a query can best be served by accessing an MQT instead of the base tables. Here is an example of an MQT:

CREATE SUMMARY TABLE my_summary AS (SELECT city_name, population FROM country A, city B WHERE A.country_id = B.country_no) DATA INITIALLY DEFERRED REFRESH DEFERRED

The SUMMARY keyword is optional. The DATA INITIALLY DEFERRED clause indicates that DB2 will not immediately populate the my_summary MQT table after creation, but following the REFRESH TABLE statement:

REFRESH TABLE my_summary

The REFRESH DEFERRED clause in the CREATE SUMMARY TABLE statement indicates that the data in the table is refreshed only when you explicitly issue a REFRESH TABLE statement. Alternatively, you can create the MQT with the REFRESH IMMEDIATE clause, which means DB2 immediately refreshes the data when the base tables are changed.

DB2 checks the registry variable CURRENT REFRESH AGE to determine whether or not the MQT contains up-to-date information. This registry can have a value from 0 up to 99999999999999 (9,999 years, 99 months, 99 days, 99 hours, 99 minutes, and 99 seconds), which indicates the maximum duration the DB2 optimizer can wait since the last REFRESH TABLE statement was issued on an MQT to consider MQT tables in its calculations. For example, if an MQT were refreshed today, and the CURRENT REFRESH AGE has a value of 5 days, the DB2 optimizer can consider the MQT for its calculations for the next 5 days. If the value of this register is 0, only the tables created with the REFRESH IMMEDIATE clause can be used for optimization.

Prior to Version 8, MQTs were known as automatic summary tables (ASTs). With Version 8, ASTs are considered a special case of MQTs whose fullselect contains a GROUP BY clause summarizing data from the tables referenced in the fullselect.

7.8.11. Temporary Tables

Temporary tables can be classified as system or user tables. DB2 manages system temporary tables in the system temporary table space. DB2 creates and drops these tables automatically. Since users don't have control over system temporary tables, we don't discuss them any further in this section.

You create user temporary tables inside a user temporary table space. For example, the following statement creates a user temporary table space called usrtmp4k.

CREATE USER TEMPORARY TABLESPACE usrtmp4k MANAGED BY SYSTEM USING ('C:\usrtmp')

User temporary tables, referred to as temporary tables from here on, store temporary data, that is, data that will be destroyed after a session or when a connection ends. Temporary tables are typically used in situations where you need to compute a large result set from an operation, and you need to store the result set temporarily to continue with further processing.

Though transaction logging is allowed with temporary tables, most users don't need to log temporary data. In fact, not having transaction logging for this type of table improves performance.

Temporary tables exist only for one connection; therefore, there are no concurrency or locking issues.

To create a temporary table, use the DECLARE statement. Here's an example.

DECLARE GLOBAL TEMPORARY TABLE temp_table1 (col1 int, col2 int) ON COMMIT PRESERVE ROWS NOT LOGGED IN usrtmp4k

Table temp_table1 is created in usrtmp4k, the user temporary table space we created earlier.

DB2 uses the schema session for all temporary tables regardless of the user ID connected to the database. After you create a temporary table, you can access it just like any regular table. The following statement inserts a row into table temp_table1.

INSERT INTO session.temp_table1 (1,2)

The following statement selects all the rows in table temp_table1:

SELECT * FROM session.temp_table1

You can drop and alter temporary tables, but you cannot create views or triggers against them. Indexes are allowed.

NOTE

When working with temporary tables, make sure to explicitly specify the schema session. If you work with objects without specifying the schema, DB2 defaults to the authorization ID or connection ID.

7.8. Tables

Figure 7.9. An example of a table

7.8.1. Table Classification

Figure 7.10. Classification of tables in DB2

7.8.2. System Catalog Tables

7.8.3. User Tables

Figure 7.11. Guidelines for determining how the default table space is chosen for a table

Figure 7.12. Creating a table where table, index, and long data are stored in different DMS table spaces

7.8.4. Default Values

7.8.5. Using NULL Values

7.8.6. Identity Columns

7.8.7. Constraints

7.8.7.1 Unique Constraints

7.8.7.2 Referential Constraints

7.8.7.2.1 Using Primary, Unique, and Foreign Keys to Establish Referential Integrity

Figure 7.13. Referential integrity between two tables

7.8.7.2.2 Referential Integrity Implications on SQL Operations

Figure 7.14. An example to illustrate SQL operations under referential integrity

Inserting to a Parent Table

Inserting to a Dependent Table

Deleting a Row from the Parent Table

Deleting a Row from a Dependent Table

Updating a Row from the Parent Table

Updating a Row from a Dependent Table

7.8.7.3 Check Constraints

7.8.7.4 Informational Constraints

7.8.8. Not Logged Initially Tables

7.8.9. Table Compression

7.8.10. Materialized Query Tables and Summary Tables

7.8.11. Temporary Tables

Категории