Inside Microsoft SQL Server 2005: The Storage Engine (Solid Quality Learning)
SQL Server 2005 allows existing tables to be modified in several ways. Using the ALTER TABLE command, you can make the following types of changes to an existing table:
Changing a Data Type
By using the ALTER COLUMN clause of ALTER TABLE, you can modify the data type or NULL property of an existing column. But be aware of the following restrictions:
Here's the syntax and an example of using the ALTER COLUMN clause of the ALTER TABLE statement: SYNTAX ALTER TABLE table-name ALTER COLUMN column-name { type_name [( prec [, scale] ) ] [COLLATE <collation name> ] [ NULL | NOT NULL ] | {ADD | DROP} {ROWGUIDCOL | PERSISTED} } EXAMPLE /* Change the length of the emp_lname column in the employee table from varchar(15) to varchar(30) */ ALTER TABLE employee ALTER COLUMN emp_name varchar(30)
Adding a New Column
You can add a new column, with or without specifying column-level constraints. You can add only one column for each ALTER TABLE statement. If the new column doesn't allow NULLs and isn't an identity column, the new column must have a default constraint defined. SQL Server populates the new column in every row with a NULL, the appropriate identity value, or the specified default. If the newly added column is nullable and has a default constraint, the existing rows of the table are not filled with the default value, but rather with NULL values. You can override this restriction by using the WITH VALUES clause so that the existing rows of the table are filled with the specified default value. Adding, Dropping, Disabling, or Enabling a Constraint
You can use ALTER TABLE to add, drop, enable, or disable a constraint. The trickiest part of using ALTER TABLE to manipulate constraints is that the word CHECK can be used in three different ways:
ALTER TABLE employee CHECK CONSTRAINT ALL The only types of constraints that can be disabled are CHECK constraints and FOREIGN KEY constraints, and disabling tells SQL Server not to validate new data as it is added or updated. You should use caution when disabling and re-enabling constraints. If a constraint was part of the table when the table was created or was added to the table using the WITH CHECK option, SQL Server knows the data conforms to the data integrity requirements of the constraint. The SQL Server optimizer can then take advantage of this knowledge in some cases. For example, if you have a constraint that requires col1 to be greater than 0, and then an application submits a query looking for all rows where col1 < 0, if the constraint has always been in effect, the optimizer will know that no rows can satisfy this query and the plan is a very simple plan. However, if the constraint has been disabled and re-enabled without using the WITH CHECK option, there is no guarantee that some of the data in the table won't meet the integrity requirements. You might not have any data less than or equal to 0, but the SQL Server optimizer will not know that when it is devising the plan. The catalog views sys.check_constraints and sys.foreign_keys each have a column called is_not_trusted. If you re-enable a constraint and don't use the WITH CHECK option to tell SQL Server to revalidate all existing data, the is_not_trusted column will be set to 1. Although you cannot use ALTER TABLE to disable or enable a PRIMARY KEY or UNIQUE constraint, you can use the ALTER INDEX command to disable the associated index. I'll discuss ALTER INDEX in Chapter 7. You can use ALTER TABLE to drop PRIMARY KEY and UNIQUE constraints, but you need to be aware that dropping one of these constraints automatically drops the associated index. In fact, the only way to drop those indexes is by altering the table to remove the constraint. Note
Dropping a Column
You can use ALTER TABLE to remove one or more columns from a table. However, you can't drop the following columns:
You can drop a column using the following syntax: ALTER TABLE table-name DROP COLUMN column-name [, next-column-name]...
Note
Enabling or Disabling a Trigger
You can enable or disable one or more (or all) triggers on a table using the ALTER TABLE command. Triggers are discussed in Inside Microsoft SQL Server 2005: T-SQL Programming. Internals of Altering Tables
Note that not all the ALTER TABLE variations require SQL Server to change every row when the ALTER TABLE is issued. SQL Server can carry out an ALTER TABLE command in three basic ways:
In many cases, SQL Server can just change the metadata (primarily the data seen through sys.columns) to reflect the new structure. In particular, the data isn't touched when a column is dropped, when a new column is added and NULL is assumed as the new value for all rows, when the length of a variable-length column is increased, or when a non-nullable column is changed to allow NULLs. The fact that data isn't touched when a column is dropped means that the disk space of the column is not reclaimed. You might have to reclaim the disk space of a dropped column when the row size of a table approaches or has exceeded its limit. You can reclaim space by creating a clustered index on the table or rebuilding an existing clustered index by using ALTER INDEX, as we'll see in Chapter 7. Some changes to a table's structure require that the data be examined but not modified. For example, when you change the nullability property to disallow NULLs, SQL Server must first make sure there are no NULLs in the existing rows. A variable-length column can be shortened when all the existing data is within the new limit, so the existing data must be checked. If any rows have data longer than the new limit specified in the ALTER TABLE, the command will fail. So you do need to be aware that for a huge table, this can take some time. Changing a fixed-length column to a shorter type, such as changing an int column to smallint or changing a char(10) to char(8), also requires examining all the data to verify that all the existing values can be stored in the new type. However, even though the new data type takes up fewer bytes, the rows on the physical pages are not modified. If you have created a table with an int column, which needs 4 bytes in each row, all rows will use the full 4 bytes. After altering the table to change the int to smallint, we are restricted in the range of data values we can insert, but the rows continue to use 4 bytes for each value, instead of the 2 bytes that smallint requires. You can verify this by using the DBCC PAGE command. Changing a char(10) to char(8) displays similar behavior, and the rows continue to use 10 bytes, but only 8 are allowed to be inserted until the table is rebuilt by creating or re-creating a clustered index. Other changes to a table's structure require SQL Server to physically change every row, and as it makes the changes, it has to write the appropriate records to the transaction log, so these changes can be extremely resource intensive for a large table. Another negative side effect in most cases is that when a column is altered to increase its length, the old column is not actually replaced. A new column is added to the table, and DBCC PAGE shows you that the old data is still there. I'll let you explore the page dumps for this situation on your own, but we can see some of this unexpected behavior by just looking at the column offsets using the column detail query that I showed you earlier in this chapter. First create a table with all fixed-length columns, including a smallint in the first position. CREATE TABLE change (col1 smallint, col2 char(10), col3 char(5)) Now look at the column offsets: SELECT c.name AS column_name, column_id, max_inrow_length, pc.system_type_id, leaf_offset FROM sys.system_internals_partition_columns pc JOIN sys.partitions p ON p.partition_id = pc.partition_id JOIN sys.columns c ON column_id = partition_column_id AND c.object_id = p.object_id WHERE p.object_id=object_id('fixed') RESULTS: column_name column_id max_inrow_length system_type_id leaf_offset ----------- ----------- ---------------- -------------- ----------- col1 1 2 52 4 col2 2 10 175 6 col3 3 5 175 16 Now change the smallint to int: ALTER TABLE change ALTER COLUMN col1 int
And finally, run the column detail query again to see that col1 now starts much later in the row and that no column starts at offset 4 immediately after the row header information. This new column creation due to an ALTER TABLE takes place even before any data has been placed in the table. column_name column_id max_inrow_length system_type_id leaf_offset ----------- ----------- ---------------- -------------- ----------- col1 1 4 56 21 col2 2 10 175 6 col3 3 5 175 16
Another drawback to SQL Server's behavior in not actually dropping the old column is that we are now more severely limited in the size of the row. The row size now includes the old column, which is no longer usable or visible (unless you use DBCC PAGE). For example, if I create a table with a couple of large fixed-length character columns, as shown here, I can then ALTER the char(2000) column to be char(3000). CREATE TABLE bigchange (col1 smallint, col2 char(2000), col3 char(1000)) ALTER TABLE bigchange ALTER COLUMN col2 char(3000)
At this point, the length of the rows should be just over 4,000 bytes because there is a 3,000-byte column, a 1,000-byte column, and a smallint. However, if I try to add another 3000-byte column, it will fail. ALTER TABLE bigchange ADD col4 char(3000) Msg 1701, Level 16, State 1, Line 1 Creating or altering table 'bigchange' failed because the minimum row size would be 9009, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes. However, if I just create a table with two 3,000-byte columns and a 1,000-byte column, there will be no problem. CREATE TABLE nochange (col1 smallint, col2 char(3000), col3 char(1000), col4 char(3000)) Note that there is no way to ALTER a table to rearrange the logical column order or to add a new column in a particular position in the table. A newly added column always gets the next highest column_id value. When you execute SELECT * on a table or look at the metadata with sp_help, the columns are always returned in column_id order. If you need a different order, you have several options:
You might think that SQL Server Management Studio can add a new column in a particular position or rearrange the column order, but this is not true. Behind the scenes, the tool is actually using the preceding third option and creating a completely new table with all new indexes, constraints, and triggers. If you wonder why simply adding a new column to an existing (large) table is taking a long time, this is probably the reason. |