Microsoft SQL Server 7.0 System Administration Training Kit
When you create a database, you set up the data storage structure. This structure includes at least one data file and one transaction log file. Before creating a database, it is important to understand two concepts: how SQL Server 7 stores data and the function of the transaction log file.
After this lesson, you will be able to
- Describe how data is stored in databases and transaction logs.
- Evaluate database storage considerations
Estimated lesson time: 15 minutes
How Data Is Stored
An understanding of how SQL Server stores data will give you insight into capacity planning, data integrity, and performance. Figure 5.1 shows how a database allocates space.
Figure 5.1 Database storage allocation
SQL Server Database Files
SQL Server 7 introduces a new architecture for databases. Database devices, used in previous versions, are no longer needed as storage units for a database. Instead, databases are stored in their own files. The following section introduces these files, which make up a database.
Database File Types
There are three types of database files:
- Primary data files—As its name indicates, this file is key. Every database must have one, but only one, primary data file. Primary data filenames usually have an .MDF extension.
- Secondary data files—These files can hold all data and objects that are not on the primary data file. Secondary data files are not required. You can choose not to use one, or you can create multiple secondary files. Secondary data filenames usually have an .NDF extension.
- Log files—These files hold all of the transaction log information required to recover the database. Every database has at least one log file. Log filenames usually have an .LDF extension.
Database File Considerations
Before creating a database, you must consider a number of issues. To effectively manage resources that the database will use, you must understand how SQL Server stores information. The following list summarizes how SQL Server allocates space to a database and database objects.
- When you create a database, a copy of the model database which includes the system tables, is copied into the database. The minimum size of a database, therefore, is equal to or greater than the size of the model database.
- Data is stored in 8-KB blocks of contiguous disk space. These blocks are called pages. A database can store 128 pages per megabyte (MB).
- Rows cannot span pages. The maximum amount of data that can possibly be stored in a single row is 8060 bytes. The maximum amount of space that can be used by all rows on a page is 8094 bytes. Each row has some overhead associated with it, which is why the maximum row length is smaller than the space available on each page to store rows.
NOTE
SQL Server 7 differs significantly from SQL Server 6.5 with regard to data storage. SQL Server 6.5 uses 2-KB pages, thereby limiting a single row to 1962 bytes. Many other capacities have been increased in SQL Server 7; for more information, search for "maximum capacity specifications" in Books Online.
- Tables, database objects, and indexes are stored in extents. An extent is 8 contiguous pages or 64 KB in size; therefore, a database has 16 extents per megabyte. When as many as eight small objects share an extent, it is called a mixed extent. Once a table grows to 8 pages, it uses its own uniform extent.
- Transaction log files hold all of the information necessary for recovery of the database in case of a system failure. By default, the size of the transaction log is 25 percent of the size of the data files. When planning a database, use this ratio as a guide and adjust it according to the needs of your application.
How the Transaction Log Works
The transaction log records data modifications—INSERT, UPDATE, and DELETE statements—as they are executed.
The logging process, shown in Figure 5.2, occurs as follows:
- The application sends a data modification.
- The affected data pages are loaded from the data file into memory (called the data cache) if they are not already in the data cache from a previous query.
- Each data modification statement is recorded as it is made in what is called a write-ahead log. The change is always recorded in the log and written to the log file before that change is made in the database.
- Once the data pages reside in the data cache and the log pages are recorded on the disk in the transaction log file, the checkpoint process writes all committed transactions to the database on the disk.
A single transaction can include many data modifications. Each transaction starts with a BEGIN TRANSACTION marker. If the application completes all data modifications successfully, the transaction ends with a COMMIT TRANSACTION marker. Such a transaction is referred to as a committed transaction.
Figure 5.2 How the transaction log works
During normal operation, the checkpoint process routinely checks for committed transactions that have not been written to the data file. The checkpoint process writes the modifications to the data file and checkpoints the transaction to indicate that it has been written to the data file.
If the system fails, the automatic recovery process begins when the SQL Server is restarted. This process uses the transaction log to roll forward (apply the modifications to the data) all committed transactions that have not been checkpointed and roll back (remove) any incomplete transactions.
Transaction logging in SQL Server is not optional. You cannot turn it off, and all normal data modifications must go through the transaction log. For this reason, it is important that the transaction log never become full, as this will prevent data modification in the database. SQL Server does, however, allow two methods of bulk data loading that bypass transaction logging: the bulk copy program (bcp) and the SELECT INTO statement.
CAUTION
Under most circumstances if the hard disk of the computer has a disk-caching controller, you should disable it. Unless a write-caching disk controller is designed specifically for a database, it can seriously harm SQL Server data integrity. Check with your vendor to determine whether the hardware write-caching mechanism of your hard disk is designed for use with a database server.
Lesson Summary
Understanding the structure of a database is crucial to effectively managing resources. The appropriate use of the transaction log can lead to quicker data processing as well as easier data recovery in case of data loss. Understanding how a database uses space allows for better planning and therefore less waste.