Introduction to Java Programming-Comprehensive Version (6th Edition)
32.2. Relational Database Systems
A database system consists of a database, the software that stores and manages data in the database, and the application programs that present data and enable the user to interact with the database system, as shown in Figure 32.1.
Figure 32.1. A database system consists of data, database management software, and application programs.
A database is a repository of data that together constitute information. When you purchase a database system from a software vendor, such as MySQL, Oracle, IBM, Microsoft, or Sybase, you actually purchase the software comprising a database management system ( DBMS ) from the vendor. Database management systems are designed for use by professional programmers and are not suitable for ordinary customers. Application programs are built on top of the DBMS for customers to access and update the database. Thus application programs can be viewed as the interfaces between the database system and its users. Application programs may be standalone GUI applications or Web applications, and may access several different database systems in the network, as shown in Figure 32.2.
Figure 32.2. An application program may access multiple database systems.
(This item is displayed on page 1095 in the print version)
Most of today's database systems are relational database systems , based on the relational data model. A relational data model has three key components : structure, integrity, and language. Structure defines the representation of the data. Integrity imposes constraints on the data. Language provides the means for accessing and manipulating data.
32.2.1. Relational Structures
The relational model is built around a simple and natural structure. A relation is actually a table that consists of non-duplicate rows. Tables are easy to understand and easy to use. The relational model provides a simple yet powerful way to represent data.
A row of a table represents a record, and a column of a table represents the value of a single attribute of the record. In relational database theory, a row is called a tuple and a column is called an attribute . Figure 32.3 shows a sample table that stores information about the courses offered by a university. The table has eight tuples, and each tuple has five attributes.
Figure 32.3. A table has a table name , column names , and rows.
Tables describe the relationship among data. Each row in a table represents a record of related data. For example, "11111", "CSCI", "1301", "Introduction to Java I", and "4" are related to form a record (the first row in Figure 32.3) in the Course table. Just as data in the same row are related, so too data in different tables may be related through common attributes. Suppose the database has two other tables named Student and Enrollment , as shown in Figures 32.4 and 32.5. The Course table and the Enrollment table are related through their common attribute courseId , and the Enrollment table and the Student table are related through ssn .
Figure 32.4. A Student table stores student information.
(This item is displayed on page 1096 in the print version)
Figure 32.5. An Enrollment table stores student enrollment information.
(This item is displayed on page 1096 in the print version)
32.2.2. Integrity Constraints
An integrity constraint imposes a condition that all the legal values in a table must satisfy . Figure 32.6 shows an example of some integrity constraints in the Subject and Course tables.
Figure 32.6. The Enrollment table and the Course table have integrity constraints.
(This item is displayed on page 1097 in the print version)
In general, there are three types of constraints: domain constraints, primary key constraints, and foreign key constraints. Domain constraints and primary key constraints are known as intra-relational constraints , meaning that a constraint involves only one relation. The foreign key constraint is inter-relational , meaning that a constraint involves more than one relation.
Domain Constraints
Domain constraints specify the permissible values for an attribute. Domains can be specified using standard data types, such as integers, floating-point numbers , fixed-length strings, and variant-length strings. The standard data type specifies a broad range of values. Additional constraints can be specified to narrow the ranges. For example, you can specify that the numOfCredits attribute (in the Course table) must be greater than and less than 5 . You can also specify whether an attribute can be null , which is a special value in a database meaning unknown or not applicable . As shown in the Student table, birthDate may be null .
Primary Key Constraints
To understand primary keys, it is helpful to know superkeys, keys, and candidate keys. A superkey is an attribute or a set of attributes that uniquely identifies the relation. That is, no two tuples have the same values on a superkey. For example, courseId and subjectId together form a superkey, because no two tuples in the course table have the same value on courseId and subjectId. By definition, a relation consists of a set of distinct tuples. The set of all attributes in the relation forms a superkey.
A key K is a minimal superkey, meaning that any proper subset of K is not a superkey. A relation can have several keys. In this case, each of the keys is called a candidate key . The primary key is one of the candidate keys designated by the database designer. The primary key is often used to identify tuples in a relation. As shown in Figure 32.6, courseId is the primary key in the Course table.
Foreign Key Constraints
In a relational database, data are related. Tuples in a relation are related, and tuples in different relations are related through their common attributes. Informally speaking, the common attributes are foreign keys. The foreign key constraints define the relationships among relations.
Formally , a set of attributes FK is a foreign key in a relation R that references relation T if it satisfies the following two rules:
-
The attributes in FK have the same domain as the primary key in T .
-
A non-null value on FK in R must match a primary key value in T .
As shown in Figure 32.6, courseId is the foreign key in Enrollment that references the primary key courseId in Course . Every courseId value must match a courseId value in Course .
Enforcing Integrity Constraints
The database management system enforces integrity constraints and rejects operations that would violate them. For example, if you attempt to insert a new record ( '11113' , '3272' , 'Database Systems' , ) into the Course table, it would fail because the credit hours must be greater than or equal to 0; if you attempt to insert a record with the same primary key as an existing record in the table, the DBMS would report an error and reject the operation, because the primary key values are unique; if you attempt to delete a record from the Course table whose primary key value is referenced by the records in the Enrollment table, the DBMS would reject this operation because it would violate the foreign key constraint.
Note
| All relational database systems support primary key constraints and foreign key constraints. Not all database systems support domain constraints. For example, you cannot specify the constraint that numOfCredits is greater than and less than 5 on the Microsoft Access database. |