Hands-On Microsoft Access: A Practical Guide to Improving Your Access Skills

Suppose you're a company that sells food items, like Northwind. You have a Products table with all the items you sell, with one row for each item: ice cream, yogurt, nondairy creamer, and so on. Among the columns, you include a field to store all the flavors for that item: vanilla, blueberry, Irish crème (see Table 2.14).

Table 2.14. Tables That Include Multivalue Fields

Products

Product

Supplier

Flavors

<<other fields>>

Ice cream

Frozen Delights

Vanilla, Chocolate, Strawberry

 

Yogurt

Heavenly Cultures

Vanilla, Blueberry, Asparagus, Strawberry

 

Nondairy Creamer

Foodstuffs Inc.

Vanilla, Asparagus, Hazelnut

 

Authors

Last Name

First Name

Major Works

<<other fields>>

Dostoevsky

Fyodor

The Brothers Karamazov, Crime and Punishment, The Idiot, The Gambler

 

Turgenev

Ivan

Fathers and Sons

 

Tolstoy

Leo

War and Peace, Anna Karenina

 

Or consider a table of Russian authors. Each row is dedicated to one author. You have fields for first name, last name, and birthday. You also have a column for the author's major works (see Table 2.14).

The Flavors field in the Products table and the Major Works field in the Authors table are known as multivalue fields. These are fields that store more than one value in each row. They're usually easy to identify because, as in the examples, they use commas to separate the various values. The fields are also usually named in the pluralflavors, characters, works, and so on.

I think multivalue fields are among the most difficult database problems to handle, both conceptually and practically. At first glance, the Flavors field in the Products table and the Major Works field in the Authors table both seem like fine, even elegant, solutions to a common problem. If you know that the number of values for a single characteristicproduct flavors, authors' major works, and so onwill vary among records, why try to predict the number of columns you need? Instead of using threeor six, or eight, or who knows how manycolumns to store values, wouldn't the best solution be to store all values in a single field and separate them by commas?

But you are designing your database so you can quickly and efficiently retrieve the data you need. Suppose you decide to discontinue an experimental flavorsay, asparagusfrom all your products. If Asparagus is always a discrete record, it's easy to delete all occurrences of Asparagus from your database. If Asparagus is but one value in a multivalue field, however, you need to edit each of those records. That task is more difficult and, equally important, more likely to result in error. The threat to data integrity is substantial.

Or suppose you want to retrieve all records on Dostoevsky's Crime and Punishment. Surrounded as it is between The Brothers Karamazov and The Idiot, it will be inefficient to find information on that one book. Or say you want to sort all the books by title. A multivalue field will make it difficult to put all the books in alphabetical order.

Put simply, multivalue fields are difficult to edit, sort, and delete. They violate the essence of good relational database design. You need to get rid of them.

Multivalue Fields in the Classic TV Database

Earlier you saw that the Programs table had a multipart field, Character Names/Character Occupations. To resolve this multipart field, you created separate Character Names and Character Occupations fields (see Table 2.15).

Splitting the field into two columns resolved the problem of having two different types of data in the same field. But the Character Names and Character Occupations remain multivalue fields. To repeat, they have the two telltale signs of multivalue fields: The values are separated by commas, and the field name is plural.

Resolving Multivalue Fields

So how do you fix these multivalue fields?

First, take both the Character Names and Character Occupations fields out of the Programs table and put them in a new table called Roles. Second, take the primary key from the Programs table and put it in the Roles table as well. Finally, let's split the new Character Name (now singular) field into Last Name and First Name fields to eliminate the multipart field. A separate field is assigned to the character occupation (see Table 2.16).

Table 2.16. Roles

Multivalue Fields Resolved

ProgramID

Last Name

First Name

Occupation

1 (Andy Griffith Show)

Taylor

Andy

Sheriff

1 (Andy Griffith Show)

Fife

Barney

Deputy Sheriff

1 (Andy Griffith Show)

Pyle

Gomer

Filling station attendant

2 (Happy Days)

Cunningham

Richie

Student

2 (Happy Days)

Fonzarelli

Arthur

Motorcyclist

NOTE

Next to the ProgramID, I've included in italics the name of the TV show the ID refers to. These names are not part of the table. I've (reluctantly) included them only so you can see the program the ID refers to.

This solution seems counterintuitive; it provides a complex, even convoluted, solution to what seems to be a simple problem.

But using keys to resolve multivalue fields gives you a small taste of how you are going to integrate data in your database. As you'll see, putting the primary key of the original table into the new table enables you to establish a relationship between the original and new tables. By using primary keys to represent the values in each record, you make it possible to bring together values from different tables and integrate the information in your database.

Q&A

Q1:

Wait a minute: The ProgramID field has a lot of duplicates. Didn't you tell us that a primary key can't have duplicate values?

A1:

ProgramID is the primary key of the Programs table; it's not the primary key in the Roles table. You haven't assigned a primary key to the Roles table yet; you do so in the next chapter.

Категории