Database in Depth: Relational Theory for Practitioners

Exercise 7-1

Give definitions, as precisely as you can, of functional dependency and join dependency.

Exercise 7-2

List all of the FDs, trivial as well as nontrivial, satisfied by our usual shipments relvar SP.

Exercise 7-3

The concept of FD relies on the notion of tuple equality: true or false?

Exercise 7-4

Prove Heath's theorem. Prove also that the converse of that theorem isn't valid.

Exercise 7-5

Nonloss decomposition means a relvar is decomposed into projections in such a way that we can recover the original relvar by joining those projections back together again. In fact, if projections r1 and r2 of relation r are such that every attribute of r appears in at least one of r1 and r2, then joining r1 and r2 will always produce every tuple of r. Prove this assertion. (It follows from the foregoing that the problem with a lossy decomposition is that the join produces additional, "spurious" tuples. Since we have no way in general of knowing which tuples in the result are spurious and which genuine, we've lost information.)

Exercise 7-6

What's a superkey? What does it mean to say an FD is implied by a superkey? What does it mean to say a JD is implied by a superkey?

Exercise 7-7

Keys are supposed to be unique and irreducible. Now, the system is obviously capable of enforcing uniqueness; but what about irreducibility?

Exercise 7-8

What's (a) a trivial FD, (b) a trivial JD? Is the former a special case of the latter?

Exercise 7-9

Let R be a relvar of degree n. What's the maximum number of FDs that R can possibly satisfy (trivial as well as nontrivial)?

Exercise 7-10

Given that A and B in the FD A sets of attributes, what happens if either is the empty set?

Exercise 7-11

Here's a predicate: on day d during period p, student s is attending lesson l, which is being taught by teacher t in classroom c (where d is a day of the week Monday to Friday and p is a period 1 to 8 within the day). Lessons are one period in duration and have a name that's unique with respect to all lessons taught in the week. Design a set of BCNF relvars for this database. Are your relvars in 5NF? 6NF? What are the keys?

Exercise 7-12

Most of the examples of nonloss decomposition in the body of the chapter showed a relvar being decomposed into exactly two projections. Is it ever necessary to decompose into three or more?

Exercise 7-13

Many database designers recommend the use of artifical or surrogate keys in base relvars in place of what are sometimes called "natural" primary keys. For example, we might add an attribute SPNO, say to our usual shipments relvar (making sure it has the uniqueness property, of course) and then make {SPNO} a surrogate primary key for that relvar. (Note, however, that {SNO,PNO} would still be a key; it just wouldn't be the primary key any longer.) Thus, surrogate keys are keys in the usual relational sense, but (a) they always involve exactly one attribute and (b) their values serve solely as surrogates for the entities they stand for (that is, they serve merely to represent the fact that those entities exist they carry absolutely no additional meaning or baggage of any kind). Ideally, those surrogate values would be system-generated, but whether they're system- or user-generated has nothing to do with the basic idea of surrogate keys as such. Are surrogate keys the same thing as tuple IDs? And do you think they're a good idea?

Exercise 7-14

(With acknowledgments to Hugh Darwen.) I decided to throw a party, so I drew up a list of people I wanted to invite and made some preliminary soundings. The response was good, but several people made their acceptance conditional on the acceptance of certain other invitees. For example, Bob and Cal both said they would come if Amy came; Hal said he would come if either Don and Eve both came or Fay came; Guy said he would come anyway; Joe said he would come if Bob and Amy both came; and so on. Design a database to show whose acceptance is based on whose.

Exercise 7-15

Design a database for the following. The entities to be represented are employees and programmers. Every programmer is an employee, but some employees aren't programmers. Employees have an employee number, name, and salary. Programmers have a (single) programming language skill. What difference would it make if programmers could have an arbitrary number of such skills?

Exercise 7-16

Let A, B,and C be subsets of the heading of relvar R such that the (set-theoretic) union of A, B, and C is equal to that heading. Let AB denote the (set-theoretic) union of A and B,and similarly for AC. Then R satisfies the multi-valued dependencies (MVDs):

A A (where A A double arrow B" or "A multi-determines B" or "B is multi-dependent on A," and similarly for A R satisfies the JD {AB,AC}. Show that if relvar R satisfies the MVDs A A <a,b1,c1> and <a,b2,c2>, then it also includes the pair of tuples <a,b1,c2> and <a,b2,c1>.

Exercise 7-17

Show that if R satisfies the FD A A Exercise 7-18

(Fagin's theorem.) Let R be as in Exercise 7-16. Show that R can be nonloss-decomposed into its projections on AB and AC if and only if it satisfies the MVDs A A Exercise 7-19

Show that if K is a key for R, then K A of R.

NOTE

Here is a convenient place to introduce some more definitions. Recall that R is in 4NF if and only if every nontrivial MVD it satisfies is implied by some superkey. The MVD A trivial if and only if AB is equal to the heading of R or A is a superset of B; it's implied by a superkey if and only if A is a superkey.

Exercise 7-20

Give an example of a relvar that's in BCNF and not 4NF.

Exercise 7-21

Design a database for the following. The entities to be represented are sales representatives, sales areas, and products. Each representative is responsible for sales in one or more areas; each area has one or more responsible representatives. Each representative is responsible for sales of one or more products, and each product has one or more responsible representatives. Each product is sold in each area; however, no two representatives sell the same product in the same area. Each representative sells the same set of products in each area for which that representative is responsible.

Exercise 7-22

Write a Tutorial D CONSTRAINT statement to express the JD satisfied by relvar SPJ of Figure 7-5.

Exercise 7-23

(Modified version of Exercises 7-27.) Design a database for the following. The entities to be represented are sales representatives, sales areas, and products. Each representative is responsible for sales in one or more areas; each area has one or more responsible representatives. Each representative is responsible for sales of one or more products, and each product has one or more responsible representatives. Each product is sold in one or more areas, and each area has one or more products sold in it. Finally, if representative r is responsible for area a, product p is sold in area a, and representative r sells product p, then r sells p in a.

Exercise 7-24

Which of the following are true statements?

  1. Every "all key" relvar is in BCNF.

  2. Every "all key" relvar is in 5NF.

  3. Every binary relvar is in BCNF.

Exercise 7-25

There's a lot of discussion in the industry at the time of writing about the possibility of XML databases. But XML documents are inherently hierarchic in nature. Do you think the criticisms of hierarchies in the body of the chapter apply to XML databases? Justify your answer.

Exercise 7-26

Draw E/R diagrams for the databases from Exercises 7-11,7-147-21, and 7-23. What do you conclude from this exercise?

Exercise 7-27

A certain database includes two base relvars that look like this:

FATHER_OF { X NAME, Y NAME } KEY { X } MOTHER_OF { X NAME, Y NAME } KEY { X }

The predicates are The father of X is Y and The mother of X is Y, respectively. For simplicity, no constraints are defined, except for the two KEY constraints. Comment on this design.

Exercise 7-28

This chapter has been concerned with data design; in essence, I've discussed certain aspects of relational design, and there's a strong argument that you should do a clean relational design first, even if your target DBMS isn't relational at all. In the same kind of way, it's sometimes suggested that transaction or query design might be done by defining the transaction or query in terms of relational algebra first, and then mapping that definition into SQL (or whatever your target language is) as a follow-on activity afterward. What do you think of this strategy?

Категории