Database in Depth: Relational Theory for Practitioners

Exercise 5-1.

From a relational perspective, what's wrong (if anything) with the following SQL expressions?

a. SELECT * FROM S, SP b. SELECT S.SNO, S.CITY FROM S c. SELECT SP.SNO, SP.PNO, 2 * SP.QTY FROM SP d. SELECT P.PNO FROM P e. SELECT S.SNO FROM S, SP f. SELECT S.SNO FROM S ORDER BY S.CITY DESC

Exercise 5-2.

Closure is important in the relational model for the same kind of reason that numeric closure is important in ordinary arithmetic. In arithmetic, however, there's one situation where the closure property breaks down, in a sense namely, division by zero. Is there any analogous situation in the relational algebra?

Exercise 5-3.

Given the usual suppliers-and-parts database, what's the value of the expression JOIN{S,SP,P}? What's the corresponding predicate? Warning: There's a trap here; in fact, some might even argue that the trap is such that it makes join (as I've defined it in the body of this chapter) a rather dangerous operation. What do you think of that argument?

Exercise 5-4.

Why do you think the project operator is so called?

Exercise 5-5.

Given our usual sample values for the suppliers-and-parts database, what values do the following expressions denote? In each case, give an informal interpretation of the expression in natural language.

a. S SEMIJOIN ( SP WHERE PNO = PNO('P2') ) b. P NOT MATCHING ( SP WHERE SNO = SNO('S2') ) c. S { CITY } MINUS P { CITY } d. ( S { SNO, CITY } JOIN P { PNO, CITY } ) { ALL BUT CITY } e. JOIN { ( S RENAME ( CITY AS SC ) ) { SC } , ( P RENAME ( CITY AS PC ) ) { PC } }

Exercise 5-6.

Union, intersection, product, and join are all both commutative and associative. Verify these claims. Also verify that semijoin is associative but not commutative.

Exercise 5-7.

In what circumstances (if any) are r SEMIJOIN s and s SEMIJOIN r equivalent?

Exercise 5-8.

Which of the algebraic operators (if any) have a definition that doesn't rely on tuple equality?

Exercise 5-9.

The SQL FROM clause FROM t1, t2, . . . , tn (where each ti is an expression denoting a table) returns the cartesian product of its arguments. But what if n = 1? What's the cartesian product of just one table? And by the way, what's the product of t1 and t2 if t1 and t2 both contain duplicate rows?

Exercise 5-10.

Show that rename isn't primitive.

Exercise 5-11.

Distinguish between a "summary" and an aggregate operator. What do the SQL analogs of these constructs look like?

Exercise 5-12.

Give an expression involving EXTEND instead of SUMMARIZE that's logically equivalent to the following:

SUMMARIZE SP PER ( S { SNO } ) ADD ( COUNT ( ) AS NP )

Exercise 5-13.

Given our usual sample values for the suppliers-and-parts database, what values do the following expressions denote? In each case, give an informal interpretation of the expression in natural language.

a. EXTEND S ADD ( 'Supplier' AS TAG ) b. EXTEND ( P JOIN SP ) ADD ( WEIGHT * QTY AS SHIPWT ) c. EXTEND P ADD ( WEIGHT * 454 AS GMWT, WEIGHT * 16 AS OZWT ) d. EXTEND S ADD ( COUNT ( ( SP RENAME ( SNO AS X ) ) WHERE X = SNO ) AS NP ) e. SUMMARIZE S BY { CITY } ADD ( AVG ( STATUS ) AS AVG_STATUS )

Exercise 5-14.

Which of the following expressions are equivalent?

a. SUMMARIZE r PER ( r { } ) ADD ( COUNT ( ) AS CT ) b. SUMMARIZE r ADD ( COUNT ( ) AS CT ) c. SUMMARIZE r BY { } ADD ( COUNT ( ) AS CT ) d. EXTEND TABLE_DEE ADD ( COUNT ( r ) AS CT )

Exercise 5-15.

Simplifying just slightly, an aggregate operator invocation in Tutorial D takes the form:

<agg op name> ( <relation exp> [, <attribute name> ] )

If the <agg op name> is COUNT, the <attribute name> is irrelevant and must be omitted; otherwise, it can be omitted if and only if the <relation exp> denotes a relation of degree one, in which case the sole attribute of the result of that <relation exp> is assumed by default. Here are a couple of examples:

SUM ( SP WHERE SNO = SNO('S1'), QTY ) SUM ( ( SP WHERE SNO = SNO('S1') ) { QTY } )

Note the difference between these two expressions: the first gives the total of all shipment quantities for supplier S1, and the second gives the total of all distinct shipment quantities for supplier S1. How does the scheme just outlined differ from its SQL counterpart?

Exercise 5-16.

In Tutorial D, if the argument to an aggregate operator invocation happens to be an empty set, COUNT returns zero and so does SUM; MAX and MIN return, respectively, the lowest and the highest value of the applicable type; AND and OR return TRUE and FALSE, respectively; and AVG raises an exception (I ignore Tutorial D's other aggregate operators here deliberately). What does SQL do in these circumstances and why?

Exercise 5-17.

Let relation R4 from Figure 5-1 denote the current value of some relvar R. If R4 has the meaning described in Chapter 2, give the predicate for that relvar R.

Exercise 5-18.

Let r be the relation denoted by the following expression:

SP GROUP ( { } AS X )

What does r look like, given our usual sample value for SP? Also, what does the following expression yield?

r UNGROUP ( X )

Exercise 5-19.

Without using the IS_EMPTY shorthand, write a Tutorial D expression that returns TRUE if the current value of the parts relvar P is empty and FALSE otherwise. Also show an SQL analog of that expression.

Exercise 5-20.

Write Tutorial D and/or SQL expressions for the following queries on the suppliers-and-parts database:

  1. Get all shipments.

  2. Get supplier numbers for suppliers who supply part P1.

  3. Get suppliers with status in the range 15 to 25 inclusive.

  4. Get part numbers for parts supplied by a supplier in London.

  5. Get part numbers for parts not supplied by any supplier in London.

  6. Get city names for cities in which at least two suppliers are located.

  7. Get all pairs of part numbers such that some supplier supplies both of the indicated parts.

  8. Get the total number of parts supplied by supplier S1.

  9. Get supplier numbers for suppliers with a status lower than that of supplier S1.

  10. Get supplier numbers for suppliers whose city is first in the alphabetic list of such cities.

  11. Get part numbers for parts supplied by all suppliers in London.

  12. Get supplier-number/part-number pairs such that the indicated supplier does not supply the indicated part.

  13. Get suppliers who supply at least all parts supplied by supplier S2.

  14. Get supplier numbers for suppliers who supply at least all parts supplied by at least one supplier who supplies at least one London part.

Exercise 5-21.

Prove the following statements (making them more precise, where necessary):

  1. A sequence of restrictions against a given relation can be transformed into a single restriction.

  2. A sequence of projections against a given relation can be transformed into a single projection.

  3. A restriction of a projection can be transformed into a projection of a restriction.

Exercise 5-22.

Union is said to be idempotent, because r UNION r is identically equal to r for all r. (Is this true in SQL?) As you might expect, idempotence can be useful in expression transformation. Which other relational operators (if any) are idempotent?

Exercise 5-23.

Let r be a relation. What does the expression r{ } mean? What does it return?

Exercise 5-24.

The following boolean expression:

x > y AND y > 3

(which might be part of a query) is clearly equivalent to and can therefore be transformed into the following:

x > y AND y > 3 AND x > 3

The equivalence is based on the fact that the comparison operator ">" is transitive. The transformation is worth making if x and y are from different relations, because it enables the system to perform an additional restriction (using x > 3) before doing the greater-than join implied by x > y. As we saw in the body of the chapter, doing restrictions early is generally a good idea; having the system infer additional "early" restrictions, as here, is also a good idea. Do you know of any commercial products that actually perform this kind of optimization?

Exercise 5-25.

Consider the following expressions:

a. WITH ( P WHERE COLOR = 'Purple' ) AS PP , ( SP RENAME ( SNO AS X ) ) AS T : S WHERE ( T WHERE X = SNO ) { PNO } PP { PNO } b. WITH ( P WHERE COLOR = 'Purple' ) AS PP : S JOIN ( SP { SNO, PNO } DIVIDEBY PP { PNO } )

These are both attempts at formulating the query "Get suppliers who supply every purple part." Given our usual sample data values, show the result returned in each case. Which result (and which formulation), if either, do you regard as correct? Justify your answer.

Exercise 5-26.

Consider the restriction r WHERE bx. In the body of the chapter, I said that every attribute mentioned in the restriction condition bx must be an attribute of r, but I also mentioned that bx was subject to certain further limitations. One of those limitations[*] is that, strictly speaking, bx is supposed to consist of just a single term of the form x Op y,where each of x and y is either an attribute of r or a literal and Op is a comparison operator that makes sense for x and y. But a real language will allow WHERE clauses to contain boolean expressions of arbitrary complexity, just as long as the only "free variables" (see Appendix A) in that expression are attributes of r. Show that it's legitimate to extend the definition of restrict in such ways that is, show that such extensions are really just shorthand for something we already know is legitimate.

[*] Another is that bx mustn't involve any quantifiers, nor must it be equivalent to an expression that does (see Appendix A).

Exercise 5-27.

Here are two simple expressions involving relation R4 from Figure 5-1 in the body of the chapter. What queries do they represent?

( R4 WHERE TUPLE { PNO PNO('P2') } PNO_REL ) { SNO } ( ( R4 WHERE SNO = SNO('S2') ) UNGROUP ( PNO_REL ) ) { PNO }

Exercise 5-28.

Given our usual sample values for the suppliers-and-parts database, what does the following expression denote?

EXTEND S ADD ( ( ( SP RENAME ( SNO AS X ) ) WHERE X = SNO ) { PNO } AS PNO_REL )

Exercise 5-29.

Let the relation returned by the expression in the previous exercise be kept as a relvar called SSP. What do the following updates do?

INSERT SSP RELATION { TUPLE { SNO SNO('S6'), PNO_REL RELATION { TUPLE { PNO PNO('P5') } } } } ; UPDATE SSP WHERE SNO = SNO('S2') ( INSERT PNO_REL RELATION { TUPLE { PNO PNO('P5') } } ) ;

Exercise 5-30.

Using relvar SSP from the previous exercise, write expressions for the following queries:

  • Get pairs of supplier numbers for suppliers who supply exactly the same set of parts.

  • Get pairs of part numbers for parts supplied by exactly the same set of supliers.

Exercise 5-31.

A quota query is a query that specifies a desired limit, or quota, on the cardinality of the result: for example, the query "Get the three heaviest parts," for which the specified quota is three. Give Tutorial D and SQL formulations of this query.

Категории