Combining Rows in One Table with Rows in Another

12.2.1 Problem

You want to write a query that uses information from more than one table.

12.2.2 Solution

Use a jointhat is, a query that refers to multiple tables and that tells MySQL how to match up information from them.

12.2.3 Discussion

The essential idea behind a join is that it combines rows in one table with rows in one or more other tables. A full join between tables produces all possible combinations of rows. For example, joining a 100-row table to a 200-row table produces a result containing 100 X 200, or 20,000 rows. With larger tables, or joins between more than two tables, a result set can easily become immensepossibly causing the MySQL server to run out of temporary table space. Because of that, and because you rarely want all the combinations anyway, a join normally includes a WHERE clause that narrows the focus of the query. This section introduces basic join syntax, and later sections show how joins help you answer specific types of questions.

Suppose that you're a very unimaginative dresser, and you have trouble picking out your wardrobe each day. So you decide to let MySQL help you. First, enter your shirts into one table and your ties into another:

mysql> CREATE TABLE shirt (item CHAR(20)); mysql> INSERT INTO shirt (item) -> VALUES('Pinstripe'),('Tie-Dye'),('Black'); mysql> CREATE TABLE tie (item CHAR(20)); mysql> INSERT INTO tie (item) -> VALUES('Fleur de lis'),('Paisley'),('Polka Dot');

You can list what's in each table by using separate single-table queries:

mysql> SELECT item FROM shirt; +-----------+ | item | +-----------+ | Pinstripe | | Tie-Dye | | Black | +-----------+ mysql> SELECT item FROM tie; +--------------+ | item | +--------------+ | Fleur de lis | | Paisley | | Polka Dot | +--------------+

But you can also ask MySQL to show you various combinations of wardrobe items by writing a query that performs a join. A join names two or more tables after the FROM keyword. In the output column list, you can name columns from any or all of the joined tables, or use expressions that are based on those columns. The simplest join involves two tables and selects all columns from each. With no WHERE clause, the join generates output for all combinations of rows. Thus, to find all possible combinations of shirts and ties, use the following query to produce a full join between the two tables:

mysql> SELECT * FROM shirt, tie; +-----------+--------------+ | item | item | +-----------+--------------+ | Pinstripe | Fleur de lis | | Tie-Dye | Fleur de lis | | Black | Fleur de lis | | Pinstripe | Paisley | | Tie-Dye | Paisley | | Black | Paisley | | Pinstripe | Polka Dot | | Tie-Dye | Polka Dot | | Black | Polka Dot | +-----------+--------------+

You can see that each item from the shirt table is paired with every item from the tie table. To use the list to guide you in your wardrobe selections, print it out and tape it up on the wall. Each day, wear the items displayed in the first unused row and cross the row off the list.

The output column list in the previous query is specified as *. For a single-table query, an output list of * means "every column from the named table." Analogously, for a join it means "every column from every named table," so the query returns the columns from both shirt and tie. Other ways to specify output columns are to use tbl_name.* to select all columns from a particular table, or tbl_name.col_name to specify a single column from the table. Thus, all the following queries are equivalent:

SELECT * FROM shirt, tie; SELECT shirt.*, tie.* FROM shirt, tie; SELECT shirt.*, tie.item FROM shirt, tie; SELECT shirt.item, tie.* FROM shirt, tie; SELECT shirt.item, tie.item FROM shirt, tie;

The tbl_name.col_name notation that qualifies a column name with a table name is always allowable, but can be shortened to just col_name if the name appears in only one of the joined tables. In that case, MySQL can determine without ambiguity which table the column comes from and no table name qualifier is necessary. We can't do that for a join between shirt and tie; they both have a column with the same name (item), so the following query is ambiguous:

mysql> SELECT item, item FROM shirt, tie; ERROR 1052 at line 1: Column: 'item' in field list is ambiguous

If the columns had distinct names such as s_item and t_item, the query could be written unambiguously without table qualifiers:

SELECT s_item, p_item FROM shirt, tie;

To make the meaning of a query clearer to human readers, it's often useful to qualify column names even when that's not strictly necessary as far as MySQL is concerned. I tend to use qualified names in join query examples for that reason.

Without a WHERE clause to restrict the output, a join produces an output row for every possible combination of input rows. For large tables, this is usually a bad idea, so it's typical to provide some kind of condition on the output rows. For example, if you're tired of having your office mates tease you about your polka dot tie, select only the other stylish combinations that are possible using your wardrobe items:

mysql> SELECT shirt.item, tie.item FROM shirt, tie -> WHERE tie.item != 'Polka Dot'; +-----------+--------------+ | item | item | +-----------+--------------+ | Pinstripe | Fleur de lis | | Tie-Dye | Fleur de lis | | Black | Fleur de lis | | Pinstripe | Paisley | | Tie-Dye | Paisley | | Black | Paisley | +-----------+--------------+

You can also limit the output other ways. To select wardrobe combinations at random, run the following query each morning to pick a single row from the full join:[1]

[1] ORDER BY RAND( ) is discussed further in Chapter 13.

mysql> SELECT shirt.item, tie.item FROM shirt, tie -> ORDER BY RAND( ) LIMIT 1; +---------+--------------+ | item | item | +---------+--------------+ | Tie-Dye | Fleur de lis | +---------+--------------+

It's possible to perform joins between more than two tables. Suppose you set up a pants table:

mysql> SELECT * FROM pants; +----------+ | item | +----------+ | Plaid | | Striped | | Corduroy | +----------+

Then you can select combinations of shirts, ties, and pants:

mysql> SELECT shirt.item, tie.item, pants.item FROM shirt, tie, pants; +-----------+--------------+----------+ | item | item | item | +-----------+--------------+----------+ | Pinstripe | Fleur de lis | Plaid | | Tie-Dye | Fleur de lis | Plaid | | Black | Fleur de lis | Plaid | | Pinstripe | Paisley | Plaid | | Tie-Dye | Paisley | Plaid | | Black | Paisley | Plaid | | Pinstripe | Polka Dot | Plaid | | Tie-Dye | Polka Dot | Plaid | | Black | Polka Dot | Plaid | | Pinstripe | Fleur de lis | Striped | | Tie-Dye | Fleur de lis | Striped | | Black | Fleur de lis | Striped | | Pinstripe | Paisley | Striped | | Tie-Dye | Paisley | Striped | | Black | Paisley | Striped | | Pinstripe | Polka Dot | Striped | | Tie-Dye | Polka Dot | Striped | | Black | Polka Dot | Striped | | Pinstripe | Fleur de lis | Corduroy | | Tie-Dye | Fleur de lis | Corduroy | | Black | Fleur de lis | Corduroy | | Pinstripe | Paisley | Corduroy | | Tie-Dye | Paisley | Corduroy | | Black | Paisley | Corduroy | | Pinstripe | Polka Dot | Corduroy | | Tie-Dye | Polka Dot | Corduroy | | Black | Polka Dot | Corduroy | +-----------+--------------+----------+

Clearly, as you join more tables, the number of row combinations grows quickly, even when each individual table has few rows.

If you don't want to write out complete table names in the output column list, give each table a short alias and refer to table columns using the aliases:

SELECT s.item, t.item, p.item FROM shirt AS s, tie AS t, pants AS p;

Aliases don't save much typing for the preceding statement, but for complicated queries that select many columns, aliases can make life much simpler. In addition, aliases are not only convenient but necessary for some types of queries, as will become evident when we get to the topic of self-joins (Recipe 12.12).

Категории