Hack 52. Display Columns As Rows
Sometimes you have data coming into your system that is not normalized. You might be getting data from another database or it might simply be more convenient to enter the data in that format.
You may need to convert values stored within columns into separate rows. Suppose that you are reversing the procedure from "Display Rows As Columns" [Hack #51] (reversing this procedure is called denormalization because it takes the database out of one of the normal forms, which are guidelines for structuring databases). You have a table of results where each row has one student with three course grades but the structure you want has one student, one course, and one result per row.
The input data has the format shown in Table 7-8. Table 7-9 shows the structure you want.
student | Java | Database | Algebra |
---|---|---|---|
Gao Cong | 80 | 77 | 50 |
Dongyan Zhou | 62 | 95 | 62 |
student | course | grade |
---|---|---|
Gao Cong | Java | 80 |
Gao Cong | Database | 77 |
Gao Cong | Algebra | 50 |
Dongyan Zhou | Java | 62 |
Dongyan Zhou | Database | 95 |
Dongyan Zhou | Algebra | 62 |
Every row of the gradesIn table must generate three rows of the output table. You can get this with an INSERT statement based on a UNION:
SELECT student, 'Java', Java FROM gradesIn UNION SELECT student, 'Database', Database FROM gradesIn UNION SELECT student, 'Algebra', Algebra FROM gradesIn;
The trick here is to line up the literal values such as 'Java' with the column names such as Java. The three-way UNION ensures that the input table is scanned three times, so the total number of records added will be three times the number of students in the input table.
7.4.1. Ungroup Data with Repeating Columns
Sometimes the denormalized data is fundamentally sequential, but it has been grouped into columns. Take the daily rainfall data shown in Table 7-10. A table is available with rainfall measurements for each day, grouped into seven days per row. To make querying easier, you might want a table that looks more like Table 7-11.
weekBeginning | mon | tues | weds | thur | fri | sat | sun |
---|---|---|---|---|---|---|---|
14 Aug 06 | 10 | 11 | 0 | 0 | 16 | 22 | 28 |
21 Aug 06 | 5 | 5 | 0 | 10 | 18 | 26 | 25 |
... |
dy | rainfall |
---|---|
14 Aug 2006 | 10 |
15 Aug 2006 | 11 |
16 Aug 2006 | 0 |
... | |
21 Aug 2006 | 5 |
... |
You can get this data into a normalized structure, but you must make sure that every row of the input generates seven rows of output:
mysql> INSERT INTO outRain(dy,rainfall) -> SELECT weekBeginning + 0, mon FROM inRain -> UNION SELECT weekBeginning + 1, tues FROM inRain -> UNION SELECT weekBeginning + 2, weds FROM inRain -> UNION SELECT weekBeginning + 3, thur FROM inRain -> UNION SELECT weekBeginning + 4, fri FROM inRain -> UNION SELECT weekBeginning + 5, sat FROM inRain -> UNION SELECT weekBeginning + 6, sun FROM inRain; Query OK, 14 rows affected (0.00 sec) Records: 14 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM outRain; +------------+----------+ | dy | rainfall | +------------+----------+ | 2006-08-14 | 10 | | 2006-08-21 | 5 | | 2006-08-15 | 11 | ... | 2006-08-20 | 28 | | 2006-08-27 | 25 | +------------+----------+ 14 rows in set (0.00 sec)
In Oracle, you can also use the INSERT ALL statement. The expressions in the VALUES lists reference the results of the SELECT line:
INSERT ALL INTO outRain(dy,rainfall) VALUES (weekBeginning + 0, mon) INTO outRain(dy,rainfall) VALUES (weekBeginning + 1, tues) INTO outRain(dy,rainfall) VALUES (weekBeginning + 2, weds) INTO outRain(dy,rainfall) VALUES (weekBeginning + 3, thur) INTO outRain(dy,rainfall) VALUES (weekBeginning + 4, fri) INTO outRain(dy,rainfall) VALUES (weekBeginning + 5, sat) INTO outRain(dy,rainfall) VALUES (weekBeginning + 6, sun) SELECT weekBeginning,mon,tues,weds,thur,fri,sat,sun FROM inRain;