Dimensional Data Warehousing with MySQL: A Tutorial
Before you can run the revised regular population script, there are a few things you need to prepare. First, you need to prepare the following customer.csv file, which contains two changes from the customer.csv in Chapter 10:
-
The street and shipping addresses of customer number 4 changes from zip code 17050 to 17055
-
A new customer number 15 is added.
Here is the content of the revised customer.csv file.
CUSTOMER NO, CUSTOMER NAME,STREET ADDRESS, ZIP CODE,CITY,STATE,SHIPPING ADDRESS, ZIP CODE,CITY,STATE 1, Really Large Customers, 7500 Louise Dr., 17050, Mechanicsburg, PA, 7500 Louise Dr., 17050, Mechanicsburg, PA 2, Small Stores, 2500 Woodland St., 17055, Pittsburgh, PA, 2500 Woodland St., 17055, Pittsburgh, PA 3, Medium Retailers, 1111 Ritter Rd., 17055, Pittsburgh, PA, 1111 Ritter Rd., 17055, Pittsburgh, PA 4, Good Companies, 9999 Louise Dr., 17055, Pittsburgh, PA, 9999 Louise Dr., 17055, Pittsburgh, PA 5, Wonderful Shops, 3333 Rossmoyne Rd., 17050, Mechanicsburg, PA, 3333 Rossmoyne Rd., 17050, Mechanicsburg, PA 6, Extremely Loyal Clients, 7777 Ritter Rd., 17055, Pittsburgh, PA, 7777 Ritter Rd., 17055, Pittsburgh, PA 7, Distinguished Agencies, 9999 Scott St., 17050, Mechanicsburg, PA, 9999 Scott St., 17050, Mechanicsburg, PA 8, Subsidiaries, 10000 Wetline Blvd., 17055, Pittsburgh, PA, 10000 Wetline Blvd., 17055, Pittsburgh, PA 9, E-Distributors, 2323 Louise Dr., 17055, Pittsburgh, PA, 2323 Louise Dr., 17055, Pittsburgh, PA 10, Bigger Customers, 7777 Ridge Rd., 44102, Cleveland, OH, 7777 Ridge Rd., 44102, Cleveland, OH 11, Smaller Stores, 8888 Jennings Fwy., 44102, Cleveland, OH, 8888 Jennings Fwy., 44102, Cleveland, OH 12, Small-Medium Retailers, 9999 Memphis Ave., 44102, Cleveland, OH, 9999 Memphis Ave., 44102, Cleveland, OH 13, PA Customer, 1111 Louise Dr., 17050, Mechanicsburg, PA, 1111 Louise Dr., 17050, Mechanicsburg, PA 14, OH Customer, 6666 Ridge Rd., 44102, Cleveland, OH, 6666 Ridge Rd., 44102, Cleveland, OH 15, Super Stores, 1000 Woodland St., 17055, Pittsburgh, PA, 1000 Woodland St., 17055, Pittsburgh, PA
Now query the latest customer and shipping zip codes before you load the new customer data. Later you can compare this query output with the one after the changes.
mysql> SELECT order_date_sk odsk, customer_number cn, -> customer_zip_code czc, shipping_zip_code szc -> FROM customer_zip_code_dim a, shipping_zip_code_dim b, -> sales_order_fact c, customer_dim d -> WHERE a.customer_zip_code_sk = c.customer_zip_code_sk -> AND b.shipping_zip_code_sk = c.shipping_zip_code_sk -> AND d.customer_sk = c.customer_sk -> GROUP BY customer_number -> HAVING MAX (order_date_sk);
+------+------+-------+-------+ | odsk | cn | czc | SZC | +------+------+-------+-------+ | 732 | 1 | 17050 | 17050 | | 732 | 2 | 17055 | 17055 | | 732 | 3 | 17055 | 17055 | | 732 | 4 | 17050 | 17050 | | 732 | 5 | 17050 | 17050 | | 732 | 6 | 17055 | 17055 | | 732 | 7 17050 | 17050 | | 732 | 8 | 17055 | 17055 | | 732 | 9 | 17055 | 17055 | | 746 | 11 | 44102 | 44102 | | 746 | 12 | 44102 | 44102 | | 746 | 13 | 17050 | 17050 | | 746 | 14 | 44102 | 44102 | +------+------+-------+-------+ 13 rows in set (0.42 sec)
Next, use the script in Listing 23.7 to add two sales orders.
-
customer number 4 whose address has recently changed
-
new customer number 15
Listing 23.7: Adding two sales orders
/*******************************************************************/ /* */ /* sales_order_23.sql */ /* */ /*******************************************************************/ USE source; INSERT INTO sales_order VALUES (64, 4, 3, 'Y', 'Y', 'Y', 'N', '2007-03-27', 'N', '2007-03-31', '2007-03-27', 10000, 100) , (65, 15, 4, 'Y', 'N', 'Y', 'N', '2007-03-27', 'N', '2007-03-31', '2007-03-27', 20000, 200) ; /* end of script */
Run the script in Listing 23.7 using this command.
mysql> \. c:\mysql\scripts\sales_order_23.sql
You will see the following response on the console.
Database changed Query OK, 2 rows affected (0.06 sec) Records: 2 Duplicates: 0 Warnings: 0
You’re now ready to test the revised regular population. You must set your MySQL date to March 27, 2007 (the entry date of the two sales orders above) and run the dw_regular_23 script:
mysql> \. c:\mysql\scripts\dw_regular_23.sql
You should see something similar to the following on your console.
Database changed Query OK, 9 rows affected (0.10 sec) Query OK, 15 rows affected (0.05 sec) Records: 15 Deleted: 0 Skipped: 0 Warnings: 0 Query OK, 1 row affected (0.06 sec) Rows matched: 1 Changed: 1 Warnings: 0 Query OK, 1 row affected (0.06 sec) Records: 1 Duplicates: 0 Warnings: 0 Query OK, 1 row affected (0.05 sec) Rows matched: 1 Changed: 1 Warnings: 0 Query OK, 1 row affected (0.06 sec) Records: 1 Duplicates: 0 Warnings: 0 Query OK, 5 rows affected (0.07 sec) Query OK, 5 rows affected (0.06 sec) Records: 5 Deleted: 0 Skipped: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Query OK, 3 rows affected (0.02 sec) Records: 3 Duplicates: 0 Warnings: 0 Query OK, 2 rows affected (0.01 sec) Records: 2 Duplicates: 0 Warnings: 0 Query OK, 2 rows affected (0.16 sec) Records: 2 Duplicates: 0 Warnings: 0 Query OK, 19 rows affected (0.05 sec) Query OK, 21 rows affected (0.08 sec) Records: 21 Duplicates: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 Query OK, 0 rows affected (0.01 sec) Rows matched: 0 Changed: 0 Warnings: 0 Query OK, 0 rows affected (0.01 sec) Rows matched: 0 Changed: 0 Warnings: 0
Confirm that the two customer changes, customer number 4 and 15, have been loaded correctly by querying the customer_dim table.
mysql> select * from customer_dim where customer_number in (4, 15) \G
Here is the query result.
*************************** 1. row *************************** customer_sk: 4 customer_number: 4 customer_name: Good Companies customer_street_address: 9500 Scott St. shipping_address: NULL effective_date: 2005-03-01 expiry_date: 2007-03-01 *************************** 2. row *************************** customer_sk: 13 customer_number: 4 customer_name: Good Companies customer_street_address: 9500 Scott St. shipping_address: 9500 Scott St. effective_date: 2007-03-02 expiry_date: 2007-03-26 *************************** 3. row *************************** customer_sk: 24 customer_number: 4 customer_name: Good Companies customer_street_address: 9999 Louise Dr. shipping_address: 9999 Louise Dr. effective_date: 2007-03-27 expiry_date: 9999-12-31 *************************** 4. row *************************** customer_sk: 25 customer_number: 15 customer_name: Super Stores customer_street_address: 1000 Woodland St. shipping_address: 1000 Woodland St. effective_date: 2007-03-27 expiry_date: 9999-12-31 4 rows in set (0.00 sec)
To confirm the zip codes have been correctly loaded, query the sales_order_fact table on the two new sales orders using this SQL statement.
mysql> select from sales_order_fact where order_number IN (64, 65) \G
You should get the following result.
*************************** 1. row *************************** customer_sk: 24 customer_zip_code_sk: 6 shipping_zip_code_sk: 6 product_sk: 4 sales_order_attribute_sk: 3 order_date_sk: 757 allocate_date_sk: NULL packing_date_sk: NULL ship_date_sk: NULL receive_date_sk: NULL entry_date_sk: 757 order_number: 64 request_delivery_date_sk: 761 order_amount: 10000.00 order_quantity: 100 allocate_quantity: NULL packing_quantity: NULL ship_quantity: NULL receive_quantity: NULL *************************** 2. row *************************** customer_sk: 25 customer_zip_code_sk: 6 shipping_zip_code_sk: 6 product_sk: 5 sales_order_attribute_sk: 5 order_date_sk: 757 allocate_date_sk: NULL packing_date_sk: NULL ship_date_sk: NULL receive_date_sk: NULL entry_date_sk: 757 order_number: 65 request_delivery_date_sk: 761 order_amount: 20000.00 order_quantity: 200 allocate_quantity: NULL packing_quantity: NULL ship_quantity: NULL receive_quantity: NULL 2 rows in set (0.00 sec)
Note | The output confirms correct population of the sales_order_fact table. The zip_code_sk 6 is Mechanicsburg, which is the correct zip code of the customer and shipping addresses. |