Hack 55. Import Someone Elses Data
Hack 55 Import Someone Else s Data
If you are building an application that depends on external data, you can set up a "bolt-on" table that can be refreshed at regular intervals.
Suppose you are building an application that is attached to, but not part of, some larger system. For example, if you were building a system to manage the assignment of space in the company parking garage, you might choose a table design such as that shown in Table 7-13.
id | name | phone | parkingSpace |
---|---|---|---|
E01 | Harpo | 2753 | F8 |
E02 | Zeppo | 2754 | G22 |
E03 | Groucho | 2755 | F7 |
Details such as the name and phone number can come from the employee table of another database; the parkingSpace column is to be updated by your new application. Although the table shown might be the most sensible structure for a standalone system, it can be difficult to keep it in step with the external system.
Instead, you could use two tables and maintain a one-to-one relationship between them. You keep the data that your application maintains in one table and the imported data in another. Using this approach, your employeeParking table looks like Table 7-14, and the imported copy of the employee table looks like Table 7-15.
id | parkingSpace |
---|---|
E01 | F8 |
E02 | G22 |
E03 | F7 |
id | name | phone |
---|---|---|
E01 | Harpo | 2753 |
E02 | Zeppo | 2754 |
E03 | Groucho | 2755 |
Now the external details are separated from the details that are fully part of your system, and the only cost is a join that would otherwise not be required.
You can create a view that mimics the original table that you wanted:
CREATE VIEW mimic AS SELECT employeeParking.id, name, phone, parkingSpace FROM employeeParking JOIN employeeCopy ON (employeeParking.id = employeeCopy.id)
The process of updating from the main system is now simply a matter of deleting all of the rows in the employeeCopy table and then refilling it from a freshly imported data set.
If the data does get out of step you can compensate to some extent. For example, if a new employee is added to your database before he has been added to the main employee table (or at least before you make a copy of it) you can make that JOIN into a LEFT OUTER JOIN so that at least some of the details are available:
CREATE VIEW mimic AS SELECT employeeParking.id ,COALESCE(name, employeeParking.id) AS name ,COALESCE(phone, 'Not available') AS phone ,parkingSpace FROM employeeParking LEFT OUTER JOIN EmployeeCopy ON (employeeParking.id = employeeCopy.id)
If the new staff member (E04) is in the employeeParking table but not in the employeeCopy table, you will get the following output:
mysql> SELECT * FROM mimic; +-----+---------+---------------+--------------+ | id | name | phone | parkingSpace | +-----+---------+---------------+--------------+ | E01 | Harpo | 2753 | F8 | | E02 | Zeppo | 2754 | G22 | | E03 | Groucho | 2755 | F7 | | E04 | E04 | Not available | F21 | +-----+---------+---------------+--------------+
It is not ideal, but it may be good enough as a temporary measure. When a more up-to-date version of the employee table is reimported, the actual name and phone number will become visible.
If you want to see how badly out of step the two sets of data are, you can execute a FULL OUTER JOIN and count the NULL values from each table:
scott=> SELECT SUM(CASE WHEN employeeCopy.id IS NULL THEN 1 ELSE 0 END) scott-> AS externalNotInternal scott-> ,SUM(CASE WHEN employeeParking.id IS NULL THEN 1 ELSE 0 END) scott-> AS internalNotExternal scott-> FROM employeeParking FULL OUTER JOIN scott-> employeeCopy ON (employeeParking.id = employeeCopy.id); externalnotinternal | internalnotexternal ---------------------+--------------------- 1 | 0
|
Additions and changes to the employee table become visible to the parking system as soon as the data has been transferred. However, if an employee is deleted from the bolt-on table the parking system does not remove that employee.
Of course, if you want to allow the tables to be out of step you may not enforce referential integrity with a foreign key.