Reverse Engineering and Importing to ORM

Reverse Engineering to ORM

Ideally, you should always develop a database by forward engineering from ORM to a logical model and then to a physical database schema. This is feasible if you are developing a new database from scratch. In practice, however, you might also need to modify existing databases that might not have been forward engineered from ORM. These databases might also be populated with large volumes of data.

If the database was originally forward engineered from ORM, but changes were later made directly to the physical database, you can reverse engineer those changes to the logical model by refreshing it (see Chapter 16), and then migrate those changes to the ORM source model. If the database was not generated from ORM, you may reverse engineer the database schema directly to a draft ORM schema, and then make any needed changes to clean it up conceptually.

Once you have a correct ORM schema for the existing database, you may add the desired functional modifications, and then forward engineer the changes back to the database by building the logical model and using it to update the database automatically. Details on refreshing, updating, and synchronizing models are provided in Chapter 16. Alternatively, you can forward engineer the ORM model to a logical model and generate a fresh physical database; if the original database was populated with data, you may manually migrate the data to the new database.

The tool can reverse engineer an existing physical database schema to a fresh ORM schema or a fresh logical database schema (see Figure 8-1). However it cannot further reverse engineer a logical database model created in this way to an ORM schema. This chapter discusses reverse engineering to a new ORM schema. Reverse engineering to a logical database schema is discussed in chapter 14.

Figure 8-1: Two options for reverse engineering a physical database schema.

To reverse engineer an existing database, an Open Database Connectivity (ODBC) data source is needed to enable access to that database. Visio refers to a data source by a data source name (DSN). If the data source does not exist, you can create it using either the Windows ODBC Data Source Administrator dialog or Visio's Reverse Engineering Wizard.

Creating an ODBC Data Source

To create a new data source using the Windows ODBC Data Source Administrator dialog, proceed as follows . The ODBC Data Source Administrator dialog is accessible from the Windows Control Panel. For example, to open this dialog in Windows XP, click Start > Control Panel > Performance and Maintenance > Administrative Tools and then double-click Data Sources (ODBC) . On the User DSN pane, press the Add button to open the Create New Data Source dialog. Select the relevant ODBC driver (e.g., Microsoft Access Driver, or SQL Server) and press the Finish button. In the driver-specific dialog box that opens, type a name and description for the data source, browse to select the relevant database, and then press OK . Press OK to close the ODBC Data Source Administrator dialog.

For example, the Visio product includes a sample Microsoft Access database named Championzone. This is stored in the database samples folder (typically c:Program FilesMicrosoft OfficeVisio 101033SamplesDatabase). If you select Microsoft Access as the driver, you can add a data source for this database by adding a data source name (e.g., ChampionzoneDB) and description in the ODBC Microsoft Access setup dialog, and pressing the Select button to browse and select this database file (see Figure 8-2). For illustration purposes, the example used in the following discussion of reverse engineering assumes that you have created this data source.

Figure 8-2: Adding a new data source for a Microsoft Access database.

Reverse Engineer a Database to ORM

To reverse engineer a database to ORM, open Visio's ORM Source Model solution (e.g., File > New > Database > ORM Source Model ), and then open the Reverse Engineering Wizard by choosing Database > Reverse Engineer from the main menu. Now select the database driver for the database you wish to reverse engineer (see Figure 8-3).

Figure 8-3: Choosing a database driver in the Reverse Engineering wizard.

Press the Setup button to associate the Visio driver with the relevant ODBC driver and set any desired preferences and default data source. For example, if you chose Microsoft Access as the installed Visio driver, you could use the settings in Figure 8-4 to associate this with the Microsoft Access ODBC driver and choose the Championzone database as your default data source for this driver ( assuming you had created this data source earlier as discussed).

Figure 8-4: Setting up the database driver and default data source.

Press OK to close this dialog, then select the relevant data source from the data sources field of the Reverse Engineering wizard (see Figure 8-5).

Figure 8-5: Selecting the relevant data source.

If you had not previously created the data source, you may do so now by pressing the New button, choosing User Data Source from the next dialog, selecting the driver from the next dialog, pressing Finish , then adding a data source name and description and choosing the relevant database, as described earlier (e.g., see Figure 8-2).

Press the Next button to open the Connect Data Source dialog, enter any user name and password details required, and press OK (see Figure 8-6). If you changed the driver, a notice appears indicating that the new driver will be used by default until you change it again.

Figure 8-6: Connecting to a data source.

The Reverse Engineer wizard now displays a dialog that allows you to choose which types of relational database element to reverse engineer (see Figure 8-7). By default, all element types are selected. Microsoft Access does not support triggers or stored procedures, so those options are disabled when reverse engineering from an Access database. Make your selection, then press Next to continue.

Figure 8-7: Selecting which kinds of model element to reverse engineer.

The next dialog allows you to select which individual model elements to reverse engineer. Base table names are prepended by "T" and view names are prepended by "V" (see Figure 8-8). To select all of them, press the Select All button . To select only some, select the relevant check boxes. If triggers or stored procedures are available, these also appear for selection. After making your selection, press Next if you wish to review your selections, otherwise press Finish .

Figure 8-8: Selecting which individual elements to reverse engineer.

The results of the reverse engineering are displayed in the Output window. If any errors occur, correct these and then rerun the wizard. If the reverse engineering succeeds, the resulting ORM fact types are displayed in the Fact Types pane of the Business Rules window, as shown in Figure 8-9. Fact types derived from views are displayed with a double asterisk "**" (derived and stored), which is not strictly correct, since views are typically not materialized.

Figure 8-9: Reverse engineered fact types.

To view the object types, open the Object Types pane of the Business Rules window (see Figure 8-10).

Figure 8-10: Reverse engineered object types.

Displaying the ORM Model Graphically

To obtain a diagram for a reverse engineered ORM model, it is better to drag object types onto the drawing window rather than dragging fact types out. In the business rules window, each object type is depicted as an entity type (solid circle with dark blue fill) or a value type ( dotted circle with light blue fill). After reverse engineering, each object type depicted as an entity type corresponds to a base table or view . If the reverse engineered schema is large, it should normally be displayed on many pages, each of which displays a submodel whose elements have high semantic cohesion.

After reverse engineering, you can ensure that each page displays strongly related elements by dragging onto that page just one object type, then use ShowRelationships to display its fact types. Unless the diagram already fills the page, apply ShowRelationships again to one or more object types that are connected to the original object type by a relationship with "FK" in its name . This ensures that the fact types displayed on that page correspond either to fact types stored in tables that are directly linked by foreign key references, or to subtype relationships between the tables.

For discussion purposes, Figure 8-11 displays six tightly coupled tables from the sample logical model for the Championzone database. Recall that ORM fact types capture the semantic connections between primary key and nonkey columns and that some ORM subtype- supertype connections may appear as foreign key references.

Figure 8-11: A highly cohesive, logical submodel of the Championzone database.

Figure 8-12 shows the result of dragging the Applicant and Application object types onto the drawing page, then choosing ShowRelationships from their context menus . This displays all the fact types in the Applicant and Application tables as well as some related foreign key connections. For tidiness, display of indexes has been turned off (using View > Layer Properties ). Here, each entity type corresponds to a table in the database. If a table has a simple primary key, this displays as the reference mode of the entity type (e.g., Person, Product and AppStatus).

Figure 8-12: Applying ShowRelationships to two reverse engineered object types.

If a table has a composite key, and all the key fact types are displayed, then the composite key constraint appears as a primary, external uniqueness constraint (circled "P"), as shown in the reference scheme for Application. The identification scheme for Test is not displayed because one of its reference types is not displayed (you could display this by applying ShowRelationships to Test).

If a column is neither a simple primary key nor a foreign key, its underlying fact type is displayed as a relationship from the table entity type to a value type of that name (e.g., companyName, applicationDate, certificationLevel).

Any fact type underlying a foreign key is displayed with a relationship name that has "FK n " appended to its name, where n is a positive integer.

Refining the ORM Schema

In practice, any draft ORM schema obtained by reverse engineering usually needs many refinements. For example, Figure 8-13 shows the result of applying several improvements to the schema fragment in Figure 8-12. All these changes can be made using techniques already discussed.

Figure 8-13: Refining the ORM schema in Figure 8-12.

Several changes have been made to the names of predicates, object types, and reference modes, in order to make the semantics easier to understand. Some value types have also been changed to entity types. Reverse engineered role names may be retained if the same column name mapping is required. To ensure the logical model names are unaltered when you forward engineer the new ORM schema you will often need to configure various naming options as explained in the previous chapter.

The change from "applicationDate" to "Date" requires more discussion. In the reverse-engineered global schema, there are several date object types (applicationDate, examDate, birthDate, registrationDate, etc.). These should all be replaced by the object type Date, with the reverse-engineered name retained as a role name. Apart from simplifying the schema, this merging of object types ensures that if ever we make any changes to the data type for Date, this will apply universally .

Where relevant, reverse-engineered injective (1:1 into) identifying relationships should be replaced by subtyping (e.g., the relationship between Applicant and Person). The tool does not generate nesting as an alternative to co-referencing when it reverse engineers . When it makes the schema easier to understand, awkward co-referenced object types should be replaced by nested object types. If the original database involved views, you need to make any needed adjustments to prevent the reverse-engineered constructs leading to base tables when the schema is forward engineered.

Once the semantics of the as-is database schema have been clarified, you should look to see if some additional constraints need to be added to provide a more complete to-be model. As a trivial example, the value constraint {1..5} has been added to CertificationLevel in Figure 8-13.

Often, you may find that the database was badly designed in the first place, with problems such as uncontrolled denormalization and incompatible data types. Working in ORM, you will find it much easier to expose and correct these problems.

Once you are satisfied with the new ORM schema, and it passes model error checks, you may forward engineer it to the DBMS of choice by building a fresh logical model and then generating a new physical database schema from it. If the reverse engineered database included code such as view definitions, triggers or stored procedures you can transfer these across as discussed in Chapter 16. Chapter 16 also discusses how to update an existing database with changes made to a model.

If the original database was already populated with data, you may perform data migration to the new physical database once it is generated. This process can be tedious but is usually not very challenging to do manually. Various DBMSs include facilities to assist in this process (e.g., SQL Server's Data Transformation Services).

Once the new database is generated and populated, you should not need to perform reverse engineering on it again. As far as possible, make any subsequent changes at the ORM level and propagate them downwards. If it is necessary to make changes at lower levels, the tool allows you propagate the changes to the other levels to keep your models in sync. This topic is discussed in Chapter 16.

Importing VisioModeler and ERX Files

In addition to reverse engineering physical databases, the tool can import ORM models and logical models that were built using VisioModeler or InfoModeler (version 1.5 onwards). You may also import logical models built using Computer Associates ERwin tool, if they have been saved as ERX files (up to version 3.5.2).

To import such a model, open the ORM Source Model solution, choose the menu option Database > Import/Export and then select the relevant import option (see Figure 8-14). For VisioModeler files, options for ORM models, logical models, and dictionary files are distinguished by the relevant suffix (.IMO, .IML, .IMD, respectively).

Figure 8-14: Importing VisioModeler or ERwin ERX models.

When the import dialog appears, either type the path and file name for the desired model, or press the Browse button to browse for and select the relevant file. Then press OK to import the file.

The progress of the import is displayed in the Output window. Once the import is complete, open the Business Rules window, and drag the relevant object types or fact types onto the drawing window. As for reverse engineering, it is usually better to drag the relevant object types onto the desired drawing page, then right-click the object types and apply ShowRelationships. The tool is not capable of importing the original layout of the model elements, but by using ShowRelationships, you can usually layout the model yourself fairly quickly.

You can also import a logical model directly into an ER Source Model document using the same menu option. In this case, after the import you can drag the relevant tables onto the drawing surface, and use ShowRelationships on tables to display their foreign key connections.

As indicated in Figure 8-14, the tool also allows you to export to ERX models. However recent versions of ERwin no longer use the ERX format, so this may be of limited use.

If you need to construct an ORM model from an ERwin model that is stored in a newer format than ERX, you should first use ERwin to generate a physical database from the model, then open the ORM Source Model solution and use its reverse engineering facility to extract the model from the physical database.

Категории