Business Information Warehouse for SAP (Prima Techs SAP Book Series)
A star schema is the most common form of a multidimensional model. In the center of a star schema is the fact table connected to several dimensions that business users use to slice and dice data.
Look at the SAP Sales and Distribution demo InfoCube data model, as shown in Figure 12-2.
On the left is a star schema for the Sales and Distribution InfoCube. It has nine dimensions and four key measurements. Each dimension is connected to the fact table via its unique ID. A record in the fact table is uniquely identified by the keys of the dimension tables and becomes a multi-part primary key. In this example, to make a unique fact record there are nine dimension keys, each representing a dimension. In database terminology, keys of the dimension tables are foreign keys in the fact table.
The Data Warehouse Toolkit by Ralph Kimball is an excellent source for SAP BW InfoCube designers to understand common multidimensional modeling concepts. It provides data modeler information on multidimensional modeling techniques. However, multidimensional data models in SAP BW differ from most industry-accepted reference star schema models. One such difference between SAP BW and the rest of the industry star schemas is that master data is shared across all InfoCubes defined in SAP BW. When modeling in SAP BW this feature-shared master data-alone plays a major role in modeling information objects that are not too complex and balanced for navigation to provide overall analytical applications performance. Because SAP BW is built on top of proven multitiered architecture, one must understand the deployment architecture of analytical applications by exploiting SAP distributed technologies instead of technologies specific to DBMS. The data models have to accommodate such dynamic distribution of information objects across the extraprise. Note that Kimball's book does not discuss ODS design or data warehouse modeling concepts. It discusses multidimensional modeling at great length, and I recommend that SAP BW modelers read this book.
Defining Dimensions
In a traditional star schema model, the dimensions contain reference data to hold business descriptions for analysis; for example, Customer, Material, and Sales Organization. Dimensions are specific to a fact table; therefore, reference data such as master data and hierarchy is not shared across the cubes. In SAP BW, reference data is stored once and shared across all InfoCubes. This reduces data redundancy.
An InfoCube can have a total of 16 dimensions. Three are reserved for Time, Unit of Measure, and Packet ID. The 13 other dimensions are user definable.
To model dimensions, keep the following points in mind:
-
Understand how business people want to analyze data.
-
Understand the possible navigation paths to view data.
-
Identify the attributes associated with a dimension. For example, the Customer dimension has attributes such as Name, City, State, Zip Code, and so on. You can have up to 248 attributes for a dimension.
-
Normalize dimensions. Do not mix groups of variables in one dimension. For example, do not include Customer and Material attributes in one dimension. Build two separate dimensions: one for Customer and another for Material.
Note Though dimensions and hierarchies look the same, they are different entities. Dimensions have no common levels, and a query may or may not contain each dimension to fetch dimensions. The hierarchy must have at least one common level, and queries can have at most one hierarchy.
-
Dimensions tend to be time independent. However, it is quite possible that dimensions also change by passage of time. For example, organizations or cost center structures may change from time to time. The terminology used to define such time dependency is slowly changing dimensions. This subject is discussed later in this chapter.
-
Time and Unit of Measure are the most common dimensions in just about all OLAP cubes. After all, the purpose of analysis always refers to a specific time period, such as current year versus past year. For this reason, SAP BW has a predefined time dimension.
-
Dimension attributes used for frequent drill-down analysis must be modeled in the dimension table. This improves overall data navigation performance.
Defining Facts
Facts contain business measures uniquely associated with a set of predefined dimensions. The number of entries in a fact table is very large compared to that of individual dimension tables. The size of the fact table depends on the dimension attributes in the dimensions. A fact table can have several millions to billions of entries. For example, if you want to capture order details up to individual line item levels, the fact table has several entries based on line items in an order. Keep the following guidelines in mind when defining a fact table:
-
Keep the key figures to atomic level-mostly raw, numeric items relevant to measures only.
-
Do not store those computed values that can be derived easily at runtime. For example, product cost can be derived at run time by multiplying price per unit times number of units sold. This saves time loading the data in the fact tables.
-
Fact tables are accessed through dimensions.
-
You also can define a factless fact table. This means that such a fact table contains no measure to query dimension. For example, to model a manufacturing plant's cost effectiveness in shipping products, you define dimensions such as Material, Plant, Time, Unit of Measure, Ship to Zip Code, and the fact table that has only one dummy measure, Existence, set to value 1 as a fact. Then by simply slicing and dicing dimensions, you get counts of the dimension intersections, as shown in Figure 12-3; for example, you could obtain the number of products shipped from a manufacturing plant to customers living in Boston during the last two years.
Figure 12-3: Factless Fact Table. It Contains No Measures.
Team-Fly |