Aggregates and Multi-Cubes

Most users need to access only a subset of information in an InfoCube. In Section 6.2, for example, the East region users were not allowed to access other regions' data. In that case, we could create a new InfoCube, which contained only the East region sales data, a subset of the original InfoCube. Because the new InfoCube is smaller, the required disk I/O volume during query execution will be smaller, too, and the East region users would therefore see improved query performance.

SAP implements this idea in BW and calls the new InfoCube an aggregate. An InfoCube can have multiple aggregates, and the aggregates are transparent to users. It means that we create queries upon InfoCubes, not aggregates. For a query run or a navigation step, the BW OLAP processor is responsible for selecting an appropriate aggregate. If no appropriate aggregate exists, the BW OLAP processor will retrieve data from the original InfoCube.

Different from our need for aggregates, which are subsets of InfoCubes, we also often need to combine data from multiple InfoCubes to do cross-subject analysis, such as from sales to delivery. But from Chapter 5, we know that queries can be created on only one InfoCube. To overcome this limit using the techniques we have learned so far, we must build a larger InfoCube that contains both sales and delivery data.

Suppose we have only one delivery agent, DAGE02 (characteristic IO_DAGE), that can deliver only one unit per day. In that case, the combined sales (from Table 3.3) and delivery data should be as shown in Table 8.1.

Table 8.1. SALES (FROM TABLE 3.3) AND DELIVERY DATA

IO_CUST IO_SREP IO_MAT IO_PRC 0UNIT IO_QUAN IO_REV 0CALDAY (Sales) IO_DAGE 0UNIT IO_DQUAN 0CALDAY (Delivery)
CUST001 SREP01 MAT001 2 CS 1   19980304 DAGE02 X 1 XX
CUST002 SREP02 MAT002 2 CS 2   19990526 DAGE02 X 1 XX
1 record here, with the same data as the above record except the delivery date.
CUST002 SREP02 MAT003 5 CS 3   19990730 DAGE02 X 1 XX
2 records here, with the same data as the above record except the delivery dates.
CUST003 SREP03 MAT003 5 CS 4   20000101 DAGE02 X 1 XX
3 records here, with the same data as the above record except the delivery dates.
CUST004 SREP04 MAT004 50 EA 5   19991023 DAGE02 X 1 XX
4 records here, with the same data as the above record except the delivery dates.
CUST004 SREP04 MAT005 100 EA 6   19980904 DAGE02 X 1 XX
5 records here, with the same data as the above record except the delivery dates.
CUST004 SREP04 MAT005 100 EA 7   19980529 DAGE02 X 1 XX
6 records here, with the same data as the above record except the delivery dates.
CUST005 SREP05 MAT006 200 EA 8   19991108 DAGE02 X 1 XX
7 records here, with the same data as the above record except the delivery dates.
CUST006 SREP06 MAT007 20 EA 9   20000408 DAGE02 X 1 XX
8 records here, with the same data as the above record except the delivery dates.
CUST007 SREP07 MAT008 3 DZ 10   20000901 DAGE02 X 1 XX
9 records here, with the same data as the above record except the delivery dates.
CUST007 SREP07 MAT008 3 DZ 1   19990424 DAGE02 X 1 XX
CUST008 SREP08 MAT008 3 DZ 2   19980328 DAGE02 X 1 XX
1 record here, with the same data as the above record except the delivery date.
CUST008 SREP08 MAT009 2 CS 3   19980203 DAGE02 X 1 XX
2 records here, with the same data as the above record except the delivery dates.
CUST008 SREP08 MAT010 1 LB 4   19991104 DAGE02 X 1 XX
3 records here, with the same data as the above record except the delivery dates.
CUST009 SREP09 MAT011 1.5 LB 5   20000407 DAGE02 X 1 XX
4 records here, with the same data as the above record except the delivery dates.
CUST010 SREP10 MAT011 1.5 LB 6   20000701 DAGE02 X 1 XX
5 records here, with the same data as the above record except the delivery dates.
CUST010 SREP10 MAT011 1.5 LB 7   19990924 DAGE02 X 1 XX
6 records here, with the same data as the above record except the delivery dates.
CUST010 SREP10 MAT012 2 LB 8   19991224 DAGE02 X 1 XX
7 records here, with the same data as the above record except the delivery dates.
CUST010 SREP10 MAT013 3 CS 9   20000308 DAGE02 X 1 XX
8 records here, with the same data as the above record except the delivery dates.
CUST011 SREP10 MAT014 1 LB 10   19980627 DAGE02 X 1 XX
9 records here, with the same data as the above record except the delivery dates.
CUST012 SREP11 MAT014 2 LB 1   19991209 DAGE02 X 1 XX
CUST012 SREP11 MAT015 3 CS 2   19980221 DAGE02 X 1 XX
1 record here, with the same data as the above record except the delivery date.
CUST012 SREP11 MAT015 2 CS 3   20000705 DAGE02 X 1 XX
2 records here, with the same data as the above record except the delivery dates.
CUST012 SREP11 MAT015 3.5 CS 4   20001225 DAGE02 X 1 XX
3 records here, with the same data as the above record except the delivery dates.

This approach has a drawback: The InfoCube containing the sales and delivery data will be very large and the query performance will consequently be very slow. Also, imagine what would happen if we needed to do cross-subject analysis from purchase, to inventory, to sales, to delivery, and to billing. The InfoCube would become so large that we could not manage it, and query performance would degrade to such a level that we could not receive a report in an acceptable response time, even using aggregates.

To resolve this problem, BW offers a technique called the multi-cube. The InfoCubes we discussed previously are called basic cubes. The multi-cube contains no data, but rather simply links the basic cubes together. We can create queries on a multi-cube just as we did on a basic cube.

With the multi-cube technique, one basic cube is recommended to cover one subject area only. One benefit of this approach is that we save disk space. It should be easy for us to estimate the saved disk space after taking out the repeated sales data from Table 8.1, which is the data in 96 (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 1 + 2 + 3) rows of these 6 columns:

In this chapter, we will demonstrate how to create and use aggregates and multi-cubes.

Категории