Aggregates and Multi-Cubes
Most users need to access only a subset of information in an InfoCube. In Section 6.2, for example, the East region users were not allowed to access other regions' data. In that case, we could create a new InfoCube, which contained only the East region sales data, a subset of the original InfoCube. Because the new InfoCube is smaller, the required disk I/O volume during query execution will be smaller, too, and the East region users would therefore see improved query performance.
SAP implements this idea in BW and calls the new InfoCube an aggregate. An InfoCube can have multiple aggregates, and the aggregates are transparent to users. It means that we create queries upon InfoCubes, not aggregates. For a query run or a navigation step, the BW OLAP processor is responsible for selecting an appropriate aggregate. If no appropriate aggregate exists, the BW OLAP processor will retrieve data from the original InfoCube.
Different from our need for aggregates, which are subsets of InfoCubes, we also often need to combine data from multiple InfoCubes to do cross-subject analysis, such as from sales to delivery. But from Chapter 5, we know that queries can be created on only one InfoCube. To overcome this limit using the techniques we have learned so far, we must build a larger InfoCube that contains both sales and delivery data.
Suppose we have only one delivery agent, DAGE02 (characteristic IO_DAGE), that can deliver only one unit per day. In that case, the combined sales (from Table 3.3) and delivery data should be as shown in Table 8.1.
IO_CUST | IO_SREP | IO_MAT | IO_PRC | 0UNIT | IO_QUAN | IO_REV | 0CALDAY (Sales) | IO_DAGE | 0UNIT | IO_DQUAN | 0CALDAY (Delivery) |
---|---|---|---|---|---|---|---|---|---|---|---|
CUST001 | SREP01 | MAT001 | 2 | CS | 1 | 19980304 | DAGE02 | X | 1 | XX | |
CUST002 | SREP02 | MAT002 | 2 | CS | 2 | 19990526 | DAGE02 | X | 1 | XX | |
1 record here, with the same data as the above record except the delivery date. | |||||||||||
CUST002 | SREP02 | MAT003 | 5 | CS | 3 | 19990730 | DAGE02 | X | 1 | XX | |
2 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST003 | SREP03 | MAT003 | 5 | CS | 4 | 20000101 | DAGE02 | X | 1 | XX | |
3 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST004 | SREP04 | MAT004 | 50 | EA | 5 | 19991023 | DAGE02 | X | 1 | XX | |
4 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST004 | SREP04 | MAT005 | 100 | EA | 6 | 19980904 | DAGE02 | X | 1 | XX | |
5 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST004 | SREP04 | MAT005 | 100 | EA | 7 | 19980529 | DAGE02 | X | 1 | XX | |
6 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST005 | SREP05 | MAT006 | 200 | EA | 8 | 19991108 | DAGE02 | X | 1 | XX | |
7 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST006 | SREP06 | MAT007 | 20 | EA | 9 | 20000408 | DAGE02 | X | 1 | XX | |
8 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST007 | SREP07 | MAT008 | 3 | DZ | 10 | 20000901 | DAGE02 | X | 1 | XX | |
9 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST007 | SREP07 | MAT008 | 3 | DZ | 1 | 19990424 | DAGE02 | X | 1 | XX | |
CUST008 | SREP08 | MAT008 | 3 | DZ | 2 | 19980328 | DAGE02 | X | 1 | XX | |
1 record here, with the same data as the above record except the delivery date. | |||||||||||
CUST008 | SREP08 | MAT009 | 2 | CS | 3 | 19980203 | DAGE02 | X | 1 | XX | |
2 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST008 | SREP08 | MAT010 | 1 | LB | 4 | 19991104 | DAGE02 | X | 1 | XX | |
3 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST009 | SREP09 | MAT011 | 1.5 | LB | 5 | 20000407 | DAGE02 | X | 1 | XX | |
4 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST010 | SREP10 | MAT011 | 1.5 | LB | 6 | 20000701 | DAGE02 | X | 1 | XX | |
5 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST010 | SREP10 | MAT011 | 1.5 | LB | 7 | 19990924 | DAGE02 | X | 1 | XX | |
6 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST010 | SREP10 | MAT012 | 2 | LB | 8 | 19991224 | DAGE02 | X | 1 | XX | |
7 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST010 | SREP10 | MAT013 | 3 | CS | 9 | 20000308 | DAGE02 | X | 1 | XX | |
8 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST011 | SREP10 | MAT014 | 1 | LB | 10 | 19980627 | DAGE02 | X | 1 | XX | |
9 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST012 | SREP11 | MAT014 | 2 | LB | 1 | 19991209 | DAGE02 | X | 1 | XX | |
CUST012 | SREP11 | MAT015 | 3 | CS | 2 | 19980221 | DAGE02 | X | 1 | XX | |
1 record here, with the same data as the above record except the delivery date. | |||||||||||
CUST012 | SREP11 | MAT015 | 2 | CS | 3 | 20000705 | DAGE02 | X | 1 | XX | |
2 records here, with the same data as the above record except the delivery dates. | |||||||||||
CUST012 | SREP11 | MAT015 | 3.5 | CS | 4 | 20001225 | DAGE02 | X | 1 | XX | |
3 records here, with the same data as the above record except the delivery dates. |
This approach has a drawback: The InfoCube containing the sales and delivery data will be very large and the query performance will consequently be very slow. Also, imagine what would happen if we needed to do cross-subject analysis from purchase, to inventory, to sales, to delivery, and to billing. The InfoCube would become so large that we could not manage it, and query performance would degrade to such a level that we could not receive a report in an acceptable response time, even using aggregates.
To resolve this problem, BW offers a technique called the multi-cube. The InfoCubes we discussed previously are called basic cubes. The multi-cube contains no data, but rather simply links the basic cubes together. We can create queries on a multi-cube just as we did on a basic cube.
With the multi-cube technique, one basic cube is recommended to cover one subject area only. One benefit of this approach is that we save disk space. It should be easy for us to estimate the saved disk space after taking out the repeated sales data from Table 8.1, which is the data in 96 (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 1 + 2 + 3) rows of these 6 columns:
- IO_SREP
- IO_PRC
- 0UNIT
- IO_QUAN
- IO_REV
- 0CALDAY (Sales)
In this chapter, we will demonstrate how to create and use aggregates and multi-cubes.