Accessing and Analyzing Data with Microsoft Excel (Bpg-Other)
Data appears not only in data lists, but also in tables, groups of records separated by characters (such as a comma), and so on. When a group of data records (for example a single data list, a data table, or a file of character-separated data values) exists on its own without relying on another group of data records, the data is said to be nonrelational. When a group of records relies on other groups of records for complete, detailed business facts, the data is known as relational. Relationships are usually defined on a one-record-to-one-record basis or on a one-record-to-many-records basis. (These relationships are also known as one-to-one relationships and one-to-many relationships.)
In the terms of data analysis, multidimensional data is different from relational data. The concepts used in working with relational data are vastly different from those used to work with multidimensional data. While relational data is usually stored in a file-based or server-based database management system, multidimensional data usually resides in an online storage location such as a data warehouse, a central repository for capturing large amounts of data records.
Working with Relational Data
You work with relational data when you bring together business facts that are spread out over multiple, related data tables in order to analyze sets of values, fields, or records. Relational data lends itself well to data analysis techniques such as sorting, filtering, and querying. You can also summarize relational data, but because of the time it takes to create summarizations, if you are working with thousands or millions of data records, you should consider moving the data to a multidimensional format, which I’ll discuss in more detail in the next section.
Your Turn
In this exercise, you will explore the difference between relational and nonrelational data.
-
Start Microsoft Access.
-
Open the
Relation.mdb file in the Chap02 folder. -
On the View menu, point to Database Objects and then click Tables.
-
In the Database window, double-click the Nonrelational Data icon. Notice in Figure 2-10, which shows the Nonrelational Data table, that many addresses are repeated.
Figure 2-10: The Nonrelational Data table. -
Close the Nonrelational Data table.
-
In the Database window, double-click the Relational Data–Customer Info icon. Notice in Figure 2-11 that in this table each address appears only once. This illustrates good relational database design: if you need to change a particular customer’s information, you need to change it in only one place and for only one record in the Relational Data–Customer Info table.
Figure 2-11: The Relational Data–Customer Info table. -
Close the Relational Data–Customer Info table.
-
In the Database window, double-click the Relational Data– Receipts icon. Notice here that the addresses are not displayed; you see only the receipt information, shown in Figure 2-12. Customer information is related to receipts by customer number.
Figure 2-12: The Relational Data–Receipts table. -
Close the Relational Data–Receipts table.
-
On the Tools menu, click Relationships. In the Relationships window, notice that the Relational Data–Customer Info table is related to the Relational Data–Receipts table by customer number. (See Figure 2-13.) What this means is that for each receipt in the Relational Data—Receipts table, there must be a customer number in the Relational Data–Customer Info table. This also means that if you remove a customer’s information from the Relational Data–Customer Info table, Access might prevent you from doing so if removing the information would result in orphaned records (records with no matching records in another table) in the Relational Data–Receipts table.
Figure 2-13: Two tables related by Customer Number.
Relational data reduces duplicate data values. You will learn more about relational data in Chapter 5.
Working with Multidimensional Data
In contrast to relational data, you should use a multidimensional data format when you want to look at overall data summarizations or trends, especially when you need to analyze thousands or even millions of data records. The more time an organization invests in planning its specific multidimensional data formats, the more data summarizations the organization should be able to produce.
Note | NoteThese steps do not work in Excel 2000. |
Your Turn
In this exercise, you will explore data in a multidimensional format.
-
Start Excel.
-
On the File menu, click Open.
-
In the Files Of Type list, select All Files.
-
Locate and open the
Sales.cub file in the Chap02 folder. A PivotTable report appears. -
From the PivotTable Field List window, drag the Product icon to the Drop Page Fields Here area of the PivotTable report.
-
From the PivotTable field list, drag the Customers icon to the Drop Row Fields Here area of the PivotTable report.
-
Drag the Profit icon to the Drop Data Items Here area of the PivotTable report. Your report layout should look similar to the report layout shown in Figure 2-14.
Figure 2-14: PivotTable report layout. -
In cell A4, click the arrow, click the plus symbol (+) next to the USA check box, select the check boxes for CA, OR, and WA, and then click OK. Profits are displayed for all three states. Check your progress against Figure 2-15.
Figure 2-15: PivotTable report showing profits for all three states. -
In cell B1, click the arrow, click the plus symbol next to All Products, click Drink, and then click OK. Profits are displayed for just drink products, as you can see in Figure 2-16.
Figure 2-16: PivotTable report showing profits for drink products.
Multidimensional data allows you to perform faster data summarizations. You will learn more about multidimensional data in Chapters 7 through 9.
Категории