Modeling Your Database

When you set out to design a relational database, you need to do a little upfront planning. You need to decide what entities you'll be tracking which ones deserve a table, and how they relate to one another. It's easier to create the right tables and connections the first time than to go back and change them. You can create your own "blueprint" to follow as you build your databasewhat the pros call an entity relationship diagram.

7.2.1. Choosing Entities

Your first step in designing the database is deciding what tables you need and how they fit together. Since every table holds data about a single entity, you normally start by figuring out all the entities in your system. You probably won't be able to list them all in one shot. Everybody forgets some that are less obvious, so start with blank paper or a word processor, and list all the things the database will need to do. This list will help you identify entities.

Now's the time to stretch your mind and think of every possibility. What tasks do you do every day? What do you wish you could doand what information do you need to do it? What do you want your computer to show you when you sit down first thing in the morning?

When your workflow hits a wall, what piece of information would get you moving again? What questions do people keep asking youand how could FileMaker answer them for you? The more your initial plan matches your real needs, the more quickly you'll be up and running. You can see a list like this in Figure 7-3.

Figure 7-3. Here's a list of things your database will do. A list like this helps you start figuring out your entities and possible tables. A more complex database could have a much longer list. Some big systems start with a list several pages long!

With this list in hand, you can start to figure out what entities your database needs to track. For each item on your list, think of all the things it involves. Figure 7-4 shows a possible list.

Most real-world problems have lots and lots of entities, so the second step is to figure out which ones matter. You'll use a process of elimination to remove extraneous items from your list. You might eliminate items for any of four major reasons:

Figure 7-5. After thinking about each entity on your list, you can remove some and possibly add some new ones. When you're through with this step, you have a pretty good list of the entitiesand tablesin your database.

 

7.2.2. Finding Relationships

Now that you have a list of entities, you need to figure out how they relate to one another. To get started, just pick two of your entitiesCustomers and Jobs, for exampleand ask yourself how they go together (if you need some guidance, see the box on Section 7.2.3.) You might come up with this answer: A customer hires me to do jobs, and a job is done for a customer. That sentence tells you two important things:

By comparing different entities in this way, you can figure out how each relates to another, if at all. Your notes as you consider these relationships might look something like Figure 7-7.

You can easily translate most of the items on that list into database relationships. Each tells you about two entities in your list, and how they relate in each direction. You can usually translate these directly into database-ese, as shown in Figure 7-6. That picture is one example of a diagram representing each of these relationships.

Figure 7-6. In this picture, boxes represent entities, and the lines between them indicate relationships. The little forked end on the relationship lines (called a crow's foot) means "to-many" as in "one-to-many." If the line doesn't have a crow's foot, it's to-one. So the relationship between Payments and Invoices is many-to-many, while that between Customers and Invoices is one-to-many.

A glance at Figure 7-6 shows you how entities like Payments, Invoices, and Customers relate to one another, but the relationship with the Time entity isn't so obvious. Is "Time" plural? For that matter, if Time is an entity, then it must be a thing, so what is a time? You've just discovered one of the common challenges to good relational designchoosing good names.

You added Time to your list of entities because you spend time working on a job. That's a little ambiguous, though, so think about what exactly you'll be putting in the database. You'll be logging the time you spend working: what you're doing, when you started, and when you finished. You could call it a work log entry but that's pretty cumbersome. Because this kind of entity is quite common, database types have made up a name for ittimeslip. A timeslip is sort of like one entry on a timesheet. It says what you were doing for one period of time.

Using this language, your relationship description becomes clearer:

Now it's a lot more obvious: This relationship is one-to-many.

Figure 7-7. Here's a list of relationships between entities on your list. (You may have worded things differently, or come up with some that aren't on this list.) If you follow the advice in the box on Section 7.2.1, these sentences will usually exactly describe each kind of relationship.

 

7.2.2.1. One-to-many relationships

Most of the relationships in your diagram are one-to-many, which is normal. One-to-many relationships outnumber all other types by a large margin in almost any system. See the box "Many for One and One for Many" on Section 7.2.1 for a description of the various types of relationships.

7.2.2.2. One-to-one relationships

Your list of entities and relationships shows a one-to-one relationship between Customers and Addresses. For the purposes of this database, one Customer can certainly have one Address, and vice versa. But if that's the case, are they really separate entities? In fact, Address is just an attribute of the Customer entity. That makes it a prime candidate for entity-elimination. Put the address fields in the customer table instead. You might argue that despite the conflict of interest, you could work for two people in the same household and would therefore have to type the same address twice in your Customer table if you didn't have an Address table. The best answer to an argument like that is: Big deal. This situation won't arise often enough to justify a more complicated database just to eliminate duplicating one or two addresses. Even without a separate Address table, you can still separately handle all other tasks for these two clients.

On the other hand, if you're managing a high-school, and it's important to know which students share a home, and which parents they belong to, then an Addresses (or more likely, Households) entity might make sense.

FREQUENTLY ASKED QUESTION

Only One Attribute

So, I'm supposed to eliminate Services as an entity because it has only one attribute (Section 7.2). Is that another rule? Should I remove all entities that have just one attribute?

Not necessarily. There are times when an entity has only one attribute, but you still want to give it a table. For one thing, tables can help promote consistent data entry. Say your new database is going to track Jobs, among other things. A job may have just one attributethe Job Nameso you could just put the job name on each expense, time entry, and invoice. But what if you misspell the job name? Then you'll have expenses that should go together but don't, because FileMaker sees different job names. By creating a Jobs table, you have a central place to enter job names and use it as a value list. With a Jobs table, instead of typing the job name everywhere, you simply choose it from a list.

Wait a minute. All you really need to do that is a list. Why have a table if it has just one column?

OK, wise guy. You could leave out the Jobs table and the job names themselves. Instead, you could just attach each expense, time log entry, or invoice to a customer. But what happens when the same customer hires you six different times? All those expenses, invoices, and entries are lumped together under that customer, and you lose the ability to track them back to the individual jobs. Sometimes a one-attribute entity exists because it gives your database greater powers of organization or categorization. (See the box on Section 6.2.7.1 for more detailsand a warningabout using a single-attribute table for a value list.)

In this case, the fact that two addresses are one and the same justifies your adding another table. As a general rule, unless you can articulate a good reason for its existence, a one-to-one relationship is just two tables where one would normally suffice. (For some clarification, see the box on Section 7.2.3.1.) You'll almost always want to combine entities like people and their addresses into one table.

7.2.2.3. Many-to-many relationships

Many-to-many relationships pose a special challenge. Normally, FileMaker knows which records are related to one another because they have something in common. For example, the Invoice table and the Line Item table both have an Invoice Number field. But you can put only one invoice number in a Line Item record. Luckily, Invoices and Line Items have a one-to-many relationship. If they had a many-to-many relationship, you'd be in trouble.

To fix things, you need to chop your many-to-many relationships in half, turning each into two separate one-to-many relationships. You always need to add a new special-purpose entity in the middle. Your database has only one un-handled many-to-many relationship: Invoices and Payments. To split it up, you need to introduce a special entity. Since it doesn't have a decent name, just call it Invoice Payment (as in "This record represents one invoice paymentone payment on one invoice"). Now, instead of "An invoice is paid with payments, and a payment is applied to invoices," you can say these two things:

UP TO SPEED

Think Relational

If you're dazed and confused trying to figure out how different entities relate, you're not alone. Understanding relational database design takes practice, plain and simple. Here are some ideas to improve your thought process:

  • Don't get hung up on technicalities. At this point in your design, you shouldn't be thinking about database tools like primary keys, foreign keys, or join tables. Those are all implementation details that you can work out later. Right now, just focus on the kinds of things you're keeping track of and how they fit together.
  • Use familiar words. If you're trying to figure out how customers and jobs should be related, say, "A customer hires me to do jobs," not "A Customer entity is related to a Job entity in a one-to-many configuration." As you get the hang of it, you'll discover that the simple sentences you use every day say a whole lot about relationships. For example, if a customer hires you to do jobs (note the plural), then you probably have a one-to-many relationship between customers and jobs.
  • Consider individual items first. Don't think about what customers do. Instead, think about what one customer does. That will tell you whether a single customer has many jobs, or just one job. Then turn it around. Once you've decided a customer hires you to do jobs, ask yourself what a job has to do with customers. "I do a job for a customer." This process tells you that each job is connected to just one customer. By combining these results, you discover that a customer has many jobs, while a job has just one customer. That means Customers and Jobs have a one-to-many relationship.
  • One last tip: Don't let the word "many" hang you up. It's just a standard term to help keep things simple. (Otherwise, you might have a one-to-quite-a-few relationship between Jobs and Invoices, a one-to-a-handful relationship between Customers and Jobs, and a usually-just-one-or-two-to-rarely-more-than-three relationship between Payments and Invoices. Yikes!) "Many" doesn't have to mean "lots. " It might mean "exactly six" or "no more than three. " The only time the relationship isn't one-to-many is when it's one-to-exactly-one.

Figure 7-8 shows the updated diagram.

7.2.3. The Entity-Relationship Diagram

Now that you have a list of entities and their relationships, you're ready to assemble your master plan: the entity relationship (ER) diagram. An ER diagram is a picture that shows all the entities in your database and the relationships between them. Unlike the diagram you've already drawn (in Figure 7-8), each entity appears only once in an ER diagram.

Figure 7-8. In this updated version of the diagram, the Customers and Addresses relationship has been removed, and the Invoice Payments join entity has been added. Now you have only one-to-many relationshipsperfect.

It has two purposes: to help you find relationships you missed, or relationships that don't belong, and to serve as a roadmap for your database. You'll use it when you actually create this database in FileMaker, and when you go back to make changes later. (Yes, you will get to use FileMaker again…pretty soon.)


Note: The diagram you're about to create isn't, in the most technical sense, a real ER diagram. The real kind deals with all kinds of technical details that simply don't matter in FileMaker. Nevertheless, some wisenheimer may chastise you for calling your beautiful picture an ER diagram.


7.2.3.1. Creating an ER diagram

When you assemble an entity relationship diagram, you must put all your entities and relationships together in one big picture. Your goal is to have each entity in the picture just once, and all the lines necessary to define the various relationships.

Your ER diagram is crucial to a successful database designing experience. You'll almost always find ways to improve your database when you put it in a picture. When you set out to actually build the database, the ER diagram guides you through the process. Finally, six years from now, when you need to add more to your database design, the ER diagram will bring youor your successorup to speed on how your database fits together.

You can start by making one box for each entity you've identified. Try to place the boxes on the page so that there's some open space in the middle where your lines can roam free. Then start drawing lines to represent each of the relationships you've come up with. For a simple database, you can usually get the lines in the picture without much difficulty. But creating a larger diagram without the right tools can be a real pain. If you work on paper, you end up starting half a dozen times before you get a good arrangement. If you use a typical drawing program (the drawing capabilities in Word, for example) then you spend copious hours reconnecting lines and entities, reshaping lines, and hand drawing crow's feet as you move things around. See the box on Section 7.2.3.1 for some suggestions to solve this problem.

WORKAROUND WORKSHOP

When to Go One-to-One

When two thingslike people and addresseshave a one-to-one relationship, it usually means you've got an entity you don't need (as in the example on Section 7.2.2.3). There are exceptions to every rule, though. Here are some of the reasons you may see a one-to-one relationship in a database:

  • FileMaker uses record locking to prevent two people from editing the same record at the same time. (Yes, Virginia, two people can use a database at the same timesee Chapter 18 to learn how.) Imagine you have a table of products and every time you sell something, you reduce the product's Quantity In Stock field by one. What happens if someone else is modifying the product description at the same time? FileMaker won't let you enter the inventory adjustment because the record is locked. Should you cancel the order? Allow the Quantity in Stock field to be wrong? Force your customer to wait while you ask your associate to get out of the record? A better solution is to put the inventory levels in a separate table. Each product has a Product record and an Inventory record, so two seemingly conflicting processes no longer lock each other out of your system.
  • Sometimes you'll want to divide data you store about an entity among several tables. A database of stock photography might include high-resolution photographs in container fields. These photos make the database very large. By putting the photos in one table, and the information about the photos in another, you make working with the information easier. You can back up the information to a CD, and the photographs to your high-capacity backup system. (This technique requires keeping each table in a separate filesomething you'll learn about in the next chapter.)
  • Imagine you have customers and employees. Employees don't place orders or make payments, and customers don't have a time sheet. But they both have addresses, phone numbers, and email addresses. What's more, you like to send Holiday cards to all of them every year. In a situation like this, you can create three tables: People, Customers, and Employees. The People table holds all the information that customers and employees have in commonnames, addresses, and so forth. Information unique to an employee (Social Security number, Hire Date, and such) goes in the Employee table, and information only a customer could love (Referral Source, Membership Level, and so on) stays in the Customer table. You would have a one-to-one relationship between People and Employees, and another between People and Customers. When it comes time to send those cards, you can print envelopes right from the People table, and get customers and employees in one shot

When you're done, you should have a single, unified diagram with each entity showing up only once, and every relationship indicated by a line.

POWER USERS' CLINIC

It's All About the Tools

If you plan on doing this more than once, you're best off buying a proper diagramming program. Two excellent choices are Microsoft Visio for Windows and OmniGraffle for Mac OS X. These tools understand ER diagrams. They can hook entities together with ease, draw crow's feet on your behalf (lots of programs just don't understand crow's feet), and keep everything connected as you tinker with the arrangement.

If you give Visio a run, don't be tempted by its built-in database diagramming featuresthey're too complex for FileMaker work. Instead, create a basic diagram and use Rectangles and Dynamic Connectors from the Basic Shapes library. With the Format Lines command, you can put crows feet on your lines.

Both of these tools do more than you'll need for your ER diagrams, but the time you save is well worth the expense (about $200 for Visio and $70 for OmniGraffle). You can find free trial versions at http://office.microsoft.com/visio, and www.omnigroup.com/graffle.

If you're an unrepentant cheapskate, here's a tip: Write the entity names on a piece of paper and cut out each one. Then arrange them on paper, draw lines, and see how it looks. You can slide the entity scraps around a few times to find a decent arrangement, and then commit the whole thing to a clean piece of paper.

When you're thinking about relationships with just pairs of tables, you don't get the big picture. The ER diagram shows you how everything comes together, and when that happens, you often discover tangles of relationships just like those in Figure 7-9. Tangles like these aren't inherently bad; they're just usually completely unnecessary. Take the first tangled groupExpenses, Jobs, and Customers. The diagram tells you that customers have jobs, jobs have expenses, and customers have expenses.

But in point of fact, you don't need all those lines to understand all the relationships, and neither does FileMaker. It turns out that the line between Customers and Expenses is entirely superfluous. Even if it weren't there, you could still see all the expenses charged to a certain customer. Just find all that customer's jobs first (by following the line from Customers to Jobs). Once you've found those, you can look at the expenses for each job. Since customers only incur expenses by way of jobs, you'll get exactly what you want. In other words, If two entities are connected by a path along relationship lineseven through other entitiesthen they're related as far as FileMaker's concerned. It can show you the Expenses for a Customer just as easily as it can show the Jobs for that customer. Figure 7-10 illustrates this concept.

When you're thinking about these implied relationships, pay attention to the crow's feet. If, when moving from one entity to another along the relationship lines, you ever go through a to-many relationship, then the larger implied relationship is itself to-many. This isn't just a clever trick; it's actually intuitive. If a customer has more than one job, and each job has expenses, then clearly a customer can have more than one expense.

Figure 7-9. This diagram (created in OmniGraffle) doesn't look all that badexcept for one thing. The area labeled Rat's Nest has a lot of lines between just a few tables. Expenses, Customers, and Jobs are all interrelatedthere is a relationship between Jobs and Customers, another between Customers and Expenses, and a third between Expenses and Jobs. If you look closely, the same situation exists for Customers, Jobs, and Invoices. In both cases, there's a better way.

Figure 7-10. In this picture, the solid lines represent relationships you've defined among the entities. The dotted lines show relationships you don't have to definethey are implied. As long as there is a pathany pathfrom one entity to another, they are related. You don't have to have a direct path.

When you make your ER diagram, you should get rid of redundancy in your relationships. In other words, remove lines that show direct relationships when the relationship is already implied by other entities and relationships. In your diagram, you can remove the relationship between Customers and Expenses. You can also axe one between Customers and Invoices because Customers can find their Invoices by way of Jobs. With this revision, the ER diagram now looks like Figure 7-11.

Figure 7-11. Here's the final ER diagram that you'll use as the roadmap for creating your database. Notice that every entity is related to at least one other entity in some way. Stray, unrelated entities are so rare in a real database system that if you see one in your diagram, you should check to make sure everything belongs.

Next, you need to figure out what fields your tables need to make the relationships work. But before you get to that step, take one last chance to look over your ER diagram. Keep these points in mind as you check for errors:

If you discover an entity that has no relationships, you may not need it in your database at all. Read back on Section 7.1 and see if that item might belong inside one of the other tables. Or if your diagram has two or more groups of related entities and no relationships between the groups, you might've forgotten to draw in a relationship, or again, you might have one or more entities that your system doesn't need. Go back and make sure you're clear on your one-to-many and many-to-many relationships.

7.2.4. Keys

In a diagram, you can show relationships by simply drawing a line. FileMaker, however, knows what records are related only because they have some data in common. It's time to look more closely at how that works. Take, for example, the typical Invoice table and its little brother, Line Items. The Invoice table probably contains fields like Due Date, Balance Due, and Terms; all attributes of the invoice itself. Then there's Invoice Number field. Unlike the other fields, it's a made-up number. It does one thingidentify an invoiceand does it very well. Without it, you and your customers might have conversations like this: "I need a refund on one of my invoices…you know, the big one…yeah, in February…right, with three items…no, the other one…." As soon as someone mentions an invoice number, though, everybody knows exactly which invoice to look at. More important, FileMaker knows which invoice it is.

The invoice number can do its job as a key field because it has three important characteristics:

Since it's a unique number, if you're talking about invoice #24601, and your customer is talking about invoice #24601, there's no question that you're both referring to the same invoice. Since it's unchanging, you can go back weeks, months, or even years later and find the invoice every time. And since it's consistent, you'll never have lonely Invoices hanging out there without an identifying number. In database terms, the invoice number is called a key. A key is a field whose value uniquely, unchangingly, and consistently identifies one record.

As far as FileMaker's concerned, any field that you use to link one table to another is a key field. Although FileMaker doesn't differentiate between key types, in the larger world of database theory there are two types of keys: primary and foreign. Understanding how the two types of keys work helps you choose appropriate ones for your databases.

7.2.4.1. Primary keys

When you tell FileMaker how two tables are related, you need a way to identify each record: You need a key. On the "one" side of a one-to-many relationship, this key field is called a primary key. Invoice Number is the primary key in the Invoices table. Like the Invoice Number in your database, a primary key is most often a made-up number. You can add a new field to your table, and tell FileMaker to make up a unique value for it each time you create a record. This kind of primary key, based on purely made-up data, is called a surrogate key.


Note: Unless you generate the value in your database, it's not a surrogate key.


Occasionally, your table will have a real value that meets the requirements for a key. For example, if your Product database has a field for your internal Inventory Control Number, you may be able to use that field as the primary key. If you use your Inventory Control Number to relate tables, then it's a natural key.

FREQUENTLY ASKED QUESTION

Going Natural

Should I try to find natural keys for the tables in my database if I can?

Good question! In fact, it's right up there with Coke or Pepsi, Tastes Great or Less Filling, and To Be or Not To Be. Some of the smartest people in the world of database theory say, "Absolutely." Unfortunately other equally smart ones say, "Absolutely not." The general arguments run like this:

Natural Keys are hard to find and once you pick one, the data in that field should be hard to change. If you discover later on that the natural key you chose really does need to change sometimes, you'll be forced to work through the challenges of dealing with it. A surrogate key, on the other hand, will never need to change because it has no meaning. Your boss will never pop into your office and say, "You know what? Invoice numbers do change after all." It happens more often than you think. Suppose you work for a company that assigns an Employee ID to each employee. You're building a database to keep track of employee stock options. Just like Social Security Number, Employee ID is a surrogate key to somebody but it's a natural key to you. You decide to make it your primary key. Then you discover that you need to track stock options for employees even if they quit and then return to the company. When they do this, their Employee ID changes, and your database can't track them properly without some inconvenient upkeep. If you had used your own surrogate key instead, this wouldn't be a problem.

On the other hand, surrogate keys don't mean anything. If you're looking at the data in the Stock Options table, you might have a field called "Employee Key." If you need to put the Employee ID on an Options Exercise form to turn in to HR, your Employee Key won't do you any good. Instead, you have to look him up in your Employees table and find the correct Employee ID. If you had used a natural key instead, you'd save the extra step. In other words, meaningful keys make your life a little easier. But FileMaker's work gets a little harder. If you use surrogate keys, FileMaker always has to visit another record in another table to show any meaningful related data. With a natural key, at least one piece of useful data is stored right in the current record, as a foreign key. Although the difference is very small, natural keys can make your database run a little faster.

If you're not yet persuaded, do more research and make your own choice. A Google search for "Surrogate and Natural Keys" will bring up a wealth of pages full of arguments for either side. Just remember, the penalty for choosing an unsuitable natural key is stiff. At minimum, you risk losing the relationship between your tables in a few records. At worst, you'll spend enormous amounts of time tracking hundreds of broken relationships and rebuilding your database using a more reliable primary key.

You might be tempted to use a birth date or a Social Security number as a natural key. But remember, in order for a natural key to be reliable, it must be unique, unchanging, and consistent. Social Security numbers are supposed to be unique and unchangingbut duplicate numbers and identity theft have put a big dent in that theory. And you can't even rely on everybody in your database having a Social Security number. While most Americans have one, you'll have to redesign your database the minute your company wants to do business with a customer who doesn't live in the US.

In practice, you'll almost never find a suitable natural key. Natural keys almost always originate from surrogate keys in another database system. For instance, your database might track invoices createdand numberedby a separate accounting system. If those numbers are unique, unchanging and consistent, you can use them as a natural primary key. (See the box on Section 7.2.4.1 for more perks and perils of natural keys.)

7.2.4.2. Foreign keys

If the Invoice Number field in the Invoices table is the primary key, what is the Invoice Number field in the Line Items table? It's not a primary key because it isn't uniquelots of line items could have the same Invoice Number. In this case, it's called a foreign key. A foreign key is simply the primary key from another table, and you'll find it on the "many" side of a one-to-many relationship. A table has only one primary key, but it can have lots of foreign keys (or none at all).

But where do you put these foreign keys? In the Invoices database, how did you decide to put the Invoice Number field in the Line Items table? Why not create a Line Item ID field in the Line Items table, and put it in the Invoices database? Wouldn't that accomplish the same thing? At first glance, it might seem like both methods would produce an identical relationship. After all, they sure look the same in a picture (Figure 7-12).

Figure 7-12. These two relationships look almost identical. The only difference is that a primary key in the Invoices file relates the top tables. A primary key in the Line Items table, on the other hand, relates the bottom tables. That, it turns out, has made all the difference.

Think about what a primary key means. Each value identifies one, and only one, record in the table. If you build a relationship based on Line Item ID, then each invoice can have only one line item. If it had two, there would be two line items with the same line item ID. When you have a one-to-many relationship, you must put the foreign key in the table on the many side. Luckily, keeping this information straight in your head is a breeze. Since foreign keys belong on the to-many side of a relationship, just remember this rule: When you see a crow's foot on your ER diagram, you need a foreign key in the table to which it's attached. (But if you're getting confused, the box below should clear things up.)


Note: When you see a field name like Line Item ID, think primary key. ID, in this sense, means the field identifies a record. Primary keys in a database are often called IDs.


POWER USERS' CLINIC

Composite Keys

A key value doesn't necessarily have to be stored in just one field. Sometimes you can get the value that uniquely identifies a record by combining several field values into one. This kind of key is called a composite key. Suppose you have a database of photographs you've taken. Your Photographs table holds the photo itself, as well as some information about each photo: Date and Time Taken, Roll Number, Shot Number, and Film Speed. If you wanted to use a natural key, you could combine Roll Number and Shot Number. No two rolls have the same number, and you'll never take the same shot twice on a single roll, so this combined value is always unique.

If you choose to use a composite key, you'll have to add multiple fields to each related table that needs to store the foreign key. For example, if you sell your photographs, and you want to relate the Photographs table to an Order Line Items table, you'll have to put both the Roll Number and the Shot Number in the Order Line Items table. When you do, it will be the combination of these two fields that forms the foreign key. See Section 8.7 to learn how to link tables with a multi-key, or composite, relationship.

You can have as many fields in a composite key as you need.

 

7.2.5. Join Tables

A many-to-many relationship is more complicated than its one-sided brethren. Imagine your database tracks payments from your customers. A customer could easily send a check to cover two invoices. On the other hand, another customer might send a check to cover just part of an invoice, with the promise to send another later. So an invoice can have multiple payments, and a payment can be for multiple invoices: many-to-many.

How do you build a relationship like this? If you put the Payment ID in the Invoices table, then a payment can be applied to more than one invoice (just put the same Payment ID in each invoice record). But an invoice can have only one payment since it has just one Payment ID field. If you put the Invoice Number in the Payments table, you get the same problem in the other direction. You might be tempted to try putting a foreign key field in both tables. In other words, add a Payment ID field to the Invoice table, and an Invoice Number field to the Payments table. Dig a little deeper and you'll see that this has a whole host of problems:

Luckily, there's a better way to accomplish a many-to-many relationship. You need something called a join table (see Figure 7-13). A join table doesn't usually represent a real entity. Instead, each record represents a relationship between two records in the related tables.

Figure 7-13. A join table's main job is to facilitate a many-to-many relationship. This one holds foreign keys for Invoice Number and Payment ID. If you want to attach a payment to an invoice, add a new record in the join table, with the correct Invoice Number and Payment ID. Once the record is added, the payment and invoice are properly connected in both directions. Since you can add as many records as you want to the join table, you can flexibly relate multiple invoices and payments.

If it helps, think of join tables this way: Invoices and Payments both have a one-to-many relationship to the join table. So one invoice can connect to many join records, each of which connects to one payment. Likewise, one order can connect to many join records, each of which connects to one invoice. So you get many related records in both directions. A join table always contains two foreign keys, one from each table it's joining. If your database has many-to-many relationships, you may have to create a join table to hold just these two keys, but sometimes your database may already have a real table that can act as a join table. Figure 7-14 shows an example.

Other times, you can use a join table to hold fields that don't quite belong in any other table. Suppose you wanted to record what portion of a payment was applied to each invoice. For example, if a customer hands you a check for $100 and you have two outstanding invoices for that customer, for $80 and $30, you might want to decide how to allocate the payment. Perhaps you apply $80 to the first invoice and $20 to the second. This dollar amount applied to each invoice can't be stored in the Invoice table because an invoice might have several payments. It can't be stored in the Payment table because you have two amounts and only one payment record. The best place for it is right in the join table itself.

Figure 7-14. This database tracks orders for the products you sell. Since a product can be ordered many times, and an order can include many products, the Orders and Products tables have a many-to-many relationship. A Line Items table connects them. It records which products were on each order, but it also shows how many, and at what price. A line item is a real entity, but the Line Item table also serves as a join table.

At this point, you have a good ER diagram, you've eliminated many-to-many relationships by adding join tables, and you understand what foreign keys you'll need to make it all work. You're (finally) ready to actually build your database.

Категории