After you've been through all your source documents, you'll have a draft description of the entities and attributes in the problem space. Two tasks remain: establishing the relationships between these entities, and reviewing each entity's attributes and constraints.
Although theoretically you could review the attributes first, I find it easiestto start with the relationships, because some of these will become additional entities and some will require that attributes be added to entities you've already identified.
If you're like me, what you have after your first pass through the source documents is a bunch of handwritten notes with arrows and scribbles and "see page 12"s that nobody else could possibly decipher. So the first step in defining relationships is to get these organized neatly. You can start by building the first draft of the Entity Relationship (E/R) diagram of your data model. (If your notes are really messy and you're worried that even you might not be able to read them in three weeks, you might also list the attributes you've identified for each entity.)
Start by choosing an entity, usually one of the core entities in the system, and then add the entities that have some relationship with it. You can define the nature of the relationships (one-to-one, one-to-many, many-to-many) as you go, or you can simply draw a straight line to remind yourself that there is a relationship, and then come back and analyze it later. I usually analyze as I go, but you might find it easier to get all the entities down first and then review them.
The first draft of the E/R diagram for the order-processing example is shown in Figure 12-4. This is a simple example, and the diagram is quite easy to read. (Assume we've decided that Salesperson is an attribute only of the Sales Order entity and not an entity in its own right.) If you're working on a complex example, you might want to create multiple diagrams, each one describing only a subset of the data. In this case, it's a good idea to use some sort of automated support for the diagrams. Otherwise, making sure they're synchronized can get tedious.
Figure 12-4. The First Draft of the Order Processing E/R Diagram
Once you have your draft E/R diagram prepared, you can begin to analyze the relationships in more detail. For each relationship, you want to determine the following:
Figure 12-5 shows the Order Processing E/R diagram after review.
Figure 12-5. The Order Processing E/R Diagram after Reviewing the Relationships
The Cardinality of the Relationship
You might already have indicated the relationships between entities in your first draft, as I did in Figure 12-4. If not, now is the time to do so. Even if you did, it's a good idea to review your decisions now that you have a more complete picture of the whole model.
Where you discover a many-to-many relationship, you should add the junction entity to the model, with a one-to-many relationship on each side. The relationship between Supplier and Product in our model is many-to-many, and so we need to add the entity Product Supplier to resolve it. Note that the relationship between Sales Order and Product is also many-to-many, but in this case the Order Detail entity functions as the junction.
The Optionality of the Relationship
Having established the kind of the relationship between any two entities, you should now consider whether the relationship is optional for either or both participants. In our example, the relationship between Customer and Shipping Method is optional in both directionsthat is, customers are not required to have a default shipping method, and shipping methods can exist without being used by a customer.
The relationship between Product Category and Product, on the other hand, is only optional in one direction. A Product Category need not have products assigned to it, but all products must be assigned to a Product Category.
The relationship between Sales Order and Shipping Method is even more complex. A shipping method can exist independently of a sales order, so the Sales Order side of the relationship is optional. The Shipping Method side of the relationship, however, is optional only if the sales order has special instructions. This is an important constraint and should be noted on the diagram.
The Attributes of the Relationship
In most situations, all that you need to record about the relationship between two entities is the fact of its existence. We need to know, for example, that a specific Customer placed a given Sales Order, and that's all we need to know. Sometimes, however, we need to know additional facts about the relationshipwhen it began or how long it lasted, for example. These facts are attributes of the relationship itself, not of either of the participants.
Where the relationship itself has attributes, it must be modeled as an entity. In the order-processing example, we might decide to designate one Supplier as having "Preferred Supplier" status. Since we already have a junction entity between Product and Supplier, the Preferred Supplier attribute can simply be added to that entity. If that were not the case, we would have needed to add an entity to represent the attributes of the relationship.
Additional Constraints on the Relationship
Finally, we want to consider whether any additional constraints pertain to the relationship. What are the minimum and maximum number of records that can exist on the many side of a one-to-many relationship? Are there any conditions that must be met before the relationship is allowed to exist? Are there any conditions under which the relationship must exist?
In our example, the requirement that the relationship between a Sales Order and a Shipping Method is optional only if Special Instructions have been indicated is one such constraint. The rule that customers cannot place an order until after their credit has been approved is another. Again, this rule has been indicated by an annotation on the diagram. If there are many constraints or the constraints are too complex for a simple annotation, you might need to document them elsewhere. You should, however, at least indicate on the diagram that such constraints exist.