Team LiB
Previous Section Next Section

Database Schema Components

Once you've completed the conceptual data model and decided on the system architecture, you have most of the information you need to build the database schema. The database schema is a description of the data objects you will include in the database. If you've chosen anything other than a standalone data architecture, the database schema will also define where each object is to be deployed.

If you're implementing your system in Access, your database schema will include the definition of each table, query, and relationship. It will not include a description of the system's forms, reports, and code components, even though these are also stored in an .mdb file. If you're implementing the system in SQL Server, your database schema will include the definition of each table, view, stored procedure, and trigger in the database.

Defining Tables and Relationships

The definition of the physical tables in the database schema is derived directly from the conceptual data model. Entities become tables, and the table's fields are the entity's attributes. For the most part, this is a simple process of direct translation. The only areas that require special attention are constraints, relationships, and indices.


As part of defining the conceptual data model, you defined constraints for entities, attributes and domains. Whether or not you implement these constraints in the database schema depends, as we've seen, on the choices you made about the system architecture. As I've said, some designers prefer to implement all constraints only at the Data Interface and Transaction Interface layers of the four-layer model, or the Business Services layer of the three-tiered model.

In most circumstances, I recommend that you implement constraints at both levels. Assuming that you agree with me and have decided to include constraints in the database itself, you will define them as part of the database schema. We discussed implementing data integrity in detail in Chapter 4, but it's worthwhile to review it here.

Most domain and attribute constraints will become field-level constraints in the database schema, usually as validation rules in Access. Access also supports the CHECK constraint clause used by SQL Server if you chose to create the database using SQL statements rather than DAO or the Access user interface.

Entity-level constraints usually become table constraints, again, either as validation rules or through the SQL CHECK constraint. You implement the entity integrity constraint, which specifies that each instance of an entity be uniquely identifiable, by defining a primary key for each table.

Whether you're implementing the database using SQL Server or the Jet database engine, you might find that some of the constraints defined in the conceptual data model can't be implemented as part of the table definition. In SQL Server, you might be able to enforce the constraint using a trigger. Since the Jet database engine doesn't support triggers, you'll need to implement these constraints as part of the application.


We discussed the ways relationships between entities are modeled in a relational database in Chapter 3 and again in Chapter 13. The first step is always to include a unique identifier from the primary relation in the foreign relation. At the level of the database schema, this means that the primary key fields from the primary table will be included in the foreign table.

Some designers stop there, preferring to handle referential integrity only in the application rather than allow the database engine to do so. As with all database validation, in my own work I do both: I validate referential integrity in the application for usability and in the database engine for safety. I suppose if I were a man, I'd wear both a belt and suspenders.

We discussed previously the importance of indices for system performance. Every table should have at least one index, which the database engine will create automatically when you declare the primary key. In addition, you should create an index on any field or combination of fields that you will use to join tables. This usually isn't a problem for the table that represents the primary relation since the fields used for the join are the primary key. However, you might need to declare additional indices in the table representing the foreign relation if the field or fields used for the join don't make up the entire primary key in the foreign relation.

If the foreign key field or fields participate in the primary key but aren't the whole key, I define a separate index on the foreign key. For example, an OrderItems table usually has a primary key of {OrderID, ItemID}. Even though the primary key index could be used to join the table with the master Orders table in most circumstances (well, every circumstance I can think of), I would probably still create a separate index on OrderID just to be certain.

Any fields that will be used to sort the data should also be indexed. For example, customer lists are usually sorted by customer name and orders by date, even though neither of these fields usually participates in the primary key or forms part of a join. Indexing the fields will make the sorting process easier and more efficient.

It's possible to go overboard in creating indices, so be careful here. Remember that a small, but cumulative, amount of overhead is involved in maintaining each index. Any field that will be used for sorting the table frequently should be indexed, but you can always use the SQL ORDER BY clause to sort records without using an index.

The practical maximum number of indices per table really depends on how often the table is updated. (The overhead is incurred only when a record is added or the indexed field is updated.) For a table such as Orders that the system will update more or less constantly, I'd be careful about maintaining more than 10 or 15 indices, including those used to support joins and the primary key. You might be justified in using more indices on a Products table, on the other hand, which is typically updated infrequently but used in many ways throughout a system. As always, you must base the decision on how the data is to be used.

Views and Queries

Both Access and SQL Server provide a mechanism for storing SQL SELECT statements. These stored statements are called views in SQL Server and queries in Access. (I'll call them queries here, as that's the more common term.) In most cases, using a stored query will be faster than executing a SELECT statement on the fly; this isn't always the case, but the situations in which it isn't are so quirky that you can take this as a general rule.

You can begin deciding which queries to include in the database schema by examining the conceptual data model for complex entities. Remember that a complex entity is a single logical entity that is modeled by two or more tables for efficiency. You should include a query that de-normalizes any complex entities in your model. Most of these will be tables in a one-to-many relationship such as Orders and OrderItems, but you might also have complex entities that have been sub-classed with one-to-one relationships, and you should include queries to support them as well.

Users will frequently need to find particular records in the primary entities in the systema specific customer or order, for exampleand this is the second place to look for queries to be included in the database schema. All these common searches should be supported by a parameter query allowing users to specify the particular record to be found at runtime.

Sometimes you'll need to provide more than one "find" query for an entity. A user might need to find an Order by looking up the OrderDate, CustomerID, or OrderID, for example. Each of these should be supported by a separate parameter query.

On the other hand, users won't search through all tables. You might have a table in the database schema that contains a list of U.S. states. These lookup tables are extremely useful, but it's extremely unlikely that users will ever need to search for a specific state record.

You should also look for queries in the forms and reports implemented by the application. You'll need queries to link the fields and also to support lookups such as those used for combo boxes. If the system has form dependencies, you'll need a parameter query to support them as well. An example of this might be a dialog box that's called from an order entry form to display Customer details.

Based on the system's work processes, you will also want to include in your database schema queries (and perhaps stored procedures in SQL Server) that perform actions. If you know that the system will regularly archive orders or update product prices, it will be more efficient to support these activities with queries or stored procedures than to issue the commands on the fly.

Additional action queries will probably be added to the database schema during implementation. Unlike indices, essentially no overhead is involved in queries and stored procedures once they're implemented, so you need not hesitate about adding them to the database schema.

Remember that systems development is not a strictly linear process. While changes to the tables themselves during implementation can cause problems (and the further in the development process, the greater the problems), adding queries to the schema is trivial and to be expected.

    Team LiB
    Previous Section Next Section