What Is a Database?

Database terminology is almost as slippery as the term "object-oriented programming." The word "database" is used to describe everything from a single set of data, such as a telephone list, to a complex set of tools, such as SQL Server, and a whole lot in between. This lack of precision isn't a bad thing, necessarilyit's just the nature of languagebut it's not particularly useful for our purposes, so I'll try to be a bit more precise here. Figure 1-1 shows the relationships between the terms discussed here. We'll define these terms in this chapter, and examine them in detail in the rest of the book.

Figure 1-1. Relational Database Terminology

Although relational databases don't have real-world analogies, most are intended to model some aspect of the real world. I'll call that bit of the real world the problem space. The problem space, by its nature, is messy and complexif it weren't, we wouldn't need to build a model of it. But it is critical to the success of your project that you limit the database system you're designing to a specific, well-defined set of objects and interactions; only by doing so can you make sensible decisions about the scope of your system.

I'll use the term data model to mean the conceptual description of the problem space. When you're working with the relational model, this includes the definition of entities, their attributes (a Customer, for example, is an entity, and it might have the attributes Name and Address), and the entity constraints (such as, for example, that the CustomerName field cannot be empty). When you're working with the dimensional model, this definition consists of facts and dimensions, but we'll discuss those in Part II.

The data model also includes a description of the relationships between entities and any constraints on those relationships. For example, managers are not allowed to have more than five individuals reporting to them. It does not include any reference to the physical layout of the system.

The definition of the physical layoutthe tables and views that will be implementedis called the database schema or just the schema. It's the translation of the conceptual model into a physical representation that can be implemented using a DBMS. Note that the schema is still conceptual, not physical. The schema is nothing more than the data model expressed in the terms that you will use to describe it to the database enginetables and triggers and such creatures. One of the benefits of using a database engine is that you don't ever have to deal with the physical implementation; you can largely ignore B-trees and leaf nodes and other low-level physical concepts.

Once you've explained to the database engine what you want the data to look like, using either code or an interactive environment such as Microsoft Access or the SQL Server Enterprise Manager, the engine will create some physical objects (usually, but not always, on a hard disk someplace) and you'll store data in them. The combination of structure and data is what I'll refer to as a database. This database includes the physical tables that store the data; the defined views, queries, and stored procedures used to retrieve the data in various ways; and the rules the engine will enforce to protect the data.

The term "database" does not include the application, which consists of the forms and reports with which your users will interact, nor does it include any of the bits and piecesthings such as middleware or Internet Information Serverused to stick the front and back ends together. The term "database" also excludes the database engine. Thus, an Access .mdb file is a database, while Microsoft Jet is a database engine. Actually, an .mdb file can contain application objects as wellforms and reports, for examplebut that's a topic we'll discuss later.

To describe all these componentsthe application, the database, the database engine, and the middlewareI'll use the term database system. All of the software and data that goes into making a production system is part of the database system.