Summary

In this chapter we've looked at the structure of fact tables in a dimensional database. We saw that a fact table consists of two distinct types of attributes: dimensional keys and facts. Dimensional keys are similar in structure and function to the foreign keys of the relational model in that they link rows in the fact table to rows in the dimensional tables.

The facts in a fact table represent the values to be measured for a specific combination of dimension values. The ideal fact has two characteristics: it is numeric and fully additive. Non-numeric facts are not necessarily incorrect, but when presented with one, you should double-check that it's not actually a dimension in disguise.

Non-numeric facts are, by definition, non-additive. They can be counted or listed, but not summarized in any other manner. Fully additive facts can be counted and listed, and also summarized over any dimension. Semi-additive facts are numeric, but can only be added across some, not all, dimensions.

The grain of a fact table is the level of detail at which facts are captured. You should always use the finest possible grain because the more detail, the more ways the data can be analyzed. You should, however, expect that not all of the source data will be available at the same grain and that determining the formula for allocating these high-level facts to the lower levels can be a contentious process.

There are three primary types of fact tables. The most common is the transaction table, which models individual transactions such as a line item on a sales order. Transaction tables are often combined with snapshot tables, which measure values at specific points in time. The final type of fact table is the coverage table, which may not have any fact attributes at all. Coverage tables most often record that an event occurred.

Finally, we examined handling heterogeneous facts. In a process similar to sub-classing entities in the relational model, heterogeneous facts are handled using multiple fact table/dimension table pairs, a single pair at the generic level and a pair for each category. Unlike sub-classing, however, heterogeneous facts duplicate the core facts in each category-specific table. This avoids the necessity for large-table joins.