Fact tables represent business measurements in a dimensional model (more info: 1, 2, 3). They form the center of the star schema and act as a hub among dimensions for “slice-and-dice” operations to create interesting many-to-many relationships. Measurement data in a fact table is most useful if it is additive (prices and number of units for example); however, non-additive data (such as a percentage) as well as partly-additive data (a metric that is only additive in certain contexts) can also be useful under the right circumstances.
Granularity is the defined level of detail of a measurement. The more granular, the greater the detail. The granularity of the fact table will help determine its type and application. More detail allows you to aggregate data in different ways. If there is not enough detail, then drilling down into the data won’t be possible. Here are the three basic types of fact tables as defined by their grain:
These fact tables are common and represent a business function at the transaction level, often representing the lowest-possible level of data available. An example would be a line item on an invoice or sales receipt; a visit, or “hit”, to a page on a website; or perhaps a single pitch in a baseball game.
Accumulating Snapshot grain
This type of fact is not as common as the transaction-level fact, but very important, especially if you are interested in tracking the lifetime of a process. For example, you could use an accumulating snapshot to store a product’s movement through the supply chain. Each row in the fact table would contain date foreign keys that represent milestones in the process (product ordered, removed from inventory, shipped, received, etc.). This fact would compliment your transaction-level fact table, giving you the ability to view a process’s performance.
Periodic Snapshot grain
The periodic snapshot, like the type implies, is a view of a business process at the end of a specific interval. A summary of measurement data is taken at consistent intervals, such as by hour, day, week, or month. These measurements can then be used for trend analysis and to test business performance over time, without the overhead of aggregating transaction-level details.
Each of these fact table types have different applications in the warehouse. Transaction grain is most intuitive to business users and queries against these tables tend to be simple. If the level of detail in the transaction fact is fine enough, then a periodic snapshot will simply be an aggregate of this activity. If the grain is not fine enough, then the periodic snapshot will give additional insight into the operations of the business by providing a better and complimentary view of business data. Accumulating snapshots work exceedingly well when tracking a process over time. Product orders are a great example, as they move from point-to-point in the supply chain. Any business process that expands beyond a single moment in time is a candidate for an accumulating snapshot.
Next, I will discuss different types of dimensions, followed by examples using facts and dimensions in a data warehouse.