Data

Modeling Temporal Data - Background (Part 1)

2026-06-21 Bitemporal Event Sourcing Data Vault

This blog series contains my thoughts on how to model temporal data. This initial entry provides the background by (1) defining what temporal data is, (2) explaining when it's needed, and (3) uncovering the challenges with implementing it.

Standard Situation

Most data models just store data representing the current state of the system. In most domains, only knowing the current state is fine because very few attributes are updated. In other words, most of the individual data points are written via inserts versus updates.

Those few attributes that change organically are referred to as slowly changing dimensions. Examples include full name, mailing address, account status, and account tier. While knowing the value of these attributes at any point in time might be beneficial, the complexity of adding temporality to a data model usually outweighs the benefit. Do you really need to know that the customer's name was Joan Smith when shipment A was sent two years ago, versus the current value of Joan Adams? Probably not.

When Do We Need Temporal Data?

Temporal data is needed when the domain:

I will be focusing on property and casualty (i.e. general) insurance since that domain is my expertise. In my experience over 30+ years, most insurance systems (both core insurance applications and analytics platforms) rely on the Standard Situation approach. While simpler to model and maintain, it leaves behind critical temporal information that is needed to run and evaluate the business.

As an example, let's consider an insurance policy. A policy is the contract between an insurance company and an insured (purchaser of policy) that indemnifies the insured for financial loss caused by covered perils / events related to specific covered people and assets. There are many attributes that can change over time for a policy, such as (1) coverage limits, (2) coverage deductibles, (3) what assets are covered, (4) who is covered, etc. A specific policy change example: a family member begins attending college and is eligible for the Student Away at School discount.

Insurance policies are not limited to a few slowly changing dimensions; significant parts of a policy's state can change during the life of the policy. Insurance systems need to know the exact state of a policy at various points in time to (1) provide the exact coverage that was in force at the time a claim occurred, (2) provide accurate financial reporting as of fixed accounting dates, (3) provide accurate analytical data as of fixed evaluation dates, etc. If only the current state is available, it is not possible to time travel the data for these use cases.

Temporal Modeling Option: Bitemporal

I recommend reading this entry from Martin Fowler on bitemporal history. The key concept is temporal data has 2 time dimensions: (1) when something is valid in "real life" (actual history), and (2) when the database knows that something in "real life" is valid (record history). Insurance data is rife with retroactive changes, thus necessitating bitemporal vs unitemporal data. For example, a policy change that is effective on July 1st might not be entered into the database until August 1st. We would need to know that the policy as of July 1st looked different on July 15th versus on August 15th.

Bitemporal data models are very complex and have uneven support in standard SQL databases. Specialized databases exist but are very niche. When selecting a database, I always follow the herd and choose one of the popular options.

Which leads to another alternative: event sourcing.

Temporal Modeling Option: Event Sourcing

I recommend reading this entry from Martin Fowler on event sourcing. Much more has been written about event sourcing and I encourage anyone to fall down that rabbit hole. Another more in-depth entry can be found in this presentation from Martin Kleppmann about turning the database inside out. The core idea is not to store state, but to store both the events that modify state and the resulting alterations to state. Point in time state is then derived by replaying the events up to the desired point in time.

Event sourcing suffers from the same issues as bitemporal data models: it's very complex, not well supported in standard SQL databases, and specialized databases exist but are niche. When using a standard SQL database, the method must be hand-rolled onto the database and manually maintained (or find an extension that implements the tooling).

Temporal Modeling Option: Data Vault 2.0 (DV2)

This modeling technique is used for downstream use cases (e.g. operational or analytic data stores) rather than as a core backend system data model. The primary use case for DV2 tends to be as an integration layer for combining data from many (10+) different systems. Another key DV2 use case, however, is storing temporal data in downstream systems.

Learning more about DV2 is another big rabbit hole. While free content exists, the most authoritative content sits behind various paywalls (books and training). Based on my review of what's available, I would recommend the book "The Elephant in the Fridge" by John Giles. I believe it provides the best balance of simplicity and thoroughness.

Ten years ago I conceived of a temporal analytic data model that, as it turned out, was similar to the temporal concepts in DV2. Instead of a single table for a business object or concept, the DV2 data table is parsed into 3 tables:

Both the satellite table and the link table are versioned either unitemporal (record history only) or bitemporal (via Effectivity Satellites).

The two main DV2 concepts that will be explored in future posts in this series:

Definitions

While blog posts are usually handwritten by me, this definitions section is an exception. These definitions include direct assistance from GenAI robots.

Attribute

A descriptive (non-key) column in a table that is a property of an object. Each row stores one data point for that attribute.

Data Point

The single value located at the intersection of a specific column and row in a table. If the table were a spreadsheet, the data point would be the value of a single cell, like $A$1.

Domain

The specific industry, subject area, or real-world problem a software application is built to model and solve.

State

The snapshot of a system's database values at a single moment in time. The current state is that snapshot taken at the present moment.

Temporal Data

Data that records the history of how an attribute value changes over time. The data is modeled so you can see a snapshot of the entire system as it existed at any point in time. In other words, temporal data is data that you can readily time travel. For more details, see temporal patterns.