The incident
The orders warehouse looked like a success. It loaded on time, the numbers tied to the source, and the first dashboards shipped in six weeks. One decision was made early, almost without noticing: the fact table held one row per order — order date, customer, total amount. Clean, small, fast.
Finance wanted revenue by product line. But an order could contain many products, and only the order total had been stored. The detail was discarded at load time — there was no way to attribute revenue back to products. The answer wasn't "run a different query." It was "we do not have that data in a usable form."
The fix wasn't a patch. It meant rebuilding the fact table at one row per order line, re-ingesting three years of history, re-pointing every report, and re-validating every published number. Start to finish: roughly six months.
The decision that cost six months was one sentence, never written down: "one row per order." That sentence is a declaration of grain — and it was chosen for the wrong reason: the smallest, cleanest table, rather than the grain that could answer the questions the business would ask.
The four decisions, in order
A dimensional model is built by a disciplined sequence — four decisions, each constraining the next. Skipping or reordering them is how warehouses go wrong. Tap each one.
Business process
Which events?
→Declare the grain
What is one row?
→Choose dimensions
The "by" words.
→Identify facts
The measurements.
Lab · Reproduce the six-month mistake
Pick the grain for the orders fact table, then ask it the questions the business actually asked. Watch which grain can answer — and which one threw the detail away.
The grain picker
Step 1 — choose how much detail one row records:
Step 2 — this is what the fact table physically holds at that grain:
Step 3 — now let the business ask its questions:
You can always aggregate up from atomic detail. You can never disaggregate down into detail you discarded. Revenue by product is a GROUP BY away if you stored order lines — and unrecoverable if you stored only totals.
Lab · Can you sum this fact?
A fact's additivity governs how it may be safely aggregated. Get it wrong and the dashboard renders a confident, incorrect number. Classify each measure — additive, semi-additive, or non-additive.
Additivity classifier
Lab · The history that quietly rewrote itself
A customer moves from the West region to the East in March. They made a sale in January. When you re-run last year's regional revenue report, where should that January sale appear? Your SCD choice decides.
Slowly changing dimensions
Customer dimension
Last year's regional revenue
Before you design a model
Run this before the first table exists. Each item is cheap now and expensive later. Tap to check them off.
You know your model is correct because you decided the grain deliberately, stored atomic detail, split measurements from context, keyed dimensions on surrogates you control, and chose an SCD type for each attribute by asking whether its history matters. Correctness is the residue of decisions made on purpose, in order — before the cost of reversing them became six months.