Verified by Garnet Grid

Data Warehouse Modeling with Kimball

Design dimensional data warehouses. Covers star schema, snowflake schema, fact and dimension tables, slowly changing dimensions, conformed dimensions, and ETL design patterns.

Dimensional modeling is the foundation of every analytics warehouse. The Kimball methodology organizes data into facts (things that happened) and dimensions (context about what happened). This structure makes queries intuitive — business users can slice and dice facts by any dimension without understanding joins or database internals.


Star Schema

              ┌──────────────┐
              │ dim_customer  │
              │ customer_key  │
              │ name          │
              │ segment       │
              │ region        │
              └──────┬───────┘

┌──────────────┐     │     ┌──────────────┐
│ dim_product   │     │     │ dim_date      │
│ product_key   │     │     │ date_key      │
│ name          ├─────┤     │ date          │
│ category      │     │     │ month         │
│ brand         │     │     │ quarter       │
└──────────────┘     │     │ year          │
                     │     │ is_holiday    │
              ┌──────┴──────┴──────┐
              │    fct_orders       │
              │ order_key           │
              │ customer_key (FK)   │
              │ product_key  (FK)   │
              │ date_key     (FK)   │
              │ quantity            │
              │ unit_price          │
              │ discount            │
              │ total_amount        │
              └─────────────────────┘

Fact Table Types

TypeWhat It StoresGrainExample
TransactionIndividual eventsOne row per eventEach order line item
Periodic snapshotState at regular intervalsOne row per periodMonthly account balance
Accumulating snapshotLifecycle milestonesOne row per entityOrder: created → shipped → delivered
Factless factEvent occurrence (no measures)One row per eventStudent attendance, page views

Slowly Changing Dimensions (SCD)

SCD TypeHow It WorksWhen to Use
Type 1Overwrite old valueDon’t need history (fix typos)
Type 2New row with version datesNeed full history (address changes)
Type 3Add column for previous valueNeed current + one previous
Type 6Combines Types 1, 2, 3Need history + current flag

SCD Type 2 Implementation

-- dim_customer with SCD Type 2
CREATE TABLE dim_customer (
    customer_key    SERIAL PRIMARY KEY,  -- Surrogate key
    customer_id     INT NOT NULL,        -- Natural key
    name            TEXT NOT NULL,
    segment         TEXT NOT NULL,
    region          TEXT NOT NULL,
    effective_date  DATE NOT NULL,
    expiry_date     DATE DEFAULT '9999-12-31',
    is_current      BOOLEAN DEFAULT TRUE
);

-- Customer moves from "SMB" to "Enterprise"
-- Step 1: Expire current record
UPDATE dim_customer 
SET expiry_date = CURRENT_DATE - 1, is_current = FALSE
WHERE customer_id = 123 AND is_current = TRUE;

-- Step 2: Insert new record
INSERT INTO dim_customer (customer_id, name, segment, region, effective_date)
VALUES (123, 'Acme Corp', 'Enterprise', 'US-West', CURRENT_DATE);

dbt Implementation

-- models/marts/core/fct_orders.sql
{{
  config(
    materialized='incremental',
    unique_key='order_key',
    on_schema_change='append_new_columns'
  )
}}

WITH orders AS (
    SELECT * FROM {{ ref('stg_orders') }}
    {% if is_incremental() %}
    WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
    {% endif %}
),

customers AS (
    SELECT * FROM {{ ref('dim_customer') }}
    WHERE is_current = TRUE
),

products AS (
    SELECT * FROM {{ ref('dim_product') }}
),

date_dim AS (
    SELECT * FROM {{ ref('dim_date') }}
)

SELECT
    {{ dbt_utils.generate_surrogate_key(['orders.order_id', 'orders.line_number']) }} AS order_key,
    customers.customer_key,
    products.product_key,
    date_dim.date_key,
    orders.quantity,
    orders.unit_price,
    orders.discount,
    orders.quantity * orders.unit_price * (1 - orders.discount) AS total_amount,
    orders.updated_at

FROM orders
LEFT JOIN customers ON orders.customer_id = customers.customer_id
LEFT JOIN products ON orders.product_id = products.product_id
LEFT JOIN date_dim ON orders.order_date = date_dim.date

Anti-Patterns

Anti-PatternProblemFix
Natural keys as fact table keysSlow joins, data quality issuesSurrogate integer keys for all dimensions
Snowflaking everythingComplex queries, poor performanceStar schema (denormalize dimensions)
No date dimensionCan’t slice by fiscal year, holiday, etc.Pre-built date dimension with business attributes
One Big Table (OBT)Redundant data, hard to maintainProper star schema with fact + dimensions
No grain documentationAmbiguous what each row representsDocument grain for every fact table

Checklist

  • Grain defined for every fact table
  • Surrogate keys on all dimension tables
  • Date dimension built with fiscal/business attributes
  • SCD strategy chosen per dimension (Type 1, 2, or 3)
  • Conformed dimensions shared across fact tables
  • Incremental loading for large fact tables
  • Data quality tests on all models
  • Documentation: grain, business definitions, lineage

:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For data modeling consulting, visit garnetgrid.com. :::

Jakub Dimitri Rezayev
Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.

View Full Profile →