Loading documentation...
Source of truth: Canonical schema-rationale document for v2 data modeling decisions.
Status: Decision document. Authored 2026-04-29, aligned to deployed v2-002-filter-codes-migration-narrative.md on 2026-04-29.
Scope: Pre-ingest schema decisions for the v2 fire_data schema.
Related: ../../semantics-and-sources/taxonomy-and-eu-firestat-alignment.md, ../conceptual-tables-and-grain.md.
This document captures the design decisions made for storing fire statistics data from BRASK, BRIS (Police, Brigade, DSB), and SSB into the HeatWaves Supabase schema. It exists because some of these decisions are expensive to reverse (require full re-ingestion of historical data), so they need to be made deliberately, before the first ingest run, and documented for future reference.
Enable complete analytical freedom over Norwegian fire statistics through a schema that is:
The driving analytical use cases are:
Use case #3 is the design driver for several decisions below.
Every property of a fact row falls into exactly one of three buckets. The bucket determines storage shape.
| Bucket | UI role | Stored as | Examples |
|---|---|---|---|
| Dimension | Show bars/series per value | dim_code + cat_code rows | KILDE, ÅRSAK, ARNESTED, OBJEKT, BYGNINGSALDER, BYGNINGSTYPE |
| Filter axis | Apply uniformly across all bars | Nullable column on fact rows | fire_type_code, building_type_code, building_age_code, region_code |
| Metric | Sum/average/count this number | Nullable NUMERIC column | value (count), value_kr (NOK) |
"Does the UI show bars per value of this thing?"
dim_code + cat_code rows.CONFINED (standard) or a native token into get_fire_data — the RPC resolves filters.id — then aggregate over the active dimension.The schema does not enforce which bucket you pick — Postgres stores either shape happily. The decision is about how you want to query.
dim_code (facts and categories)Facts reference dimensions.dim_code as an INTEGER PK. EU-aligned analytical axes use the fixed ids from docs/ingestion/adapter-mappings/category-native-to-cat-code.md:
dim_code | dimensions.name (typical) |
|---|---|
1 | Årsak |
2 | Tennkilde |
3 | Objekt |
4 | Arnested |
Additional axes (e.g. Bygningsalder, Bygningstype) get their own integer rows in fire_data.dimensions. Authoritative dim_code values: 5 = Bygningstype, 6 = Bygningsalder — see category-native-to-cat-code.md §6–§7.
This is the rule that makes pre-ingest decision-making critical.
Adding a new column later is cheap — ALTER TABLE ADD COLUMN, run adapters again, populate. No re-fetch from source APIs needed if the data is already in fact rows in some form.
Demoting a dimension to a filter (or vice versa) is expensive — these are different storage shapes. Dimension data lives as separate rows (one per category value); filter data lives as a column on existing rows. Reshaping requires re-fetching from source APIs and re-writing fact tables, because the rows themselves don't carry the information in the right arrangement to reconstruct the other shape.
| Change | Cost |
|---|---|
| Add a new dimension | INSERT INTO dimensions + INSERT INTO categories + new adapter method. Zero schema changes. |
| Add a new filter axis | ALTER TABLE ADD COLUMN (nullable) + update natural-key unique index + add/update filters mappings + update RPC + run adapters. |
| Add a new metric | ALTER TABLE ADD COLUMN (nullable NUMERIC) + update RPC dispatch. |
| Add a new source | INSERT INTO sources + new adapter file. |
| Add new categories to existing dimension | INSERT INTO categories. |
Update standard_code mappings | Plain UPDATE on categories. |
| Change | Cost |
|---|---|
| Demote a dimension to a filter | Full re-ingest of all data from sources providing it. |
| Promote a filter to a dimension | Full re-ingest of all data carrying it. |
| Add a filter axis but only ingest combined slices | Full re-fetch with split parameters from source APIs. |
| Change fact-table grain (daily ↔ yearly) | Full re-ingest. Already split correctly. |
Rule: at ingest time, always fetch the most granular slice the source API offers, even if today's UI doesn't use that granularity. Granular ingest is a few extra API calls; coarse ingest plus future regret is a full re-fetch.
| Property | dim_code | Native term | Sources | Rationale |
|---|---|---|---|---|
Årsak | 1 | Årsak, Brannårsak | BRASK, Police, Brigade | Primary causal factor (EU Firestat #4). Cross-source via standard_code. |
Tennkilde | 2 | Tennkilde, Kilde | BRASK, DSB | Heat source breakdown is the core EU Firestat #11 analysis. |
Objekt | 3 | Brannen startet i, Hva startet | Police (Sub-A), DSB | Item first ignited (EU Firestat #12). |
Arnested | 4 | Antatt arnested | Police (Sub-A), DSB | Area of origin (EU Firestat #10). |
BYGNINGSTYPE | 5 | Bygningstype, Matrikkel codes | BRASK, Police (Sub-A), SSB | Granular building sub-type (46 Matrikkel codes). |
BYGNINGSALDER | 6 | Bygningsalder | BRASK only | Building age bracket. Single-source dimension; native-vocab drill-down. |
| Column | Domain | Sources carrying it | Rationale |
|---|---|---|---|
fire_type_code | filters.id (axis='fire_type') | NULL | Incident sources carrying type | Native codes live in filters; standard_code exposes CONFINED / DEVELOPED for cross-source RPC filters. |
building_type_code | filters.id (axis='building_type') | NULL | BRASK, BRIS Police Sub-A | Coarse RESIDENTIAL / commercial / other via standard_code. |
building_age_code | filters.id (axis='building_age') | NULL | BRASK — see §5 | Cross-tab driver; natives (e.g. age brackets) live in filters. |
region_code | filters.id (axis='region') | NULL | TBD | Populate filters before regional ingest. |
| Column | Type | Sources | Rationale |
|---|---|---|---|
value | BIGINT NOT NULL | All | Fire count (the universal metric). |
value_kr | NUMERIC NULL | BRASK only | Insurance damage in NOK. NULL on non-BRASK rows. NULL metrics ≠ NULL filters: rows are still included in queries; the metric just contributes nothing. Single-source metrics are fine. |
Two properties are stored as both a dimension and a filter axis. This is intentional and serves different analytical use cases at different granularities.
BYGNINGSTYPE dimension + building_type_code filterbuilding_type_code. Two ingestion paths, two analytical capabilities, one fact row.BYGNINGSALDER dimension + building_age_code filterbuilding_age_code. This produces the joint distribution needed for cross-tab analysis.For a given dimension (e.g., cause), produce N matrices — one per building age bracket — where each matrix is cause × year:
Matrix 1 — Building age 0-10: cause × year (count of fires)
Matrix 2 — Building age 10-20: cause × year
Matrix 3 — Building age 20-50: cause × year
Matrix 4 — Building age 50+: cause × year
This is a joint distribution — each value answers "fires of cause X in age bracket Y in year Z". This is fundamentally different from marginal distributions (cause-only or age-only).
building_age_code must be a filter, not just a dimensionIf Bygningsalder were stored only as a dimension (dim_code = 6 for Bygningsalder + cat_code), the cause rows would have no age tagging. The schema could only answer "fires by cause" OR "fires by age", never "fires by cause AND age". Marginal data cannot be used to compute joint statistics — the information was never recorded together.
Storing building_age_code as a filter FK tagging joint slices — same dim_code /
cat_code identification plus building_age_code pointing at the bracket's
filters row.
BRASK's API permits cross-filtering: every category within a dimension can be selected as a filter, and queries can request a different dimension as the row breakdown while filtering by another. Concretely:
Fetch cause breakdown WHERE Bygningsalder = "51-75" AND Type = "Kald" AND BygningType = "Boliger"
This is what makes the joint data extractable from the source. The schema's job is to preserve that joint structure.
For each BRASK characteristic (Kilde, Årsak), the adapter iterates the cross-product. HeatWaves has one building-type filter: building_type_code (plus filters.standard_code for cross-source buckets like RESIDENTIAL vs other). On BRASK, building-type semantics are supplied as native næring (naering in the Utforsk JSON) — not a second HeatWaves axis.
fire_type × 2 (BRASK `type`: Varm/Kald — maps to fire_type_code / filters)
building_type × 23 (BRASK `næring` ids '0'..'21','99' — canonical list in [`BRASK_EXPLORER_DIMENSIONS_JSON`](../../../lib/data-model/adapters/_braskWebForms.constants.ts) `dimensionValues.naering`)
building_age × N (Bygningsalder brackets)
= 2 × 23 × N fetches per characteristic when iterating all næring natives. With building_type fixed to one sector (e.g. naering='0' / Beboelse only), collapse the middle factor to 1 → 2 × N. Each native id maps to its own filters.id and standard_code (see seed_brask_naering_building_type_categories.sql).
This raises ingest API call volume but is the price of analytical depth. Per the rule in §3: fetch granular at ingest, never coarse-and-regret.
get_fire_data)The deployed schema uses one RPC engine for standard charts and cross-tabs:
fire_data.get_fire_data.
get_cross_tab RPC should be introduced.get_fire_data.// BRASK source id matches lib/data-model/source-ids.ts (FIRE_DATA_SOURCE_ID.BRASK). const BRASK_SOURCE_ID = 1 as const // Årsak → dim_code 1 per docs/ingestion/adapter-mappings/category-native-to-cat-code.md (confirm in dimensions catalog). const DIM_CAUSE = 1 as const // Årsak export async function getCrossTabByAge( client: SupabaseClient, opts: { dimCode: typeof DIM_CAUSE fireType?: 'CONFINED' | 'DEVELOPED' startYear: number endYear: number } ) { const ageBrackets = ['Ny', '1-5', '6-10', '11-15', '16-20', '21-25', '26-30', '31-40', '41-50', '51-75', '76-100', 'Over 100', 'Ukjent'] as const const matrices = await Promise.all( ageBrackets.map(async (age) => { const { data, error } = await client.rpc('get_fire_data', { p_start_date: `${opts.startYear}-01-01`, p_end_date: `${opts.endYear}-12-31`, p_sources: [BRASK_SOURCE_ID], p_dim_codes: [opts.dimCode], p_fire_type: opts.fireType ? [opts.fireType] : null, p_building_age: [age], p_breakdown_axes: [], p_granularity: 'year', p_metric: 'count', }) if (error) throw error return { ageBracket: age, rows: data ?? [] } }) ) return matrices }
Each matrix is rendered as one table or one heatmap in the UI. Parallelizing the calls keeps the page responsive.
If you want all age brackets in one call, pass:
p_building_age: ['Ny', '1-5', '6-10', '...'], p_breakdown_axes: ['building_age']
Then split results by building_age_code (filters.id) in
application code — join filters for legends.
Authoritative SQL narratives live in docs/fire-data-schema/schema-history/v2-002-filter-codes-migration-narrative.md
(history + rationale). Live shapes (PK types, FK targets, natural keys, RPC)
are summarized in docs/fire-data-schema/postgres-contract-rls-and-keys.md and generated types.
filters.id on factsfact_daily / fact_yearly store fire_type_code /
building_type_code / building_age_code / region_code as INTEGER NULL
FK → filters.id.filters rows carry native_code, standard_code, and display fields per (source_id, axis, native_code). Every row has a defined standard_code.get_fire_data accepts native or standard strings in p_fire_type etc.; it resolves them to filters.id before filtering facts. Omitted axis args should match facts where that FK IS NULL (see postgres-contract-rls-and-keys.md).næring='0' (Beboelse) is a filters catalog row → its filters.id on facts. BRASK næring='' (Alle næringer) is the unfiltered request: it is not a filter row but the building_type_code IS NULL total slice. Overlap is avoided because the NULL total is read via the omitted axis param and never summed with Beboelse.NULLS NOT DISTINCT)Unique indexes include all four nullable filter FK columns as plain integers.
CREATE UNIQUE INDEX fact_daily_natural_key ON fire_data.fact_daily ( source_id, dim_code, cat_code, date, fire_type_code, building_type_code, region_code, building_age_code ) NULLS NOT DISTINCT;
INSERT INTO fire_data.fact_daily (...) VALUES (...) ON CONFLICT ( source_id, dim_code, cat_code, date, fire_type_code, building_type_code, region_code, building_age_code ) DO UPDATE SET value = EXCLUDED.value, value_kr = EXCLUDED.value_kr, ingested_at = now();
Generated FactDaily / FactYearly use number | null for filter FK
columns. Prefer types/fire_data.database.types.ts after every migration.
The adapter writes both marginal (dim_code = 6 for Bygningsalder) and joint
(another dimension row + building_age_code FK) slices — same analytical
separation as before; only storage types changed.
Resolve each native tuple to filters.id (upsert filters
first), then write facts:
// lib/data-model/adapters/brask.ts (sketch) await upsertFactRows(client, data.map(row => ({ source_id: BRASK_SOURCE_ID, dim_code: DIM_CAUSE, cat_code: row.causeCode, date: row.date, value: row.count, value_kr: row.kr ?? null, fire_type_code: await lookupFilterId(client, { axis: 'fire_type', native: ft.native }), building_type_code: await lookupFilterId(client, { axis: 'building_type', native: bt.native }), building_age_code: await lookupFilterId(client, { axis: 'building_age', native: age.native }), region_code: null, })))
When the user activates a building_age_code filter:
building_age_code IS NULL → excluded.building_age_code IS NULL → excluded silently.building_age_code IS NULL → excluded silently.The UI must signal this exclusion when it activates the filter — "Building age available from BRASK only. Other sources excluded." This matches the filter-axis graceful-degradation pattern in conceptual-tables-and-grain.md and applies identically here.
The cross-tab use case is single-source (BRASK only) by design, so silent exclusion is the desired behaviour.
With building_age_code as a filter axis, you can now compute (without re-ingest):
fire_type_code + building_type_code + building_age_code filters together with any dimension.All of this is enabled by storing the joint distribution at ingest. None of it would be possible from marginal-only storage.
When adding a new BRASK dimension or filter axis, follow this order:
INSERT into dimensions and categories, write adapter, no schema migration.filters mapping, RPC parameter support. If the property has both forms (like BYGNINGSTYPE / building_type_code), wire the adapter to write both.Adding new analytical capability post-ingest is always cheaper than fixing a missed joint distribution. The only thing this design cannot retrofit cheaply is a new filter axis on data already ingested at coarser granularity. Hence: think before ingesting.
These are flagged for explicit later resolution:
brask.finansnorge.no and keep filters mappings aligned. Do not merge native buckets.region_code value domain — reserved column, value set not yet decided. Fylke-level seems likely.get_fire_data cross-tab-shaped queries should respond in single-digit milliseconds. Revisit if approaching millions of fact rows.These would break the design's properties — do not introduce them:
BYGNINGSTYPE dimension with building_type_code filter into a single representation. They serve different granularities; merging loses one analytical capability.Bygningsalder=alle (the rollup) and skip the per-bracket fetches. The combined slice is unrecoverable into per-bracket data.building_age_code to BRIS or SSB ingest paths without the source actually supplying age data. NULL is the correct value; fabricated values would corrupt cross-tabs.execute_sql pattern in the application. Every query goes through a typed RPC — this is enforced by the v2 schema.categories.name, and building_age_code stores the native filter value on fact rows; cross-source standardization happens through filters.standard_code.