Conceptual tables and grain

Architectural reference for the v2 fire_data schema and lib/data-model — dimension vs filter vs metric, fact grains, and cross-source storage patterns.

Conceptual tables and grain

This document explains how HeatWaves stores fire statistics in the v2 fire_data schema and how lib/data-model maps to it. Pair with ../semantics-and-sources/taxonomy-and-eu-firestat-alignment.md for upstream source semantics; pair with postgres-contract-rls-and-keys.md for executable DDL and RPC contracts.

The three-bucket framework — dimension vs filter vs metric

This is the central classification rule. It decides where any new piece of data goes. The rule is about UI intent, not data shape — the same property could be a dimension or a filter depending on how the user will interact with it.

Bucket	UI role	Stored as	Examples
Dimension	"Show bars/series per value of this"	`dim_code + cat_code` on fact rows	Årsak `dim_code=1`, Tennkilde `2`, Objekt `3`, Arnested `4`, … — fixed ids for the EU-aligned quartet in `docs/ingestion/adapter-mappings/category-native-to-cat-code.md`; further axes get integer PKs in `dimensions`
Filter	"Apply uniformly across all bars"	nullable column on fact rows	fire_type_code, building_type_code, region_code
Metric	"Sum/average/count this number"	nullable NUMERIC column	value (count), value_kr (NOK)

The deciding question

"Does the UI show bars per value of this thing?"

Yes → dimension. Stored as dim_code + cat_code.
No, but it filters all series uniformly → filter. Nullable column.
It's a number to aggregate → metric. Numeric column. That single question resolves 95% of cases.

Why this rule works

The schema enforces no answer — you could store BYGNINGSTYPE as either a dimension or a filter and Postgres wouldn't care. The decision is about how you want to query it:

A dimension lets you produce N series at once: "fires by Kilde" → one bar per Kilde value.
A filter narrows what gets aggregated: "fires (filtered to RESIDENTIAL)" → one number; get_fire_data resolves RESIDENTIAL / native tokens to filters.id values (WHERE EXISTS …), not raw strings on facts. Both are useful; pick based on the chart you want to draw.

The same property can be both

BYGNINGSTYPE is the canonical example. It coexists in two forms:

Dimension Bygningstype — granular Matrikkel breakdown (INTEGER dim_code row in dimensions, e.g. illustrative 6 — confirm seed) — for the granular "fires by Enebolig vs Tomannsbolig vs Rekkehus" breakdown.
Filter column building_type_code (3 values: RESIDENTIAL, COMMERCIAL, OTHER) — for cross-source filtering when the user just wants "dwelling fires across BRASK and BRIS". This is intentional. Don't unify them — the granular dimension serves single-source drill-down; the coarse filter serves cross-source comparison.

The preserve-native + standardize pattern

A property that exists in multiple sources usually has different native values per source. HeatWaves stores both the native value (truth-preserving) and a standardized code (cross-source-comparable). This applies to dimensions AND filters.

For dimensions: source-scoped categories

The categories table has (source_id, dim_code, cat_code) as its primary key. BRASK's "Ildsted" and POLICE's "Ildsted" are separate rows:

source_id  dim_code  cat_code  native_name           standard_code
─────────  ────────  ────────  ────────────────────  ─────────────
1          2         7         Ildsted (…)           FIREPLACE
2          2         14        Ildsted               FIREPLACE
4          2         3         Tennkilde peis        FIREPLACE

dim_code 2 = Tennkilde (heat source / Kilde), per docs/ingestion/adapter-mappings/category-native-to-cat-code.md. cat_code values here are illustrative per-source ids.

(Illustrative source_id / cat_code — confirm source_id from sources; dimension numbering 1–4 for the EU quartet matches category-native-to-cat-code.md; additional axes 5+ come from fire_data.dimensions seed — see postgres-contract-rls-and-keys.md.)

native_name preserves the source's native vocabulary (verbatim); optional cleaned name when needed.
standard_code is the heatwaves-unified equivalence key.
Cross-source query joins on standard_code.
Native drill-down queries on (source_id, dim_code, cat_code) directly with no join. Why source-scoped: if BRASK revises a category definition, only BRASK's row changes. POLICE and DSB are unaffected. Sharing a cat_code across sources would couple them in ways that break this isolation.

For filters: standardized values + native values preserved

Same principle, applied to filter axes. BRIS exposes mission-type selection (Oppdragstype, request field revisedMissionTypes.ids). In HeatWaves that vocabulary is not a separate dimension — it is the fire_type filter axis: one row per native id in filters with axis = 'fire_type' (mission labels such as Branntilløp komfyr, Brann i bygning, Brann i skorstein), cross-walk to CONFINED / DEVELOPED on filters.standard_code. Canonical native↔standard table: docs/ingestion/adapter-mappings/filter-axes-native-to-filter-id.md §1.

Facts store fire_type_code → filters.id (INTEGER FK). That is how ingest tags each BRIS slice without inventing a separate Oppdragstype dimension row.

This means a single BRIS fact row is filtered and labeled two ways without contradiction:

Cross-source: get_fire_data using standard_code strings (CONFINED / DEVELOPED) or native ids — the RPC resolves to filters.id — or join filters on fact.fire_type_code = filters.id and filter by standard_code.
Native drill-down: join the same filters row for display_name / native_code — still distinguishes chimney vs komfyr vs full building fire inside Politi data because those remain distinct filters natives under fire_type.

The query layer chooses standard_code vs filters native columns based on the user's question.

The general rule

Ingestion stores native vocabulary in categories / filters. Fact filter columns store filters.id. Cross-source comparison uses standard_code (via RPC resolution or joins).

Whenever the situation is "N sources with different vocabularies", apply this pattern. Don't pick one source's vocabulary as canonical; don't invent a new vocabulary that ignores native values; don't normalize at ingest time. Preserve native; map to standard for comparison.

Schema layout

Two schemas

fire_data — analytical surface, exposed via PostgREST. Tables here are read by the application via RPCs and direct PostgREST queries. RLS-enabled with anonymous read policies.

ingest — operational state, hidden. Tracks ingestion bookkeeping (extraction timestamps, raw API payloads). Service-role-only, never exposed to clients. RLS-enabled with no policies (default-deny).

The split is a security boundary as much as a logical one. Adding fire_data to PostgREST's exposed-schemas list is the gate; ingest simply isn't on the list.

Tables (fire_data)

sources                  ← 5 rows: BRASK, POLICE, BRIGADE, DSB, SSB
dimensions               ← analytic axes (`dim_code` INTEGER PK, `name` human label).
                            EU-aligned quartet (fixed ids): 1 Årsak, 2 Tennkilde,
                            3 Objekt, 4 Arnested — **`category-native-to-cat-code.md`**.
                            Typical extras (authoritative PKs — see
                            [`category-native-to-cat-code.md`](../ingestion/adapter-mappings/category-native-to-cat-code.md)):
                            **5** Bygningstype, **6** Bygningsalder → **6 rows** (BRIS
                            Oppdragstype is **`fire_type`** / **`filters`**, not a dimension — **`filter-axes-native-to-filter-id.md`**).
categories               ← source-scoped (source_id, dim_code, cat_code)
fact_daily               ← DATE-grain facts (sub-year `get_fire_data` reads; sources with supports_subyear=true)
fact_yearly              ← YEAR-grain facts from ingest (SSB, BRIGADE, DSB yearly endpoints; yearly slices elsewhere)
fact_buildings_yearly    ← SSB building-stock totals (non-fire)

Tables (ingest)

extraction_state         ← per-slice "when did we last fetch this"
raw_payloads             ← optional API response audit log

Why fact tables are split by grain

fact_daily has a date DATE NOT NULL column. fact_yearly has a year SMALLINT NOT NULL column. get_fire_data chooses one fact table per call:

p_granularity = 'year' reads fact_yearly only (rows produced only by ingest at yearly grain; never summarized from fact_daily in SQL).
p_granularity ∈ day · week · month · quarter reads fact_daily only, restricted to sources where sources.supports_subyear is true (taxonomy yearly-only endpoints such as SSB, BRIGADE, DSB do not participate).

Mixing grains in one physical table forces every query to remember upstream cadence. Keeping two tables plus explicit branching keeps semantics aligned with taxonomy-and-eu-firestat-alignment.md.

Cross-source yearly dashboards still combine sources at year grain via fact_yearly. Sub-year charts omit yearly-only sources until they have compatible daily facts.

Ingest contract: fact_yearly rows come only from ingestion (upstream yearly aggregates or API yearly buckets). fact_daily rows come only from ingestion at day grain. get_fire_data never derives yearly totals by scanning fact_daily. Sources that expose both yearly summaries and daily incidents write whichever slices belong in each table; adapters decide — no Postgres rollup bridges daily→yearly.

Filter axes and graceful degradation

fire_type_code, building_type_code, building_age_code, region_code are nullable columns on the fact tables. NULL means "this source doesn't carry this filter info".

When the user activates a cross-source fire-type filter, get_fire_data resolves CONFINED / native tokens (including BRIS revisedMissionTypes ids when passed as natives) to filters.id values and applies WHERE EXISTS (… filter_id = fu.fire_type_code). Rows with fire_type_code IS NULL drop out automatically — no per-source OR logic in the RPC. This is graceful degradation by construction — the schema's most important property after the three-bucket rule.

This is why these axes are not dimensions. A dimension forces a value; a filter axis allows NULL, and NULL means "not applicable to this row".

Integrity for filter columns

Filter-axis domains are not duplicated as CHECK (... IN (...)) on facts. Instead:

Allowed natives per source live in filters (native_code, standard_code, display_name).
Facts reference filters.id; triggers fact_yearly_enforce_filter_axes / fact_daily_enforce_filter_axes invoke fire_data.enforce_fact_filter_code_axes() so each non-null id matches filters.axis and filters.applicable_dim_codes (when set) for fact.dim_code. Details: postgres-contract-rls-and-keys.md.

When adding a new native value, INSERT into filters (and migrate existing facts if needed). lib/data-model/types.ts FireTypeCode / BuildingTypeCode describe standard_code vocabulary for UI and RPC string filters — they mirror taxonomy intent, not Postgres CHECK clauses on facts. Full checklist: section Adding a new filter axis below.

Numeric spine + label tables

sources.source_id and dimensions.dim_code are INTEGER primary keys. Stable short labels (sources.name ≈ BRASK) and human titles (long_name) live on sources; dimension titles on dimensions.name.

Canonical dim_code ↔ Norwegian axis names for the EU-aligned breakdown dimensions:

`dim_code`	Typical `dimensions.name`
`1`	Årsak
`2`	Tennkilde
`3`	Objekt
`4`	Arnested

Source: docs/ingestion/adapter-mappings/category-native-to-cat-code.md. Other seeded dimensions (Bygningsalder, Bygningstype, …) each get their own integer dim_code; resolve labels via dimensions / get_dimension_catalog, not string tokens like cause on facts. BRIS mission types (Oppdragstype) live under filters (axis = 'fire_type'), not dimensions — filter-axes-native-to-filter-id.md.

Filter axes on facts store filters.id (INTEGER FK). Cross-source and URL-facing tokens such as CONFINED / RESIDENTIAL are filters.standard_code (TEXT UPPER_SNAKE_CASE) and/or native_code; get_fire_data accepts those strings and resolves them to ids internally.

cat_code stays INT (BRASK-native numeric ids). The compound key (source_id, dim_code, cat_code) prevents collisions across sources even though the integer space is shared.

Why ints for the spine: FK integrity, smaller indexes, stable joins from facts to registry rows. Labels are always one join away (sources, dimensions, filters) — which is what charts should use for legends anyway.

Surrogate id + natural-key unique index

Each fact table has both:

id BIGSERIAL PRIMARY KEY            -- for FK references and admin operations
 
CREATE UNIQUE INDEX fact_daily_natural_key ON fire_data.fact_daily (
  source_id, dim_code, cat_code, date,
  fire_type_code, building_type_code, region_code, building_age_code
) NULLS NOT DISTINCT;

NULLS NOT DISTINCT makes all-null filter slices collide the way equality would for plain values. Two rows with the same grain keys but different filter ids remain distinct. Exact definitions: docs/fire-data-schema/postgres-contract-rls-and-keys.md.

Upsert pattern for adapters

The natural-key index is the upsert target:

INSERT INTO fire_data.fact_daily (...) VALUES (...)
ON CONFLICT (
  source_id, dim_code, cat_code, date,
  fire_type_code, building_type_code, region_code, building_age_code
)
DO UPDATE SET
  value       = EXCLUDED.value,
  value_kr    = EXCLUDED.value_kr,
  ingested_at = now();

This means re-running ingestion is safe. If the upstream API returns a revised count for an existing slice, the row updates. If the slice is new, it inserts. No diff logic needed.

RPCs, never raw SQL

Application code never sends SQL strings to Postgres. All analytical queries go through parameterized RPCs in the fire_data schema.

Properties every RPC must have

CREATE OR REPLACE FUNCTION fire_data.<name>(
  p_<param1> <type>,
  ...
)
RETURNS TABLE(...) -- typed return
LANGUAGE sql       -- or plpgsql when needed
STABLE             -- so Postgres can inline / parallelize
PARALLEL SAFE      -- when the query allows
SET search_path = '' -- prevents search_path attacks; fully qualify all
                     -- table references inside the function
AS $$
  -- fully-qualified table names: fire_data.fact_daily, not fact_daily
$$;
 
GRANT EXECUTE ON FUNCTION fire_data.<name> TO anon, authenticated;

Why `SET search_path = ''`

Without it, an attacker who could create a temp table/function in the session could shadow fact_daily and run arbitrary code. With it, every table reference must be fully qualified, so there's no shadow risk. Supabase's security advisor will warn if you forget this.

NULL RPC parameters mean “unfiltered fact slices”

The get_fire_data RPC treats NULL on a filter-axis parameter as: only facts where that axis FK is NULL (upstream had no filter on that axis). Passing standard_code tokens (e.g. RESIDENTIAL) resolves to all matching filters.id rows for the requested source(s).

-- Live pattern (per axis; breakdown-aware variant in postgres-contract)
AND (p_fire_type IS NULL AND f.fire_type_code IS NULL
     OR p_fire_type IS NOT NULL AND f.fire_type_code IN (SELECT filter_id FROM resolved_fire_type))

When the axis is in p_breakdown_axes the NULL-param branch instead keeps leaf rows (fire_type_code IS NOT NULL) and groups them — see postgres-contract-rls-and-keys.md.

p_sources / p_dim_codes still use the older “NULL = no restriction” pattern — only the four filter axes use the IS NULL semantics above.

The TypeScript data layer (lib/data-model)

The application accesses the schema through a single module: lib/data-model/.

Public surface (the barrel)

Application code imports only from @/lib/data-model:

// ✓ Right
import { getCatalog, getFireData, type FactDaily } from '@/lib/data-model'

// ✗ Wrong — bypasses the public surface
import { getFireData }         from '@/lib/data-model/queries'
import { createFireDataClient } from '@/lib/data-model/client'
import type { FactDaily }       from '@/lib/data-model/types'

The barrel index.ts re-exports query helpers and types from internal files. This lets internals be refactored without touching call sites.

Internal boundary (the exception)

createFireDataClient is intentionally not re-exported from the barrel. Application code asks "give me chart data" (high-level), not "give me a database connection" (low-level). Centralizing client construction means future cross-cutting concerns (telemetry, retry, auth swap) live in one place.

The single legitimate exception is adapters inside lib/data-model/adapters/. They live inside the module's boundary and need direct Supabase access for service-role writes. They import the client directly from ./client. The "external code uses the barrel" rule doesn't apply to them because they're not external.

Files

lib/data-model/
├── index.ts        ← Public surface — re-exports queries.ts and types.ts
├── types.ts        ← Type aliases (Source, Dimension, Category, FactDaily, ...)
├── client.ts       ← INTERNAL: createFireDataClient (server + browser)
├── queries.ts      ← INTERNAL: getFireData, getCatalog, lookups
└── adapters/       ← Adapters — import client directly (legitimate exception)

Type entrypoint

The canonical types entrypoint is types/database.types.ts. It re-exports the fire_data types under an alias so migration day is a one-line change:

// types/database.types.ts (canonical entrypoint)
// Today: legacy public schema is THE Database
export type { Database } from './public.database.types'
// And fire_data is aliased
export type { Database as FireDataDatabase } from './fire_data.database.types'
 
// Migration day: swap which one is THE Database. lib/data-model untouched.

lib/data-model/* imports FireDataDatabase from this file. Don't import from fire_data.database.types directly.

Adding a new dimension (the future-proof test)

This is the schema's design test. Adding a new dimension must require zero schema changes. Specifically:

INSERT into fire_data.dimensions (one row).
INSERTs into fire_data.categories for the values the source reports.
New method on the relevant adapter that ingests rows under the new dim_code.
Redeploy the adapter. If a new dimension forces ALTER TABLE on fact_daily, the schema has failed its design goal. Investigate — usually the property you're trying to add is actually a metric or a filter axis, not a dimension.

Adding a new source

Same shape:

INSERT into fire_data.sources.
INSERTs into fire_data.categories for that source's vocabulary.
New adapter file at lib/data-model/adapters/<source>.ts.
Wire into the ingest scheduler. The chart RPC, the catalog RPC, every UI component — untouched.

Adding a new metric

A metric (insurance kr, injuries, fatalities, response time) is a nullable NUMERIC column on fact_daily and fact_yearly. Sources that don't track it leave it NULL.

ALTER TABLE add the column.
Update CHECK constraints if the metric has a domain (e.g. non-negative).
Update get_fire_data's p_metric parameter to dispatch on the new metric name.
Update TypeScript Metric literal type. This pattern scales to ~5 metrics. Beyond that, refactor to a metric-pivot table. YAGNI until then.

Adding a new filter axis

filters rows for the new axis (per source / native / standard).
ALTER TABLE nullable INTEGER FK on fact_daily and fact_yearly
- extend enforce_fact_filter_code_axes.
get_fire_data resolved CTE + parameter + breakdown projection.
Recreate fact_*_natural_key to include the new column (NULLS NOT DISTINCT).
Regenerate TS types; extend lib/data-model/types.ts literals if new standard_code values are part of the public UI contract.

Decision frameworks (rules of thumb)

Use these to resolve future design questions without long debates:

Is X a dimension, filter, or metric? Apply the three-bucket framework. "Does the UI show bars per value?" → dimension. "Apply uniformly?" → filter. "Number to aggregate?" → metric.

Where does X live in the schema?

Property of a dimension value (Årsak, Tennkilde, Objekt, Arnested, …) → categories rows (native_name, standard_code) keyed by (source_id, dim_code, cat_code).
Property of an upstream slice parameter that narrows counts uniformly (BRASK type, BRIS Oppdragstype / revisedMissionTypes, næring, age bracket, …) → filters with the right axis (fire_type, building_type, building_age, region) plus standard_code where cross-source comparison applies — facts hold INTEGER NULL FKs → filters.id. Fire-type mapping for BRIS: filter-axes-native-to-filter-id.md §1.
Property of the data itself (count, kr) → numeric column. Should X be in fire_data or ingest?
Read by the application → fire_data.
Operational metadata about ingestion → ingest. Source-scoped or shared? Always source-scoped for categories. Cross-source equivalence via standard_code. Never share cat_code between sources.

Does this change break Utforsk's source-fidelity? If yes, redesign. Utforsk shows native names without translation. Cross-source pages go through standard_code.

Add a column or a table? A new metric is a column. A new entity (fundamentally different ontological thing — e.g. building stock totals vs. fire incidents) is a table. The fact_buildings_yearly split was deliberately separate from fact_yearly for this reason.

Applying schema changes: use versioned files under supabase/migrations/ and the Supabase CLI (npm run db:migration:new, db:migration:list). Regenerate types with npm run types:gen:fire-data after DDL changes. The Supabase Dashboard SQL editor is for one-off exploration only — not the source of truth for production schema.

Constraints and conventions reference

For exact CHECK definitions, FK declarations, natural-key indexes (NULLS NOT DISTINCT), upsert SQL, RLS policies, and the RPC template, read postgres-contract-rls-and-keys.md beside this file. Use it when you need strings to copy.

Short summary: source_id and dim_code are INTEGER FKs. Fact filter columns (fire_type_code, …) store filters.id only — never native_code or standard_code. Those strings live on filters / categories and enter read paths via get_fire_data text-array parameters.

Don't-do list

These are anti-patterns the v2 schema was built to eliminate. Don't reintroduce:

No execute_sql RPC. Every query goes through a parameterized RPC. Raw SQL strings from the application is the original sin this schema exists to eliminate.
No client-side filtering of large result sets. Filters become RPC parameters. The browser receives only what it needs to render.
No business logic in API routes. Routes are thin pipes — call RPC, return JSON, handle error. Aggregation lives in Postgres.
No ambiguous filter values on facts. Facts hold filters.id; do not write raw standard strings into fact filter columns.
No shared cat_code across sources. Source-scoped is mandatory.
No NULL dim_code rows in categories. Every category belongs to exactly one dimension. The CHECK constraint enforces this.
No mixing daily and yearly in one fact table. Split by grain.
No silent WHERE filter = ANY(...) that drops sources without warning. The graceful-degradation property uses NULLs, not OR-logic. Sources that don't carry the filter return NULL rows, which the WHERE excludes automatically. This is the right behaviour but it should be explicit in the UI ("DSB excluded — does not carry fire_type info").
No raw "ALL" rollup rows in fact tables. When ingesting BRASK with bygg_type=alle, this would double-count rows that also exist as bygg_type=boliger. Ingest the granular slices, never the rollup.

Loading documentation...

Conceptual tables and grain

Architectural reference for the v2 fire_data schema and lib/data-model — dimension vs filter vs metric, fact grains, and cross-source storage patterns.

Conceptual tables and grain

The three-bucket framework — dimension vs filter vs metric

Bucket	UI role	Stored as	Examples
Dimension	"Show bars/series per value of this"	`dim_code + cat_code` on fact rows	Årsak `dim_code=1`, Tennkilde `2`, Objekt `3`, Arnested `4`, … — fixed ids for the EU-aligned quartet in `docs/ingestion/adapter-mappings/category-native-to-cat-code.md`; further axes get integer PKs in `dimensions`
Filter	"Apply uniformly across all bars"	nullable column on fact rows	fire_type_code, building_type_code, region_code
Metric	"Sum/average/count this number"	nullable NUMERIC column	value (count), value_kr (NOK)

The deciding question

"Does the UI show bars per value of this thing?"

Yes → dimension. Stored as dim_code + cat_code.
No, but it filters all series uniformly → filter. Nullable column.
It's a number to aggregate → metric. Numeric column. That single question resolves 95% of cases.

Why this rule works

The schema enforces no answer — you could store BYGNINGSTYPE as either a dimension or a filter and Postgres wouldn't care. The decision is about how you want to query it:

A dimension lets you produce N series at once: "fires by Kilde" → one bar per Kilde value.
A filter narrows what gets aggregated: "fires (filtered to RESIDENTIAL)" → one number; get_fire_data resolves RESIDENTIAL / native tokens to filters.id values (WHERE EXISTS …), not raw strings on facts. Both are useful; pick based on the chart you want to draw.

The same property can be both

BYGNINGSTYPE is the canonical example. It coexists in two forms:

Dimension Bygningstype — granular Matrikkel breakdown (INTEGER dim_code row in dimensions, e.g. illustrative 6 — confirm seed) — for the granular "fires by Enebolig vs Tomannsbolig vs Rekkehus" breakdown.
Filter column building_type_code (3 values: RESIDENTIAL, COMMERCIAL, OTHER) — for cross-source filtering when the user just wants "dwelling fires across BRASK and BRIS". This is intentional. Don't unify them — the granular dimension serves single-source drill-down; the coarse filter serves cross-source comparison.

The preserve-native + standardize pattern

For dimensions: source-scoped categories

The categories table has (source_id, dim_code, cat_code) as its primary key. BRASK's "Ildsted" and POLICE's "Ildsted" are separate rows:

source_id  dim_code  cat_code  native_name           standard_code
─────────  ────────  ────────  ────────────────────  ─────────────
1          2         7         Ildsted (…)           FIREPLACE
2          2         14        Ildsted               FIREPLACE
4          2         3         Tennkilde peis        FIREPLACE

dim_code 2 = Tennkilde (heat source / Kilde), per docs/ingestion/adapter-mappings/category-native-to-cat-code.md. cat_code values here are illustrative per-source ids.

native_name preserves the source's native vocabulary (verbatim); optional cleaned name when needed.
standard_code is the heatwaves-unified equivalence key.
Cross-source query joins on standard_code.
Native drill-down queries on (source_id, dim_code, cat_code) directly with no join. Why source-scoped: if BRASK revises a category definition, only BRASK's row changes. POLICE and DSB are unaffected. Sharing a cat_code across sources would couple them in ways that break this isolation.

For filters: standardized values + native values preserved

Facts store fire_type_code → filters.id (INTEGER FK). That is how ingest tags each BRIS slice without inventing a separate Oppdragstype dimension row.

This means a single BRIS fact row is filtered and labeled two ways without contradiction:

Cross-source: get_fire_data using standard_code strings (CONFINED / DEVELOPED) or native ids — the RPC resolves to filters.id — or join filters on fact.fire_type_code = filters.id and filter by standard_code.
Native drill-down: join the same filters row for display_name / native_code — still distinguishes chimney vs komfyr vs full building fire inside Politi data because those remain distinct filters natives under fire_type.

The query layer chooses standard_code vs filters native columns based on the user's question.

The general rule

Ingestion stores native vocabulary in categories / filters. Fact filter columns store filters.id. Cross-source comparison uses standard_code (via RPC resolution or joins).

Schema layout

Two schemas

fire_data — analytical surface, exposed via PostgREST. Tables here are read by the application via RPCs and direct PostgREST queries. RLS-enabled with anonymous read policies.

The split is a security boundary as much as a logical one. Adding fire_data to PostgREST's exposed-schemas list is the gate; ingest simply isn't on the list.

Tables (fire_data)

sources                  ← 5 rows: BRASK, POLICE, BRIGADE, DSB, SSB
dimensions               ← analytic axes (`dim_code` INTEGER PK, `name` human label).
                            EU-aligned quartet (fixed ids): 1 Årsak, 2 Tennkilde,
                            3 Objekt, 4 Arnested — **`category-native-to-cat-code.md`**.
                            Typical extras (authoritative PKs — see
                            [`category-native-to-cat-code.md`](../ingestion/adapter-mappings/category-native-to-cat-code.md)):
                            **5** Bygningstype, **6** Bygningsalder → **6 rows** (BRIS
                            Oppdragstype is **`fire_type`** / **`filters`**, not a dimension — **`filter-axes-native-to-filter-id.md`**).
categories               ← source-scoped (source_id, dim_code, cat_code)
fact_daily               ← DATE-grain facts (sub-year `get_fire_data` reads; sources with supports_subyear=true)
fact_yearly              ← YEAR-grain facts from ingest (SSB, BRIGADE, DSB yearly endpoints; yearly slices elsewhere)
fact_buildings_yearly    ← SSB building-stock totals (non-fire)

Tables (ingest)

extraction_state         ← per-slice "when did we last fetch this"
raw_payloads             ← optional API response audit log

Why fact tables are split by grain

fact_daily has a date DATE NOT NULL column. fact_yearly has a year SMALLINT NOT NULL column. get_fire_data chooses one fact table per call:

p_granularity = 'year' reads fact_yearly only (rows produced only by ingest at yearly grain; never summarized from fact_daily in SQL).
p_granularity ∈ day · week · month · quarter reads fact_daily only, restricted to sources where sources.supports_subyear is true (taxonomy yearly-only endpoints such as SSB, BRIGADE, DSB do not participate).

Cross-source yearly dashboards still combine sources at year grain via fact_yearly. Sub-year charts omit yearly-only sources until they have compatible daily facts.

Filter axes and graceful degradation

fire_type_code, building_type_code, building_age_code, region_code are nullable columns on the fact tables. NULL means "this source doesn't carry this filter info".

This is why these axes are not dimensions. A dimension forces a value; a filter axis allows NULL, and NULL means "not applicable to this row".

Integrity for filter columns

Filter-axis domains are not duplicated as CHECK (... IN (...)) on facts. Instead:

Allowed natives per source live in filters (native_code, standard_code, display_name).
Facts reference filters.id; triggers fact_yearly_enforce_filter_axes / fact_daily_enforce_filter_axes invoke fire_data.enforce_fact_filter_code_axes() so each non-null id matches filters.axis and filters.applicable_dim_codes (when set) for fact.dim_code. Details: postgres-contract-rls-and-keys.md.

Numeric spine + label tables

Canonical dim_code ↔ Norwegian axis names for the EU-aligned breakdown dimensions:

`dim_code`	Typical `dimensions.name`
`1`	Årsak
`2`	Tennkilde
`3`	Objekt
`4`	Arnested

cat_code stays INT (BRASK-native numeric ids). The compound key (source_id, dim_code, cat_code) prevents collisions across sources even though the integer space is shared.

Surrogate id + natural-key unique index

Each fact table has both:

id BIGSERIAL PRIMARY KEY            -- for FK references and admin operations
 
CREATE UNIQUE INDEX fact_daily_natural_key ON fire_data.fact_daily (
  source_id, dim_code, cat_code, date,
  fire_type_code, building_type_code, region_code, building_age_code
) NULLS NOT DISTINCT;

Upsert pattern for adapters

The natural-key index is the upsert target:

INSERT INTO fire_data.fact_daily (...) VALUES (...)
ON CONFLICT (
  source_id, dim_code, cat_code, date,
  fire_type_code, building_type_code, region_code, building_age_code
)
DO UPDATE SET
  value       = EXCLUDED.value,
  value_kr    = EXCLUDED.value_kr,
  ingested_at = now();

This means re-running ingestion is safe. If the upstream API returns a revised count for an existing slice, the row updates. If the slice is new, it inserts. No diff logic needed.

RPCs, never raw SQL

Application code never sends SQL strings to Postgres. All analytical queries go through parameterized RPCs in the fire_data schema.

Properties every RPC must have

CREATE OR REPLACE FUNCTION fire_data.<name>(
  p_<param1> <type>,
  ...
)
RETURNS TABLE(...) -- typed return
LANGUAGE sql       -- or plpgsql when needed
STABLE             -- so Postgres can inline / parallelize
PARALLEL SAFE      -- when the query allows
SET search_path = '' -- prevents search_path attacks; fully qualify all
                     -- table references inside the function
AS $$
  -- fully-qualified table names: fire_data.fact_daily, not fact_daily
$$;
 
GRANT EXECUTE ON FUNCTION fire_data.<name> TO anon, authenticated;

Why `SET search_path = ''`

NULL RPC parameters mean “unfiltered fact slices”

-- Live pattern (per axis; breakdown-aware variant in postgres-contract)
AND (p_fire_type IS NULL AND f.fire_type_code IS NULL
     OR p_fire_type IS NOT NULL AND f.fire_type_code IN (SELECT filter_id FROM resolved_fire_type))

When the axis is in p_breakdown_axes the NULL-param branch instead keeps leaf rows (fire_type_code IS NOT NULL) and groups them — see postgres-contract-rls-and-keys.md.

p_sources / p_dim_codes still use the older “NULL = no restriction” pattern — only the four filter axes use the IS NULL semantics above.

The TypeScript data layer (lib/data-model)

The application accesses the schema through a single module: lib/data-model/.

Public surface (the barrel)

Application code imports only from @/lib/data-model:

// ✓ Right
import { getCatalog, getFireData, type FactDaily } from '@/lib/data-model'

// ✗ Wrong — bypasses the public surface
import { getFireData }         from '@/lib/data-model/queries'
import { createFireDataClient } from '@/lib/data-model/client'
import type { FactDaily }       from '@/lib/data-model/types'

The barrel index.ts re-exports query helpers and types from internal files. This lets internals be refactored without touching call sites.

Internal boundary (the exception)

Files

lib/data-model/
├── index.ts        ← Public surface — re-exports queries.ts and types.ts
├── types.ts        ← Type aliases (Source, Dimension, Category, FactDaily, ...)
├── client.ts       ← INTERNAL: createFireDataClient (server + browser)
├── queries.ts      ← INTERNAL: getFireData, getCatalog, lookups
└── adapters/       ← Adapters — import client directly (legitimate exception)

Type entrypoint

The canonical types entrypoint is types/database.types.ts. It re-exports the fire_data types under an alias so migration day is a one-line change:

// types/database.types.ts (canonical entrypoint)
// Today: legacy public schema is THE Database
export type { Database } from './public.database.types'
// And fire_data is aliased
export type { Database as FireDataDatabase } from './fire_data.database.types'
 
// Migration day: swap which one is THE Database. lib/data-model untouched.

lib/data-model/* imports FireDataDatabase from this file. Don't import from fire_data.database.types directly.

Adding a new dimension (the future-proof test)

This is the schema's design test. Adding a new dimension must require zero schema changes. Specifically:

INSERT into fire_data.dimensions (one row).
INSERTs into fire_data.categories for the values the source reports.
New method on the relevant adapter that ingests rows under the new dim_code.
Redeploy the adapter. If a new dimension forces ALTER TABLE on fact_daily, the schema has failed its design goal. Investigate — usually the property you're trying to add is actually a metric or a filter axis, not a dimension.

Adding a new source

Same shape:

INSERT into fire_data.sources.
INSERTs into fire_data.categories for that source's vocabulary.
New adapter file at lib/data-model/adapters/<source>.ts.
Wire into the ingest scheduler. The chart RPC, the catalog RPC, every UI component — untouched.

Adding a new metric

A metric (insurance kr, injuries, fatalities, response time) is a nullable NUMERIC column on fact_daily and fact_yearly. Sources that don't track it leave it NULL.

ALTER TABLE add the column.
Update CHECK constraints if the metric has a domain (e.g. non-negative).
Update get_fire_data's p_metric parameter to dispatch on the new metric name.
Update TypeScript Metric literal type. This pattern scales to ~5 metrics. Beyond that, refactor to a metric-pivot table. YAGNI until then.

Adding a new filter axis

filters rows for the new axis (per source / native / standard).
ALTER TABLE nullable INTEGER FK on fact_daily and fact_yearly
- extend enforce_fact_filter_code_axes.
get_fire_data resolved CTE + parameter + breakdown projection.
Recreate fact_*_natural_key to include the new column (NULLS NOT DISTINCT).
Regenerate TS types; extend lib/data-model/types.ts literals if new standard_code values are part of the public UI contract.

Decision frameworks (rules of thumb)

Use these to resolve future design questions without long debates:

Is X a dimension, filter, or metric? Apply the three-bucket framework. "Does the UI show bars per value?" → dimension. "Apply uniformly?" → filter. "Number to aggregate?" → metric.

Where does X live in the schema?

Property of a dimension value (Årsak, Tennkilde, Objekt, Arnested, …) → categories rows (native_name, standard_code) keyed by (source_id, dim_code, cat_code).
Property of an upstream slice parameter that narrows counts uniformly (BRASK type, BRIS Oppdragstype / revisedMissionTypes, næring, age bracket, …) → filters with the right axis (fire_type, building_type, building_age, region) plus standard_code where cross-source comparison applies — facts hold INTEGER NULL FKs → filters.id. Fire-type mapping for BRIS: filter-axes-native-to-filter-id.md §1.
Property of the data itself (count, kr) → numeric column. Should X be in fire_data or ingest?
Read by the application → fire_data.
Operational metadata about ingestion → ingest. Source-scoped or shared? Always source-scoped for categories. Cross-source equivalence via standard_code. Never share cat_code between sources.

Does this change break Utforsk's source-fidelity? If yes, redesign. Utforsk shows native names without translation. Cross-source pages go through standard_code.

Constraints and conventions reference

Don't-do list

These are anti-patterns the v2 schema was built to eliminate. Don't reintroduce:

No execute_sql RPC. Every query goes through a parameterized RPC. Raw SQL strings from the application is the original sin this schema exists to eliminate.
No client-side filtering of large result sets. Filters become RPC parameters. The browser receives only what it needs to render.
No business logic in API routes. Routes are thin pipes — call RPC, return JSON, handle error. Aggregation lives in Postgres.
No ambiguous filter values on facts. Facts hold filters.id; do not write raw standard strings into fact filter columns.
No shared cat_code across sources. Source-scoped is mandatory.
No NULL dim_code rows in categories. Every category belongs to exactly one dimension. The CHECK constraint enforces this.
No mixing daily and yearly in one fact table. Split by grain.
No silent WHERE filter = ANY(...) that drops sources without warning. The graceful-degradation property uses NULLs, not OR-logic. Sources that don't carry the filter return NULL rows, which the WHERE excludes automatically. This is the right behaviour but it should be explicit in the UI ("DSB excluded — does not carry fire_type info").
No raw "ALL" rollup rows in fact tables. When ingesting BRASK with bygg_type=alle, this would double-count rows that also exist as bygg_type=boliger. Ingest the granular slices, never the rollup.

Conceptual tables and grain

Conceptual tables and grain

The three-bucket framework — dimension vs filter vs metric

The deciding question

Why this rule works

The same property can be both

The preserve-native + standardize pattern

For dimensions: source-scoped categories

For filters: standardized values + native values preserved

The general rule

Schema layout

Two schemas

Tables (fire_data)

Tables (ingest)

Why fact tables are split by grain

Filter axes and graceful degradation

Integrity for filter columns

Numeric spine + label tables

Surrogate id + natural-key unique index

Upsert pattern for adapters

RPCs, never raw SQL

Properties every RPC must have

Why SET search_path = ''

NULL RPC parameters mean “unfiltered fact slices”

The TypeScript data layer (lib/data-model)

Public surface (the barrel)

Internal boundary (the exception)

Files

Type entrypoint

Adding a new dimension (the future-proof test)

Adding a new source

Adding a new metric

Adding a new filter axis

Decision frameworks (rules of thumb)

Constraints and conventions reference

Don't-do list

Conceptual tables and grain

Conceptual tables and grain

The three-bucket framework — dimension vs filter vs metric

The deciding question

Why this rule works

The same property can be both

The preserve-native + standardize pattern

For dimensions: source-scoped categories

For filters: standardized values + native values preserved

The general rule

Schema layout

Two schemas

Tables (fire_data)

Tables (ingest)

Why fact tables are split by grain

Filter axes and graceful degradation

Integrity for filter columns

Numeric spine + label tables

Surrogate id + natural-key unique index

Upsert pattern for adapters

RPCs, never raw SQL

Properties every RPC must have

Why SET search_path = ''

NULL RPC parameters mean “unfiltered fact slices”

The TypeScript data layer (lib/data-model)

Public surface (the barrel)

Internal boundary (the exception)

Files

Type entrypoint

Adding a new dimension (the future-proof test)

Adding a new source

Adding a new metric

Adding a new filter axis

Decision frameworks (rules of thumb)

Constraints and conventions reference

Don't-do list

Why `SET search_path = ''`

Why `SET search_path = ''`