Data products as the foundation for AI

08.02.2026 | 6 min read
Tags: #data products, #AI-ready data, #agentic BI, #semantic layer, #data governance

Why analytics-ready is not the same as AI-ready, and what must be in place.

AI changes who uses the data

A data product has historically had one type of user: a human with context. An analyst who knows that “orders” excludes test customers, that “active customer” has a special definition in the Q3 report, and that there was a known error in pipeline March 12-14.

That user group is becoming a minority.

LLM agents read your documentation as authoritative. They have no colleague to ask, no Slack history to scroll through, and no gut feeling for when something is wrong. Missing descriptions are filled in by the model’s own inferences. There are no error messages, only errors that compound.

Tristan Handy (dbt Labs) has predicted that agent-initiated queries will outnumber human ones within 12 months. Even if the timeline is optimistic, the direction is clear: the portfolio you are building now will serve a user group that is not its main consumer today.

Analytics-ready is not the same as AI-ready

Two different dimensions, two different investments:

| | Analytics-ready | AI-ready |
| --- | --- | --- |
| Structure | Star schema, pre-aggregated | Semantically rich, contextual |
| Update frequency | Daily updates are sufficient | Frequent updates preferred |
| Metadata | Optional, helps the analyst | Mandatory; it is the context |
| Granularity | Aggregated | Full detail with context |
| Business rules | In the analyst’s head | Explicitly coded |

Most Norwegian data platforms are built for the first, not the second. That does not mean you have to rebuild everything. It means you have to know which data products must become AI-ready, and invest in building context for those.

Retrieval vs reasoning

David Effiong (Data Cult) distinguishes between two requirements:

  • Retrieval: do we get the right number from the right place? Solved by the data platform plus a clean semantic layer. Most modern technology stacks are good here.
  • Reasoning: does AI know what the numbers mean, when they can be trusted, and what they must not be confused with? This is solved by the context layer, and few have built anything here yet.

An AI system can fetch churn_rate for the last quarter. But to answer why, it needs the definition, knowledge of whether that definition recently changed, an overview of known data quality issues, and any relevant external factors. Without these, it still produces an answer, just not necessarily a correct one.

“Think of the semantic layer as a contract between your team and the AI agent. That contract has to be far more complete than when the main consumer was a human.” - David Effiong

The most effective measure: documenting what the team knows

A controlled experiment from Data Cult (Opeyemi Fabiyi, 2026) tested 13 business questions against an AI agent and added one context layer at a time, without changing the model:

| Iteration | Accuracy |
| --- | --- |
| Raw sources, no context | 0% |
| Modelled table, no enrichment | 0% |
| + rich column descriptions | 15% |
| + business rules | 77% |
| + metrics, examples, evaluation | 92% |

Figure: bar chart of accuracy across the five iterations. The entire accuracy improvement comes from context, not from a better model. Data: Fabiyi (2026), Data Cult.

The model was never changed. The entire improvement came from context. The biggest improvement came from writing down business rules that lived in the analysts’ heads.

A field slot_status with the values Open and Filled caused the agent to filter on slot_status = 'Available', a value that does not exist in the data. The query returned zero rows. No error, just a wrong answer that was hard to detect. A column description explaining that Open means available solved it immediately.

Nine dimensions for AI maturity

Modern Data 101 (Camila Barreto Lima) has developed a 9-dimension framework for assessing the AI maturity of individual data products. It extends the practical requirements from chapters 1 to 7:

| # | Dimension | The question the AI agent cannot ask itself |
| --- | --- | --- |
| 1 | Discoverability | Is the product searchable in a catalogue? |
| 2 | Addressability | Does it have a stable identifier that does not change? |
| 3 | Trustworthiness | Are quality and freshness documented? |
| 4 | Self-describingness | Can a consumer understand the product without asking anyone? |
| 5 | Interoperability | Can it be consumed across systems without bespoke work? |
| 6 | Security | Is access control designed into the solution, not attached afterwards? |
| 7 | Value | Who uses this for what? |
| 8 | Autonomy | Is it owned by a domain team without a centralised bottleneck? |
| 9 | Understandability | Does the consumer understand what they actually get, including limitations? |

Two dimensions are especially critical for AI consumption: Understandability (the agent cannot ask a colleague) and Trustworthiness (the agent gives no error message for stale data - just wrong answers delivered with confidence).

Pre-flight check before agent-based analysis

A data product with a low score on 1, 2 and 9 (not visible, no stable identifier, unclear semantics) gives inconsistent AI results. Use the matrix to rank the portfolio and prioritise effort:

  • Low Discoverability and Trustworthiness block the step toward data products.
  • Low Self-describingness and Understandability block the step toward agent-based analysis.

The maturity matrix puts the same questions into a structured form that lets you decide which products to make AI-ready.
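The two gates above can be sketched as a simple pre-flight check. The dimension names follow the table; the 1-5 scale and the threshold are illustrative assumptions, not part of the framework:

```python
# Illustrative pre-flight check over the nine dimensions.
DIMENSIONS = [
    "discoverability", "addressability", "trustworthiness",
    "self_describingness", "interoperability", "security",
    "value", "autonomy", "understandability",
]

def preflight(scores: dict[str, int], threshold: int = 3) -> list[str]:
    """Return blockers; an empty list means cleared for agent-based analysis."""
    blockers = []
    # Gate 1: low Discoverability or Trustworthiness blocks the data-product step.
    if scores["discoverability"] < threshold or scores["trustworthiness"] < threshold:
        blockers.append("not ready as a data product")
    # Gate 2: low Self-describingness or Understandability blocks agent-based analysis.
    if scores["self_describingness"] < threshold or scores["understandability"] < threshold:
        blockers.append("not ready for agent-based analysis")
    return blockers

orders = {d: 4 for d in DIMENSIONS} | {"understandability": 2}
print(preflight(orders))  # ['not ready for agent-based analysis']
```

Running this over the portfolio gives the ranking the section describes: which products clear both gates, and which need context work first.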

What this means for those of you who are building

You do not need to redo everything. These three measures give the best cost-benefit in an AI world:

| Measure | What it means in practice | Effect |
| --- | --- | --- |
| Column descriptions with meaning | “Open means the slot is available for booking”, not “varchar(20)” | The most effective measure in the Fabiyi experiment |
| Business rules explicitly coded | Default filters, known exceptions, definitions that vary across domains; what a new analyst learns in month 1 | +62 percentage points of accuracy in the same experiment |
| Context layer versioned as code | When a metric definition changes, the documentation changes with it | Prevents stale documentation from generating errors at scale without anyone noticing |
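A minimal sketch of the third measure: the metric definition lives in version control next to the model code, so the same commit that changes the definition also updates the documentation and changelog. The metric, versions, and changelog text are all illustrative:

```python
# Illustrative versioned metric definition, kept in the repo with the model code.
ACTIVE_CUSTOMER = {
    "version": 3,
    "definition": "Customer with at least one non-test order in the last 90 days.",
    "changelog": [
        {"version": 2, "change": "Window shortened from 180 to 90 days."},
        {"version": 3, "change": "Test customers excluded."},
    ],
}

def describe(metric: dict) -> str:
    """Render the current definition plus its most recent change for an agent."""
    latest = metric["changelog"][-1]
    return (f"v{metric['version']}: {metric['definition']} "
            f"(last change: {latest['change']})")

print(describe(ACTIVE_CUSTOMER))
```

Because the definition and its history are one artifact, an agent reading the context layer can never see a definition without also seeing that it recently changed.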

These are not new data products. They are what you already do in chapters 1 to 7, made explicit enough for an agent to read.

The next step is handling change in a product that several consumers, including agents, depend on: change and deprecation.




Magne Bakkeli

Magne Bakkeli is co-founder and senior advisor at Glitni. He has over 25 years of experience in data platforms, data governance and data architecture, and led the Data & Analytics team at PwC Consulting for 12 years. He has built and modernised data platforms across energy, FMCG, finance and media.