Data products as the foundation for AI

08.02.2026 | 6 min read
Tags: #data products, #AI-ready data, #agentic BI, #semantic layer, #data governance

Why analytics-ready is not the same as AI-ready, and what must be in place.

AI changes who uses the data

A data product has historically had one type of user: a human with context. An analyst who knows that “orders” excludes test customers, that “active customer” has a special definition in the Q3 report, and that there was a known error in pipeline March 12-14.

That user group is becoming a minority.

LLM agents read your documentation as authoritative. They have no colleague to ask, no Slack history to scroll through, and no gut feeling for when something is wrong. Missing descriptions are filled in by the model’s own inferences. There are no error messages, only errors that compound.

Tristan Handy (dbt Labs) has predicted that agent-initiated queries will outnumber human ones within 12 months. Even if the timeline is optimistic, the direction is clear: the portfolio you are building now will serve a user group that is not its main consumer today.

Analytics-ready is not the same as AI-ready

Two different dimensions, two different investments:

| | Analytics-ready | AI-ready |
| --- | --- | --- |
| Structure | Star schema, pre-aggregated | Semantically rich, contextual |
| Update frequency | Daily updates are sufficient | Frequent updates preferred |
| Metadata | Optional, helps the analyst | Mandatory; it is the context |
| Granularity | Aggregated | Full detail with context |
| Business rules | In the analyst’s head | Explicitly coded |

Most Norwegian data platforms are built for the first, not the second. That does not mean you have to rebuild everything. It means you have to know which data products must become AI-ready, and invest in building context for those.

Retrieval vs reasoning

David Effiong (Data Cult) distinguishes between two requirements:

  • Retrieval: do we get the right number from the right place? Solved by the data platform plus a clean semantic layer. Most modern technology stacks are good here.
  • Reasoning: does AI know what the numbers mean, when they can be trusted, and what they must not be confused with? This is solved by the context layer, and few have built anything here yet.

An AI system can fetch churn_rate for the last quarter. But to answer why, it needs the definition, knowledge of whether that definition recently changed, an overview of known data quality issues, and any relevant external factors. Without these, it still produces an answer, just not necessarily a correct one.

“Think of the semantic layer as a contract between your team and the AI agent. That contract has to be far more complete than when the main consumer was a human.” - David Effiong

The most effective measure: documenting what the team knows

A controlled experiment from Data Cult (Opeyemi Fabiyi, 2026) tested 13 business questions against an AI agent and added one context layer at a time, without changing the model:

| Iteration | Accuracy |
| --- | --- |
| Raw sources, no context | 0% |
| Modelled table, no enrichment | 0% |
| + rich column descriptions | 15% |
| + business rules | 77% |
| + metrics, examples, evaluation | 92% |

Figure: bar chart of accuracy across the five iterations. The entire accuracy improvement comes from context, not from a better model. Data: Fabiyi (2026), Data Cult.

The model was never changed. The entire improvement came from context. The biggest improvement came from writing down business rules that lived in the analysts’ heads.

A field slot_status with the values Open and Filled caused the agent to filter on slot_status = 'Available', a value that does not exist in the data. The query returned zero rows. No error, just a wrong answer that was hard to detect. A column description explaining that Open means available solved it immediately.

Nine dimensions for AI maturity

Modern Data 101 (Camila Barreto Lima) has developed a 9-dimension framework for assessing the AI maturity of individual data products. It extends the practical requirements from chapters 1 to 7:

| # | Dimension | The question the AI agent cannot ask itself |
| --- | --- | --- |
| 1 | Discoverability | Is the product searchable in a catalogue? |
| 2 | Addressability | Does it have a stable identifier that does not change? |
| 3 | Trustworthiness | Are quality and freshness documented? |
| 4 | Self-describingness | Can a consumer understand the product without asking anyone? |
| 5 | Interoperability | Can it be consumed across systems without bespoke work? |
| 6 | Security | Is access control designed into the solution, not attached afterwards? |
| 7 | Value | Who uses this for what? |
| 8 | Autonomy | Is it owned by a domain team without a centralised bottleneck? |
| 9 | Understandability | Does the consumer understand what they actually get, including limitations? |

Two dimensions are especially critical for AI consumption: Understandability (the agent cannot ask a colleague) and Trustworthiness (the agent gives no error message for stale data - just wrong answers delivered with confidence).

Pre-flight check before agent-based analysis

A data product with a low score on 1, 2 and 9 (not visible, no stable identifier, unclear semantics) gives inconsistent AI results. Use the matrix to rank the portfolio and prioritise effort:

  • Low Discoverability and Trustworthiness block the step toward data products.
  • Low Self-describingness and Understandability block the step toward agent-based analysis.

The maturity matrix puts the same questions into a structured form that lets you decide which products to make AI-ready.
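The two gates above can be sketched as a simple pre-flight check. The dimension names follow the table; the 1-5 scale and the threshold are illustrative assumptions, not part of the framework:

```python
# Illustrative pre-flight check over the nine dimensions.
DIMENSIONS = [
    "discoverability", "addressability", "trustworthiness",
    "self_describingness", "interoperability", "security",
    "value", "autonomy", "understandability",
]

def preflight(scores: dict[str, int], threshold: int = 3) -> list[str]:
    """Return blockers; an empty list means cleared for agent-based analysis."""
    blockers = []
    # Gate 1: low Discoverability or Trustworthiness blocks the data-product step.
    if scores["discoverability"] < threshold or scores["trustworthiness"] < threshold:
        blockers.append("not ready as a data product")
    # Gate 2: low Self-describingness or Understandability blocks agent-based analysis.
    if scores["self_describingness"] < threshold or scores["understandability"] < threshold:
        blockers.append("not ready for agent-based analysis")
    return blockers

orders = {d: 4 for d in DIMENSIONS} | {"understandability": 2}
print(preflight(orders))  # ['not ready for agent-based analysis']
```

Running this over the portfolio gives the ranking the section describes: which products clear both gates, and which need context work first.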

What this means for those of you who are building

You do not need to redo everything. These three measures give the best cost-benefit in an AI world:

| Measure | What it means in practice | Effect |
| --- | --- | --- |
| Column descriptions with meaning | “Open means the slot is available for booking”, not “varchar(20)” | The most effective measure in the Fabiyi experiment |
| Business rules explicitly coded | Default filters, known exceptions, definitions that vary across domains; what a new analyst learns in month 1 | +62 percentage points of accuracy in the same experiment |
| Context layer versioned as code | When a metric definition changes, the documentation changes with it | Prevents stale documentation from generating errors at scale without anyone noticing |
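A minimal sketch of the third measure: the metric definition lives in version control next to the model code, so the same commit that changes the definition also updates the documentation and changelog. The metric, versions, and changelog text are all illustrative:

```python
# Illustrative versioned metric definition, kept in the repo with the model code.
ACTIVE_CUSTOMER = {
    "version": 3,
    "definition": "Customer with at least one non-test order in the last 90 days.",
    "changelog": [
        {"version": 2, "change": "Window shortened from 180 to 90 days."},
        {"version": 3, "change": "Test customers excluded."},
    ],
}

def describe(metric: dict) -> str:
    """Render the current definition plus its most recent change for an agent."""
    latest = metric["changelog"][-1]
    return (f"v{metric['version']}: {metric['definition']} "
            f"(last change: {latest['change']})")

print(describe(ACTIVE_CUSTOMER))
```

Because the definition and its history are one artifact, an agent reading the context layer can never see a definition without also seeing that it recently changed.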

These are not new data products. They are what you already do in chapters 1 to 7, made explicit enough for an agent to read.

The next step is handling change in a product that several consumers, including agents, depend on: change and deprecation.




Magne Bakkeli

Magne Bakkeli is co-founder and senior advisor at Glitni. He has over 25 years of experience in data platforms, data governance and data architecture, and led the Data & Analytics team at PwC Consulting for 12 years. He has built and modernised data platforms across energy, FMCG, finance and media.