Interfaces first: sharing and data contracts

08.02.2026 | 6 min Read
Tags: #data products, #data contract

The data product is the interface - the engine room should be refactorable without consumers noticing.

The data product is the interface

A data product is an interface towards consumption. Everything behind the interface is implementation, and should be free to change without making consumers nervous.

In practice this is the difference between:

“we daren’t touch anything, because everything is intertwined”
“we can improve the engine room without phoning half the organisation”

If you take the interface seriously, sharing becomes easier, change becomes less dramatic, and people stop creating copies “just in case”.

Outside vs inside

Outside (product surface) is what you stand behind over time: semantics, keys, time logic, recommended consumption path, change practice and expectations.

Inside (engine room) is everything you should be able to change frequently: staging, intermediate layers, pipelines, optimisation and refactoring.

The interface is the contract. The inside can be changed freely as long as the output contract is upheld.
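As a minimal sketch of what upholding the output contract can look like, the Python below checks any pipeline output against a declared surface. The names `OutputContract`, `violations`, `pipeline_v1` and `pipeline_v2`, and the columns `customer_id` and `is_active`, are hypothetical and chosen for illustration; they do not come from a specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputContract:
    """Hypothetical description of what the product surface promises."""
    required_columns: dict[str, type]  # column name -> expected Python type
    key: tuple[str, ...]               # columns that must uniquely identify a row


def violations(rows: list[dict], contract: OutputContract) -> list[str]:
    """Return readable contract violations; an empty list means the output conforms."""
    problems: list[str] = []
    seen_keys: set[tuple] = set()
    for i, row in enumerate(rows):
        for name, expected in contract.required_columns.items():
            if name not in row:
                problems.append(f"row {i}: missing column {name!r}")
            elif not isinstance(row[name], expected):
                problems.append(f"row {i}: {name!r} is not {expected.__name__}")
        key = tuple(row.get(k) for k in contract.key)
        if key in seen_keys:
            problems.append(f"row {i}: duplicate key {key}")
        seen_keys.add(key)
    return problems


CONTRACT = OutputContract(
    required_columns={"customer_id": str, "is_active": bool},
    key=("customer_id",),
)


def pipeline_v1() -> list[dict]:
    """Original engine room (illustrative)."""
    return [{"customer_id": "c1", "is_active": True}]


def pipeline_v2() -> list[dict]:
    """Refactored engine room: different staging format, same surface."""
    staged = [("c1", 1)]
    return [{"customer_id": cid, "is_active": bool(flag)} for cid, flag in staged]
```

Both pipelines pass the same check, which is the point: the engine room changed, the surface did not.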

Table, model, dataset and dashboard

Table/view is a delivery format. Some tables are product surface. Many are internals.
Model is how you produce the deliverable. The model is rarely a promise to the consumer.
Dataset is a useful umbrella term, but imprecise as a product surface if it does not state what is official.
Dashboard is presentation. It is typically a consumer, not a data product.

The point is to be able to say: “This is the supported surface. The rest is engine room.”

One product, multiple consumer surfaces

The most common trap in organisations that have made some progress with data products is to build parallel products for different consumers. Operations needs low latency and builds its own product. Analytics needs flexible querying and builds its own. Compliance needs an auditable format and builds its own.

The result is that “Customer 360” exists in three variants with three slightly different definitions. You have solved the reuse problem for technology, but reintroduced it for semantics.

One core, multiple outputs

A mature data product is built around one shared definition, and exposes multiple consumer surfaces from the same managed core:

              ┌─ Operations (near-real-time API)
Shared core ──┼─ Analytics (flexible querying via tables)
              └─ Compliance (standardised, auditable export)

Only the output changes, not the definition. The Customer 360 product can serve marketing segmentation, sales qualification, customer success health and AI personalisation simultaneously, from the same managed core, through different output interfaces.

Take Customer 360 - Core from the previous chapters as an example: one definition of “customer”, one logic for keys and validity. Exposed as a view for analytics, an API for operational systems, and an export for compliance. The definition lives in one place. The consumer surfaces are several.
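A minimal sketch of "one definition, several surfaces" in Python. The records, field names and the functions `is_active`, `analytics_view`, `operations_api` and `compliance_export` are assumptions made for illustration, and the validity rule is only an example of a shared definition, not the actual Customer 360 - Core logic.

```python
import csv
import io
import json
from datetime import date

# Hypothetical core records: one definition of "customer", one validity logic.
CORE = [
    {"customer_id": "c1", "name": "Acme", "valid_from": date(2024, 1, 1), "valid_to": None},
    {"customer_id": "c2", "name": "Birch", "valid_from": date(2023, 5, 1), "valid_to": date(2024, 6, 30)},
]


def is_active(record: dict, on: date) -> bool:
    """The single shared definition; every surface calls it, none re-implements it."""
    ends = record["valid_to"]
    return record["valid_from"] <= on and (ends is None or on <= ends)


def analytics_view(on: date) -> list[dict]:
    """Table-like surface for flexible querying."""
    return [
        {"customer_id": r["customer_id"], "name": r["name"], "is_active": is_active(r, on)}
        for r in CORE
    ]


def operations_api(customer_id: str, on: date) -> str:
    """JSON payload for one customer, as an operational API might return it."""
    row = next(r for r in analytics_view(on) if r["customer_id"] == customer_id)
    return json.dumps(row, sort_keys=True)


def compliance_export(on: date) -> str:
    """Fixed-column CSV export of the same rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["customer_id", "name", "is_active"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(analytics_view(on))
    return buffer.getvalue()
```

All three surfaces derive from `is_active`, so a change to the definition is made once and reaches every consumer in the same form.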

Protect the core

As usage increases, new teams push for adaptations: “can we get a variant where active customer excludes those with only one order?”, “can we get a field that flags internal customers differently for our use?”.

If you say yes too often, the core erodes. You get duplicated transformation and semantic drift, and you are back in copy hell. Mature data products isolate use-specific logic from the foundation model: the core stays stable and managed, while the use-specific logic lives in separate logical views layered on top of it.

Protecting the core is not rigidity. It is what makes lasting reuse and long-term trust possible.
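One way to picture this separation, sketched in Python under the assumption that customers are plain records with hypothetical `is_active` and `order_count` fields: the request "active customers, excluding those with only one order" becomes a view on top of the core rather than a change to the core.

```python
def core_active_customers(customers: list[dict]) -> list[dict]:
    """Core definition of 'active customer': managed, stable, shared by everyone."""
    return [c for c in customers if c["is_active"]]


def repeat_active_customers(customers: list[dict]) -> list[dict]:
    """Use-specific view owned by the requesting team.

    It reuses the core definition and only adds its own filter,
    so the core itself stays untouched.
    """
    return [c for c in core_active_customers(customers) if c["order_count"] > 1]


# Illustrative sample data.
SAMPLE = [
    {"customer_id": "c1", "is_active": True, "order_count": 1},
    {"customer_id": "c2", "is_active": True, "order_count": 4},
    {"customer_id": "c3", "is_active": False, "order_count": 9},
]
```

The core still reports `c1` and `c2` as active; only the team that asked for the variant sees the narrower result.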

The pattern is also why AI agents get consistent answers across channels: one definition of “Revenue”, regardless of whether the question comes via Slack, a dashboard or an API. Chapter 8 on AI-ready data products covers why this matters in an AI world.

How do we share data products within domains, across domains and externally?

A data product that cannot be shared is, in practice, an internal deliverable. And sharing is more than “publishing in the catalogue”.

The sharing journey: what needs to be in place?

Sharing consists of three steps:

Find
Understand
Get access

Many organisations improve only step 1 and are surprised that self-service does not happen. The catalogue looks nicer. Day-to-day life does not get easier.

Check whether you actually offer a complete sharing journey before discussing tools and catalogues.

On the product side, you must answer three questions:

What is this (and what is it not)?
How do I use it? (recommended consumption path + example)
What can I expect? (access, refresh, quality signal, change)

If you cannot answer these briefly, sharing ends up as private messages and verbal explanations. Social capital as an integration layer.
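The three questions can be made checkable. The sketch below is an assumption about how a product page might be modelled, not a catalogue feature: `ProductCard` and its field names are hypothetical, and `unanswered` simply lists which answers are still blank.

```python
from dataclasses import dataclass, fields


@dataclass
class ProductCard:
    """Hypothetical short-form product page: one field per answer."""
    # What is this (and what is it not)?
    what_it_is: str = ""
    what_it_is_not: str = ""
    # How do I use it?
    consumption_path: str = ""
    example: str = ""
    # What can I expect?
    access: str = ""
    refresh: str = ""
    quality_signal: str = ""
    change_practice: str = ""

    def unanswered(self) -> list[str]:
        """Names of the fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


card = ProductCard(
    what_it_is="One row per customer with stable keys.",
    consumption_path="Query the published view.",
    access="Request via the catalogue.",
)
```

A non-empty `card.unanswered()` is a hint that consumers will end up asking in private messages instead.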

Sharing within a domain

Within a domain, people have more context, a shared vocabulary and a shorter path to sources. A simple practice works, as long as it is clear:

shared entry point in the catalogue
clear distinction between product surface and component
short path to a point of contact
simple change practice

This is not about “more documentation”. It is about the right type of signal in the right place.

Sharing across domains: meaning before schema

Across domains, context disappears. What was “obvious” suddenly becomes a source of misunderstanding.

You then need to be precise about:

terms (semantics)
keys and time logic
constraints
change/notification

Sharing without shared terms rarely produces reuse. It produces copies.

If you only share schema, you often just share the ability to misunderstand faster.

Take Customer 360 - Core as an example: it is precisely across domains that “who is a customer?” stops being obvious. The CRM team, customer service and sales analytics all have their own variant. A Type A product with stable semantics and controlled keys is the answer, not yet another join.

External sharing

If you share externally, expectations change: the cost of errors becomes more apparent, and compliance is often what drives product surface and access decisions.

Contract, change and support must be explicit. A good sign of maturity is that you can answer:

Who do we notify when something changes?
What counts as “breaking” for external consumers?
Who receives support enquiries, and what is the normal response time?

What counts as breaking, and how to handle transitions, is covered in chapter 9 on change and deprecation.
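To make the "what counts as breaking" question concrete, here is a small Python sketch that compares two versions of a surface schema. The rule it applies (removed columns and changed types are breaking, added columns are additive) is a common convention assumed here for illustration, not a definition taken from this series; `classify_change` and the column names are hypothetical.

```python
def classify_change(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {column: type} schemas under the assumed rule above."""
    # Assumption: a consumer may depend on any existing column and its type.
    breaking = [f"removed: {c}" for c in old if c not in new]
    breaking += [
        f"type changed: {c} ({old[c]} -> {new[c]})"
        for c in old
        if c in new and old[c] != new[c]
    ]
    # Assumption: new columns do not disturb existing consumers.
    additive = [f"added: {c}" for c in new if c not in old]
    return {"breaking": breaking, "additive": additive}


OLD = {"customer_id": "string", "segment": "string"}
NEW = {"customer_id": "string", "segment": "int", "region": "string"}
```

A non-empty `breaking` list is the signal to notify external consumers before release rather than after.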

Data contract: use it when risk and dependency warrant it

Data contracts are not a goal in themselves. They are a tool for predictability when “hoping for the best” becomes too expensive.

Consider introducing contracts when:

multiple teams are dependent
the cost of errors is high
change happens frequently

The more dependencies, the higher the risk and the more frequent the change, the greater the likelihood that you need a data contract. In many cases, a “contract light” suffices for a long time: the fields people misunderstand, keys/time logic, and change practice. If you want to standardise the format later, the Open Data Contract Standard (ODCS) is an established specification to build on.
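The three triggers can be read as a rough scoring rule. The Python sketch below is one possible reading; the function name `contract_recommendation` and every threshold in it are invented for illustration and should be adjusted to your own context.

```python
def contract_recommendation(
    dependent_teams: int, error_cost: str, changes_per_quarter: int
) -> str:
    """Suggest a contract level from the three triggers (illustrative thresholds)."""
    if error_cost not in {"low", "high"}:
        raise ValueError("error_cost must be 'low' or 'high'")

    triggers = 0
    if dependent_teams >= 2:      # assumption: "multiple teams" means two or more
        triggers += 1
    if error_cost == "high":
        triggers += 1
    if changes_per_quarter >= 3:  # assumption: "frequently" means three or more per quarter
        triggers += 1

    if triggers == 0:
        return "no contract yet"
    if triggers == 1:
        return "contract light"
    return "full contract"
```

For example, `contract_recommendation(3, "low", 0)` lands on "contract light": several teams depend on the product, but errors are cheap and change is rare.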

The next step is to make the product visible and usable in the catalogue: see “In practice: product page, catalogue and components”.


Magne Bakkeli

Magne Bakkeli is co-founder and senior advisor at Glitni. He has over 25 years of experience in data platforms, data governance and data architecture, and led the Data & Analytics team at PwC Consulting for 12 years. He has built and modernised data platforms across energy, FMCG, finance and media.
