Rulebook and portfolio

08.02.2026 | 7 min Read
Tags: #data products, #data mesh, #data governance, #data product owner

Keep the word 'data product' useful: product vs component, lifecycle, domains and ownership.

Rulebook: What we mean when we say “data product”

The term loses value when everything is called a product. The rulebook below is designed to keep the word data product practically useful: few products with high signal value, many components without false promises.

Ten rules that separate product from component

1 No governance –> no product status. If you do not have the capacity to uphold a promise over time, “product” is just an expensive label.

2 The owner must be named — and have decision-making authority. “Owner” means: who can decide on definitions, prioritisation and change — and who responds when something breaks?

3 The data product must have named customers. If the target audience is “everyone”, it is a sign that you do not know who uses it.

4 The data product must have a clear use case. Not “for analytics”, but “to make X possible for Y”.

5 The promise must be clear enough that others dare to build on it. Access, refresh, quality signals and change practices are usually sufficient.

6 Product surface first, implementation second. What you promise is the interface towards usage. The engine room should be refactorable without everything becoming an incident.

7 Critical dependencies are part of the responsibility. If the promise depends on pipelines, reference tables or quality jobs, the product team must own the consequences, even if the implementation may reside in several places.

8 Quality means “good enough for use”. A few tests that target the actual usage beat many generic tests you never act on.

9 Usage must be visible. If you do not know what is being used, you cannot prioritise, govern or deprecate in an orderly way either.

10 Lifecycle must be defined. “Pilot”, “active”, “deprecating” and “retired” is a simple way to prevent everything from lingering forever.

The fewer real data products, the higher the trust in them.

“Team” and “owner” do not mean a Slack channel

Product status only makes sense when you can point to four things — briefly, concretely and without hero stories:

  • Mandate: who makes decisions about definitions and change?
  • Capacity: who has the capacity for incidents and requests?
  • Contact point: where do customers get answers without being dependent on a specific person?
  • Run cost: who owns the prioritisation of operations/compute/support? (It is sufficient to own the prioritisation. You do not need to charge back every query.)

If you do not have this, it is better to call the deliverable a component for now.

Lifecycle: make status visible

Product vs component is about what something is. Lifecycle is about how safe it is to build on right now, and what happens when it changes. Without a visible lifecycle status, the catalogue quickly becomes a list of “things that exist”, and sharing becomes person-dependent: people ask in private messages whether this is supported, whether the definitions are stable, or whether it is on its way out.

A simple status on the product page is enough to make the catalogue more truthful:

  • Draft – idea/work in progress, no promises
  • Pilot – a few customers, limited promise
  • Active – the product promise applies
  • Deprecating – replacement is defined, deadline and migration are described
  • Retired / downgraded – no longer a product (component or removed)

The point is not to introduce a governance regime. The point is to make expectations explicit before someone builds two quarters’ worth of logic on something you were actually planning to discard. Fewer surprises, clearer prioritisation, tidier clean-up.
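As a minimal sketch, the status list above could be modelled as a small enumeration that a catalogue page renders. The names `Lifecycle`, `catalogue_badge` and `safe_to_build_on` are hypothetical, and the idea that only Active carries the full promise is an assumption for illustration, not a rule from the list.

```python
from enum import Enum


class Lifecycle(Enum):
    # Values paraphrase the status descriptions in the list above.
    DRAFT = "idea/work in progress, no promises"
    PILOT = "a few customers, limited promise"
    ACTIVE = "the product promise applies"
    DEPRECATING = "replacement defined, deadline and migration described"
    RETIRED = "no longer a product (component or removed)"


# Assumption: only Active carries the full promise. Building on a Pilot
# is something to agree with the owner rather than take for granted.
FULL_PROMISE = {Lifecycle.ACTIVE}


def catalogue_badge(name: str, status: Lifecycle) -> str:
    """Render the one-line status a product page could show."""
    return f"{name} [{status.name.title()}]: {status.value}"


def safe_to_build_on(status: Lifecycle) -> bool:
    """True when the full product promise applies right now."""
    return status in FULL_PROMISE
```

A call such as `catalogue_badge("orders", Lifecycle.PILOT)` answers in the catalogue what people would otherwise ask in a private message.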

A deliverable is promoted when there are named consumers, capacity to uphold the promise and a chosen product surface. It is downgraded when there is no usage, no owner or changed domain responsibility.
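The promotion and downgrade criteria can be written down as two small checks. This is a sketch under assumptions: the `Deliverable` record and its field names are hypothetical, and how you establish facts such as "has capacity" is left to your own process.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Deliverable:
    # Hypothetical fields, one per criterion named in the text.
    named_consumers: list[str] = field(default_factory=list)
    has_capacity: bool = False             # capacity to uphold the promise
    product_surface: Optional[str] = None  # the chosen interface, if any
    has_usage: bool = False
    owner: Optional[str] = None
    domain_responsibility_changed: bool = False


def should_promote(d: Deliverable) -> bool:
    """Named consumers, capacity and a chosen product surface."""
    return bool(d.named_consumers) and d.has_capacity and d.product_surface is not None


def should_downgrade(d: Deliverable) -> bool:
    """No usage, no owner, or changed domain responsibility."""
    return (not d.has_usage) or d.owner is None or d.domain_responsibility_changed
```

Keeping the two checks separate is deliberate: a deliverable can fail the promotion check without qualifying for a downgrade, for example a well-used component that simply has no chosen product surface yet.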

Three tiers that make the portfolio more pragmatic

In practice, you need a language for three types of deliverables:

  • Data product: managed product surface with customers, a promise and a lifecycle
  • Component: building block (table, model, pipeline) without a product promise
  • Experiment: temporary deliverable for learning, can be upgraded later

The litmus test: If you remove the deliverable tomorrow, do you know who would miss it? Do they know who to call?


Portfolio management: product, component or experiment?

Once you have a language, you can manage the portfolio. Two steps that deliver the most impact per calorie:

  1. Give product status to the few things that are actually important and shared.
  2. Give component status to the rest — but make ownership and purpose visible.

Data product or component? A simple decision table

| Situation | Data product when… | Component when… |
|---|---|---|
| Reuse | multiple teams build on it | one team uses it locally |
| Risk of failure | failure is costly (money/compliance/governance) | failure is mostly annoying |
| Change | change must be handled in a controlled manner | change tolerates more ad hoc |
| Customers/value | you can point to customers and purpose | “maybe someone needs this” |
| Governance capacity | you can uphold a promise | you do not have the capacity (and that is ok) |
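The decision table can be turned into a small helper for portfolio reviews. This is a sketch: the row keys and the `classify` function are hypothetical, and the strict rule that every row must point to "data product" is an assumption, since the table does not say how to weigh rows against each other.

```python
# Row keys follow the decision table. True means the "data product"
# column describes the deliverable, False means the "component" column.
ROWS = (
    "reuse",
    "risk_of_failure",
    "change",
    "customers_value",
    "governance_capacity",
)


def classify(answers: dict[str, bool]) -> tuple[str, list[str]]:
    """Return a suggested label and the rows that argue for 'component'.

    Assumption: one 'component' answer is enough to withhold product
    status. A softer weighting would be an equally valid reading.
    """
    unanswered = [row for row in ROWS if row not in answers]
    if unanswered:
        raise ValueError(f"unanswered rows: {unanswered}")
    component_rows = [row for row in ROWS if not answers[row]]
    label = "component" if component_rows else "data product"
    return label, component_rows
```

Returning the rows alongside the label keeps the conversation concrete: the team sees which row held the deliverable back, not just the verdict.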

Components: minimum practice

Every component should state at least:

  • which domain it belongs to
  • who owns it (team) and contact point
  • one sentence about purpose
  • classification (especially for sensitivity/PII)
  • simple lifecycle status (active / under change / being phased out)
  • lineage/dependencies (roughly)

When there are named customers who would complain if this changed without notice, you are in data product territory.
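The component minimum above is small enough to check automatically when an entry is registered. A minimal sketch, assuming a catalogue entry is a plain dictionary; the field names and `validate_component` are hypothetical, while the three lifecycle values are the ones listed above.

```python
# One key per item in the minimum-practice list (names are hypothetical).
REQUIRED_FIELDS = (
    "domain",
    "owner_team",
    "contact_point",
    "purpose",
    "classification",
    "lifecycle",
    "dependencies",
)

ALLOWED_LIFECYCLE = {"active", "under change", "being phased out"}


def validate_component(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the minimum is met."""
    # An empty dependency list is a valid answer, so only None and ""
    # count as missing.
    problems = [
        f"missing: {name}"
        for name in REQUIRED_FIELDS
        if entry.get(name) in (None, "")
    ]
    lifecycle = entry.get("lifecycle")
    if lifecycle and lifecycle not in ALLOWED_LIFECYCLE:
        problems.append(f"unknown lifecycle: {lifecycle}")
    return problems
```

The check only confirms that the fields are filled in. Whether the purpose sentence is any good is still a matter for review.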


Ownership and domains: who decides, and who governs?

This is where many “data product” initiatives fall short: people agree on a definition, but not on who will stand behind it when everyday reality kicks in.

What do we mean by “domain” here?

A domain is an area where someone has the mandate to define concepts and rules, because they own the process and the consequences.

This does not mean the domain map must be perfect before you start. It means you need an address for disagreement.

Data ownership: the part you cannot avoid

Two things that are often conflated:

  1. Business ownership of semantics and rules: who decides the definition when there is disagreement?

  2. Product/operational ownership of the product surface: who manages the promise to the customers (access, quality signals, change and support)?

This can be the same team or split, but the responsibility must be clear. Otherwise you get a lot of “we thought you owned it”.

Platform team vs product team

  • The platform team delivers standards and “golden paths” (build, test, deploy, catalogue, access, observability).
  • The product team owns semantics, product surface, prioritisation and change.
  • The domain owner/business shows up when definitions need to be settled.

Scale: how many data products can you sustain?

As many as you can run with quality: each one with customers, a promise and a team that responds.

The number of data products should not be driven by the number of tables, but by the number of product surfaces you have the capacity to keep stable over time.

  • Too few –> monolith: one large deliverable nobody dares to change
  • Too many –> low signal value: hard to find “the recommended one”

The latter is more common than the former. The result typically looks like this: 1,000 entries in the catalogue, 13 real products — and nobody can find them, because the other 987 components claim to be the same thing. The catalogue gives zero confidence, and people ask in private messages instead.

To make usage visible (rule 9), choose one or two usage signals and start simply: consumption in the BI/semantic layer, query/access logs on product surfaces, the number of named consumer environments in the catalogue/README, or upstream/downstream dependencies in pipelines.
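One of these signals, access logs on product surfaces, can be summarised in a few lines. This is a sketch under assumptions: the log is taken to be a sequence of `(surface, consumer)` pairs, and both function names are hypothetical. Real access logs will need parsing and filtering of service accounts first.

```python
from collections import defaultdict
from typing import Iterable


def consumers_per_surface(access_log: Iterable[tuple[str, str]]) -> dict[str, int]:
    """Count distinct consumers per product surface."""
    seen: dict[str, set[str]] = defaultdict(set)
    for surface, consumer in access_log:
        seen[surface].add(consumer)
    return {surface: len(consumers) for surface, consumers in seen.items()}


def unused_surfaces(catalogue: Iterable[str], access_log: Iterable[tuple[str, str]]) -> list[str]:
    """Catalogue entries with no consumer in the log, sorted by name."""
    used = consumers_per_surface(access_log)
    return sorted(name for name in catalogue if used.get(name, 0) == 0)
```

Entries that show up in `unused_surfaces` are candidates for a downgrade conversation, not automatic removals: a surface can be unused in the log window and still matter at quarter end.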

Call something a data product and it means something — or it means nothing. That is the choice.

Typology: four common data product types

A typology makes it easier to choose the right level of expectations, and to avoid everything ending up in the same “generic dataset” box.

| Type | What it is | Typical customers | Typical product surface | Typical promise |
|---|---|---|---|---|
| A: Master and reference data | Identity and references that must be consistent (customer, product, org, calendar) | Many domains | Table/view + history, optionally API | Stable identity + controlled change |
| B: Domain foundation (events/facts) | Orders, payments, measurements, returns (with time logic) | Multiple teams | Table/view, event feed, API | Semantics + keys + time logic |
| C: Use-case-oriented data products | Built for a process/end result (impact, planning, risk) | Product/process teams | Dataset, metrics, feature store | Fit-for-use for one consumption surface |
| D: External data products | Shared with partners/customers | External | API, export, event feed | Contract, support, strict change management |

In the next chapter, Business canvas and MVDP, we use a Type A product (Customer 360 – Core) as the running example.




Magne Bakkeli

Magne Bakkeli is co-founder and senior advisor at Glitni. He has over 25 years of experience in data platforms, data governance and data architecture, and led the Data & Analytics team at PwC Consulting for 12 years. He has built and modernised data platforms across energy, FMCG, finance and media.