Data Warehouse | What is a Data Warehouse?

28.08.2022 | 5 min Read
Tag: #data warehouse

A data warehouse is well suited for supporting reporting with data from many sources. A data warehouse seeks to organise data in a subject-oriented manner, so that the data is structured and suitable for analytical processing. A data warehouse is used to collect relevant data needed for various management purposes.

In a data warehouse, data is often stored multiple times
In a data warehouse, data is often stored multiple times

“A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process”

 

Bill Inmon (often referred to as the founder of the data warehouse concept)

A data warehouse can be built on different principles

The purpose of a data warehouse can in some organisations aim to preserve the organisation’s total knowledge, without evaluating the analytical need (the Inmon model). Other times, more targeted data warehouses are built to serve a specific purpose – for example Customer Churn Analysis, Project Reporting or Profitability/ABC (the Kimball model). The latter are often referred to as Data Marts.

Typically, a data warehouse contains consolidated data from many source systems, and an important task is to ensure sufficient data quality so that you get a reliable basis for analyses and reports. Data warehouses often contain both the raw data and the fully transformed data in a common data model. Because many line-of-business systems do not retain history, this is often also the data warehouse’s responsibility. If you want to find trends and build forecasts/scenarios, good historical data is a prerequisite.

There are many types of data warehouses, both virtual (only logically defined, where the information resides in the respective sources) and physical. Confusingly, a data warehouse can be built on traditional relational databases or specialised data warehouse engines. We transform the data in multiple stages into a form that makes it easy to create reports and analyses.

A data warehouse has a setup where we have one or more layers of data processed to be stored in a common data model, and then data processed and stored for various purposes on top. We store data at least twice in such a model, both in the raw data layer and in the common data model. This can increase costs, but can provide good performance for users when data is to be retrieved.

Common use cases for data warehouses

Data warehouses are often used to support the generation of detailed reports based on structured data of high quality. This type of data platform is widely used by executives, business analysts and operational users where there is a need for data that can quickly answer common business questions, or as a basis for regular reports and dashboards. The data is well structured, easy to understand and simple to use without further processing. The data typically comes from business systems that many people use, such as the finance or CRM system.

Here are some examples of data warehouse usage across different industries:

  • Retail: Data warehouses are used to analyse sales trends over time, identify seasonal variations in demand, and optimise inventory based on predictable customer behaviour.
  • Banking and finance: Data warehouses help analyse customer portfolios, track transaction history to detect suspicious activity, and evaluate credit risk based on historical data.
  • Healthcare: Institutions use data warehouses to analyse patient data, improve patient care through pattern recognition, and optimise resource allocation across different departments and facilities.

What are data marts?

When it comes to discussions about data warehouses, you may hear the term data mart. A data mart is a subset of a data warehouse that is focused on a specific line of business or a particular subject. For example, marketing teams often use a data mart for customer data since they need fast access to standardised customer data.

Data marts are typically built as an extension of an existing data warehouse, but they can also be built as a standalone entity. The process follows how data warehouses are created, but with a more specialised end result.

Data modelling in a data warehouse

The data in a data warehouse is modelled and ready for use. The concept of data modelling in a data warehouse setting is about designing how data will be stored, organised and made available. The goal is to ensure that data is structured in a way that enables efficient analysis and reporting.

Data models are a fundamental element in a data warehouse. A data model is a description of how data is structured, and in what form the data will be stored. A data model provides a framework of relationships between data elements, in addition to guidance on the use of the data.

Dimension tables and fact tables are central concepts in data modelling for a data warehouse. Dimension tables can simply be described as what gives the numbers context (for example “Calendar” or “Location”), while fact tables contain the actual key metrics (for example “Revenue”).

Do you need a Data Engineer? Our skilled consultants can help you with data modelling, data flows and data architecture.

Advantages and disadvantages of a data warehouse

Below we summarise some important advantages and disadvantages of using a data warehouse for storage for reporting and analytics:

Advantages
  • Is an effective way to deliver insights to everyday users through self-service analytics, reports and dashboards – also known as Business Intelligence
  • Helps create a single source of truth for data through a common storage solution
  • Can help improve data standardisation and data quality
Disadvantages
  • Relatively limited ability to handle semi-structured and unstructured data
  • Ingesting and modelling new data sources can take time, and can be perceived by analysts and data scientists as cumbersome and time-consuming – they want more flexibility
  • Requires modelling to a large extent up-front, which can be perceived as an investment. Can be costly to implement and maintain

Learn more



author image

Magne Bakkeli

Magne has over 20 years of experience as an advisor, architect and project manager in data & analytics, and has a strong understanding of both business and technical challenges.