
Data Lakehouse | What is a Data Lakehouse?
28.08.2022 | 3 min read | Tag: #data lakehouse
As a concept, a data lakehouse is the lovechild of a data lake and a data warehouse: it is suited for storing and processing all forms of data for both reporting and analytics.
Data Lakehouse - the best properties from Data Lake and Data Warehouse
A Data Lakehouse is a data architecture that combines the advantages of traditional data warehouses and data lakes. It offers a unified platform for both raw data (structured and unstructured) and modelled data, making it possible to store large amounts of raw data (as in a data lake) while simultaneously performing complex analytical and transactional queries (as in a data warehouse).

Vendors such as Databricks, Snowflake, Azure, Google Cloud and AWS all support a comparable architecture, and a growing number of implementations worldwide can be described as “the modern data stack”.
There is always something new that can be called “modern”. Compared to how data warehouses and data lakes were originally built, these characteristics are perhaps the most important:
- Separation of processing and storage
- Scalability in terms of data volume, users and breadth of supported user stories
- Modularisation
In addition, table formats such as Apache Hudi, Apache Iceberg and Delta Lake enable logical database operations on data lake tables. Their ACID support means that we can, among other things, both modify and delete data, which we must be able to do to comply with GDPR requirements.
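To make the idea of modifying and deleting rows in a data lake more concrete, here is a minimal sketch in plain Python. It is a toy model, not the actual implementation of Hudi, Iceberg or Delta Lake: the class `ToyLakeTable`, its methods and the `part-0000` style file names are all hypothetical. It assumes a copy-on-write approach, where data files are never changed in place and each change is published as a new snapshot through a single append to a commit log.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


class ToyLakeTable:
    """Toy copy-on-write table: immutable 'files' plus an append-only commit log.

    Hypothetical illustration only; real table formats are far more involved.
    """

    def __init__(self) -> None:
        self._files: Dict[str, List[Row]] = {}  # file name -> rows (never mutated)
        self._log: List[List[str]] = [[]]       # version -> file names in that snapshot

    def _write_file(self, rows: List[Row]) -> str:
        name = f"part-{len(self._files):04d}"
        self._files[name] = [dict(r) for r in rows]
        return name

    def _rewrite(self, transform: Callable[[Row], Optional[Row]]) -> int:
        """Build a new snapshot, then publish it with a single log append."""
        snapshot: List[str] = []
        for name in self._log[-1]:
            old_rows = self._files[name]
            new_rows = [t for t in (transform(dict(r)) for r in old_rows) if t is not None]
            if new_rows == old_rows:
                snapshot.append(name)  # untouched file is reused as-is
            elif new_rows:
                snapshot.append(self._write_file(new_rows))  # rewritten copy
        self._log.append(snapshot)  # the "commit": readers see the old or the new version
        return len(self._log) - 1

    def append(self, rows: List[Row]) -> int:
        self._log.append(self._log[-1] + [self._write_file(rows)])
        return len(self._log) - 1

    def delete_where(self, predicate: Callable[[Row], bool]) -> int:
        return self._rewrite(lambda r: None if predicate(r) else r)

    def update_where(self, predicate: Callable[[Row], bool], changes: Row) -> int:
        return self._rewrite(lambda r: {**r, **changes} if predicate(r) else r)

    def read(self, version: int = -1) -> List[Row]:
        return [dict(r) for name in self._log[version] for r in self._files[name]]


# Usage: two appends, one update, one delete (for example an erasure request).
table = ToyLakeTable()
table.append([{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}])
table.append([{"id": 3, "email": "c@example.com"}])
v3 = table.update_where(lambda r: r["id"] == 3, {"email": "new@example.com"})
v4 = table.delete_where(lambda r: r["id"] == 2)
print(table.read())   # latest snapshot
print(table.read(2))  # snapshot as of version 2, before the update and delete
```

Note that in this toy model the deleted row is still reachable through the older snapshot (`table.read(2)`), so an actual erasure would also require removing old versions and their files.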
What implementations so far have in common is the breadth of user stories supported by one and the same architecture. Where we previously talked about, for example, a data warehouse, we now talk about data platforms where we can add and remove components and services as needs change. Data platforms are now also mostly built on a cloud service, primarily from Google Cloud, AWS or Azure. Reporting, machine learning/advanced analytics and real-time data are examples of user stories that can be supported by one and the same data platform.
Data platforms built on a data lakehouse architecture are unlikely to be the final destination this time either. New terms and concepts that are considered more modern or better than what was previously dominant will always emerge. There is much to look forward to. If you need assistance navigating the jungle of terminology, do not hesitate to contact us at Glitni!
Advantages and disadvantages of a data lakehouse
Below we summarise some important advantages and disadvantages of using a data lakehouse as storage for reporting and analytics:
| Advantages |
|---|
|
| Disadvantages |
|---|
|

