Trends | The Landscape for Machine Learning, AI and Data (MAD) - 2023

11.03.2023 | 2 min Read
Category: Data Market | Tags: #data platform, #data lakehouse, #dataops, #architecture, #trends

Every year, Matt Turck of the venture capital firm FirstMark publishes the landscape for Machine Learning, AI and Data (MAD). Personally, I find this overview, along with its accompanying commentary, far more useful than Gartner's technology reports in the same space. The 2023 version has arrived, and the landscape is as comprehensive as ever.

You can find a PDF version here.

Consolidation and bundling

Years of generous funding have left the market with a myriad of small technology players. The large players will acquire some of these (they are no longer expensive, after all) and expand their own capabilities. Databricks and Snowflake, to name two, are both adding new features at a rapid pace.

The modern data stack is under pressure

Many tools have emerged, and we now distinguish between separate categories for ingestion, processing, transformation, storage, data monitoring, data catalogue, and so on. For a number of organisations, the complexity has become unnecessarily high, especially when the use cases are relatively straightforward. Running and maintaining all of these tools is also rather expensive. The prediction is that more packaged solutions that reduce complexity will appear going forward (for example Y42 and Mozart Data).

The end of the road for ETL?

I am a bit unsure about this one. However, Amazon has introduced a concept they call “Zero ETL”, where data sources come with built-in connectors to the analytical databases. As soon as transactional data arrives in the operational database, it also becomes available in the analytical database, without a separate pipeline to build and maintain. Perhaps we will soon see similar data integrations from the other large players (Microsoft, Salesforce, etc.)?

Data mesh, data products and data contracts attempt to manage organisational complexity

Many large organisations have a multitude of different technologies, data sources and teams working with data. Not everything is equally streamlined. There are three concepts or philosophies that have considerable momentum right now: data mesh, data products and data contracts. The latter is perhaps the newest on the list: API-like agreements between the developers who own the source services and the consumers of their data. Such agreements are particularly important when data needs to be exchanged in near real time.
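To make the idea of a data contract a little more concrete, here is a minimal sketch in Python. It assumes a contract is simply a named, versioned list of expected fields and types that a producer promises to deliver. All names here (`FieldSpec`, `DataContract`, `order_created` and its fields) are hypothetical and made up for illustration; real-world contracts are often expressed in schema formats and enforced in the pipeline rather than in ad hoc code like this.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldSpec:
    """One field the producer promises to deliver (hypothetical structure)."""
    name: str
    dtype: type
    required: bool = True


@dataclass(frozen=True)
class DataContract:
    """A named, versioned agreement between a data producer and its consumers."""
    name: str
    version: str
    fields: tuple[FieldSpec, ...]

    def violations(self, record: dict[str, Any]) -> list[str]:
        """Return a list of human-readable contract violations for one record."""
        problems = []
        for spec in self.fields:
            value = record.get(spec.name)
            if value is None:
                # A missing (or null) value only matters for required fields.
                if spec.required:
                    problems.append(f"missing required field: {spec.name}")
                continue
            if not isinstance(value, spec.dtype):
                problems.append(
                    f"wrong type for {spec.name}: "
                    f"expected {spec.dtype.__name__}, got {type(value).__name__}"
                )
        return problems


# Hypothetical contract for an "order created" event.
ORDER_CREATED = DataContract(
    name="order_created",
    version="1.0",
    fields=(
        FieldSpec("order_id", str),
        FieldSpec("amount", float),
        FieldSpec("coupon_code", str, required=False),
    ),
)

good_record = {"order_id": "A-1", "amount": 99.5}
bad_record = {"order_id": 17, "coupon_code": "X"}

print(ORDER_CREATED.violations(good_record))
print(ORDER_CREATED.violations(bad_record))
```

The point of the sketch is that the check can run on the producer side, before data is published, so that a breaking change surfaces with the team that caused it rather than in a downstream dashboard.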

Convergence between different types of storage and data flow

Some players are attempting to blur the distinction between real-time and batch data flows (i.e. more and more data flows are moving towards real time), as well as the distinction between operational and analytical databases. Examples include Google with AlloyDB and Snowflake with Unistore.

AI entering the workflow for data and analytics (a bit meta)

Large language models, as exemplified by ChatGPT, are making their way into analysts’ work. They can translate natural language into SQL code, and integrated functionality is also arriving in BI tools such as Power BI and Tableau.

Here you can find the full report.


Magne Bakkeli

Magne has over 20 years of experience as an advisor, architect and project manager in data & analytics, and has a strong understanding of both business and technical challenges.