Data mesh: What it is and why you should care



This article was contributed by Bruno Aziza, Head of Data and Analytics at Google Cloud

“Data mesh” is a term that vendors, educators, and data pundits seem to have rallied around to describe one of the most disruptive trends in data, AI, and analytics. According to Google Trends, in 2021 “data mesh” overtook “data lakehouse,” which until then had been by far the industry’s most popular term.

Simply put, if you work in technology, you won’t be able to escape the data mesh in 2022.

Data mesh: A simple definition

The data mesh concept originates from a paper written in May 2019 by Zhamak Dehghani. In it, the ThoughtWorks consultant describes the limitations of centralized, monolithic, and domain-agnostic data platforms.

These platforms often take the form of proprietary enterprise data warehouses containing “thousands of unmaintainable ETL jobs, tables and reports that only a small group of specialized people understand, resulting in an under-realized positive impact on the business,” or of complex data lakes managed by a central team of “hyper-specialized data engineers” who, in the best case, have enabled pockets of R&D analytics, according to Dehghani. The latter case is often referred to as a “data swamp”: a data lake where all kinds of data sit stagnant, unusable, and ultimately useless.

The data mesh intends to address these problems by focusing on domain-driven design to strike a balance between centralization and decentralization of data and metadata management, and by guiding leaders toward a “modern data stack.”

One of the best explanations and implementations of the data mesh concept I’ve read so far comes from a two-part series by L’Oreal CIO Francois Nguyen entitled “Towards the Data Mesh” (Part 1, Part 2).

If you haven’t read it yet, stop everything and do it now. There is no better guide than a practitioner who tests theories in practice and reports real-world findings from his data journey. Francois’s first post is full of useful guidance on how to staff your data team and orient the organization. His “part deux” provides precise, field-tested technical guidance on how to implement a data mesh successfully.

Remember that a data mesh is more than a technical architecture; it’s a way of organizing around data ownership and activation. Used successfully, the data mesh becomes the foundation of a modern data stack built on six main principles: for your data mesh to work, the data must be 1) discoverable, 2) addressable, 3) trustworthy, 4) self-describing, 5) interoperable, and 6) secure.
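To make the six principles concrete, here is a minimal sketch of what a “data product” descriptor and a compliance check might look like. Everything here is illustrative — the field names, URI prefixes, and validation rules are my assumptions, not any vendor’s API or Dehghani’s specification.

```python
from dataclasses import dataclass

# Hypothetical sketch: a data product descriptor whose fields map onto the
# six principles -- discoverable, addressable, trustworthy, self-describing,
# interoperable, and secure. All names and rules are invented for illustration.

@dataclass
class DataProduct:
    name: str           # discoverable: registered under a searchable name
    address: str        # addressable: a stable URI where the data lives
    owner: str          # trustworthy: an accountable domain owner
    schema: dict        # self-describing: a published column -> type mapping
    format: str         # interoperable: an open, widely readable format
    access_policy: str  # secure: who may read it, and how

def validate(product: DataProduct) -> list:
    """Return the list of principles the product fails to satisfy."""
    checks = {
        "discoverable": bool(product.name),
        "addressable": product.address.startswith(("bq://", "s3://", "gs://")),
        "trustworthy": "@" in product.owner,
        "self-describing": len(product.schema) > 0,
        "interoperable": product.format in {"parquet", "avro", "json"},
        "secure": product.access_policy != "",
    }
    return [principle for principle, ok in checks.items() if not ok]

orders = DataProduct(
    name="orders_daily",
    address="bq://sales.orders_daily",
    owner="sales-data-team@example.com",
    schema={"order_id": "STRING", "amount": "NUMERIC"},
    format="parquet",
    access_policy="role:analyst READ",
)
print(validate(orders))  # an empty list means all six checks pass
```

The point of the sketch is that each principle can be made checkable per domain, rather than enforced by one central team.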

In my opinion, a seventh dimension should be added to the data mesh concept: financially responsible and financially accountable. The biggest challenge (and opportunity) of distributed, modern data stacks is the proper allocation of resources (and costs) to domains.

Many will interpret this comment as a “the cloud will cost you more” argument. That’s not what I’m referring to. In fact, I believe that cost should not be evaluated in isolation; it has to be tied to business value: if your company can get more value out of its data faster by investing in a modern (and responsible) data mesh in the cloud, you should invest more.
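A back-of-the-envelope sketch of what “financially responsible” domain accounting could look like: allocate a shared platform bill to domains in proportion to their usage, then compare it to the business value each domain reports. All figures and the allocation rule are invented for illustration.

```python
# Illustrative only: allocate a shared monthly platform bill by each
# domain's share of bytes scanned, then compute a value-to-cost ratio.

platform_bill = 10_000.0  # shared monthly cost, in dollars (made up)

bytes_scanned = {"marketing": 40e12, "finance": 10e12, "logistics": 50e12}
reported_value = {"marketing": 9_000.0, "finance": 1_000.0, "logistics": 30_000.0}

total_scanned = sum(bytes_scanned.values())
for domain, scanned in bytes_scanned.items():
    allocated = platform_bill * scanned / total_scanned
    ratio = reported_value[domain] / allocated
    print(f"{domain}: ${allocated:,.0f} allocated, value ratio {ratio:.2f}x")
```

Seen this way, a domain with a high value ratio is a candidate for more investment, not less — which is the point of tying cost to value rather than judging cost in isolation.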

The biggest problems in this space are not about lack of data or lack of investment. They are about lack of value: according to Accenture, about 70% of organizations still can’t get value from their data.

Don’t get distracted by the hype

If your ultimate goal is to derive business value from data, how does the data mesh concept help you get there? Perhaps one of your biggest challenges this year will be avoiding the buzzword euphoria surrounding the term. Instead, focus on using the data mesh as a means to your ultimate goal.

There are two main concepts to consider:

The data mesh is not the beginning

In a recent post, my friend Andrew Brust noted that “dispersion is a natural state for operational data” and that “the overall corpus of operational data is dispersed. It got that way through optimization, not incompetence.” In other words, the data you need is supposed to be in a distributed state: some of it will be on-premises, some in the cloud, some across multiple clouds. Ask your team: “Have we inventoried all the data we need? Do we know where it all lives?”

Remember that, according to Dehghani’s original paper, for your data mesh to work, your data must be “discoverable, addressable, trustworthy, self-describing, interoperable and secure.” This implies a stage that comes before the data mesh.

I have had the privilege of spending time with many data leaders, and the best description I’ve heard of what that prior stage might be is Vodafone’s “data ocean,” described by Johan Wibergh and Simon Harris. The data ocean concept is broader than the data lake concept. It focuses on giving data teams secure, full visibility into the entire available data estate so they can understand its potential, without necessarily moving the data.
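The “visibility without movement” idea can be sketched as a catalog that records only where datasets live, across clouds and on-premises, rather than copying them into one place. The catalog entries and helper below are invented for illustration; this is not Vodafone’s implementation.

```python
# A toy data-estate catalog: metadata about location and ownership only,
# no data movement. All entries are made up.

catalog = [
    {"dataset": "crm_contacts", "location": "on-prem/oracle", "domain": "sales"},
    {"dataset": "clickstream",  "location": "gcp/bigquery",   "domain": "marketing"},
    {"dataset": "shipments",    "location": "aws/s3",         "domain": "logistics"},
]

def where_is(dataset):
    """Answer 'do we know where it lives?' without touching the data itself."""
    for entry in catalog:
        if entry["dataset"] == dataset:
            return entry["location"]
    return None  # not inventoried yet -- a gap worth closing

print(where_is("clickstream"))  # gcp/bigquery
```

A real inventory would carry far more metadata (schema, owner, sensitivity), but even this minimal shape answers the two questions above: what exists, and where.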

The data mesh is not the end

Now that we have established that a data mesh needs a data foundation to work successfully, let’s look at what a data mesh leads you to. If your goal is to generate value from data, how do you realize the results of your data mesh? This is where data products come into play.

We know that the value of data comes from its use. I’m not referring to simple dashboards here. I’m referring to intelligent, rich data products that trigger actions to create value and protect your people and your business. Think of search engine optimization for your websites, fraud prediction for your bank accounts, or recommendation engines that create the best customer experience in real time.
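The difference between a dashboard and a data product is that the product ends in an action, not a chart. Here is a toy fraud-prediction example in that spirit — the signals, weights, and thresholds are entirely invented:

```python
# Illustrative data product: a fraud score that triggers an action
# instead of just being displayed. Weights and thresholds are made up.

def fraud_score(amount, country_mismatch, night_time):
    """Combine a few transaction signals into a 0..1 risk score."""
    score = 0.0
    score += 0.5 if amount > 1_000 else 0.0      # unusually large amount
    score += 0.3 if country_mismatch else 0.0    # card and IP countries differ
    score += 0.2 if night_time else 0.0          # outside normal hours
    return score

def action(score):
    """Turn the score into an action -- the 'product' part of the data product."""
    if score >= 0.7:
        return "block"
    if score >= 0.4:
        return "review"
    return "approve"

print(action(fraud_score(2_500, country_mismatch=True, night_time=False)))  # block
```

A dashboard would stop at printing the score; the product closes the loop by deciding what happens to the transaction.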

In other words, while the data ocean is the architectural foundation needed to set your data mesh up for success, the data mesh is the organizational model that enables your team to create data products. If every company is a “data company,” its currency is the data products it can ship, repeatably and reliably. This is the concept McKinsey Analytics has called the “data factory.”

What should you worry about?

As you read more about the data mesh concept throughout the year, you will probably hear from three types of people: the disciples, the distractors, and the distorters.

The disciples will encourage you to go back to the original paper or to contact Dehghani directly with your questions. You can also pre-order her book, which is coming out soon.

The distractors will be pundits or vendors who want to label the data mesh concept a fad or an old trend: “Look away!” they will say. “Nothing new here!” Be careful: innovation is relative to your current state. Go back to the original paper and decide for yourself whether the concept is new to you, your team, and your organization.

The distorters will likely be vendors (software, hardware, services) who stand to benefit directly from drawing a straight line from the data mesh paper to their product, solution, or services. Be careful: as my friend Eric Broda explains in his blog on data mesh architecture, “there is no single product that delivers a data mesh to you.”

The best approach, I think, is to engage with practitioners: leaders who have put the theory into practice and are willing to share what they have learned.

