Data fabric versus data mesh: What’s the difference?



As more and more processes move online during the pandemic, businesses are adopting analytics to gain more insight into their operations. According to a 2021 survey by Starburst and Red Hat, 53% of companies believe that data access has become “more complex” throughout the pandemic. The results agree with the findings of ManageEngine, Zoho’s IT division, which in a 2021 poll found that more than 20% of organizations had increased their use of business analytics compared to the global average.

In the Starburst and Red Hat survey, 35% of respondents said they wanted to analyze real-time business risks, while 36% said they were looking for growth and revenue generation through “more intelligent” customer engagement. But underlining the challenges in analytics, more than 37% said they were not confident in their ability to access “timely, relevant data” for decision making, whether because of disparate storage sources or problems developing data pipelines.

Two emerging concepts have been proposed as answers to these hurdles in data analytics and management. One is the “data fabric,” a data integration approach that comprises an architecture, and services running on that architecture, to help organizations orchestrate data. The other is the “data mesh,” which aims to ease data availability challenges by providing a decentralized connectivity layer that lets companies access data from different sources across different locations.

Both data fabrics and data meshes can serve a wide range of business, technical and organizational purposes. For example, they can save data scientists time by automating repetitive data transformation tasks while powering self-service data access tools. Data fabrics and data meshes can also integrate with and augment data management software already in use, increasing cost-effectiveness.

Data fabric

Combining technologies including AI and machine learning, the data fabric is like a weave that stretches to connect data sources, types and locations with methods for accessing the data. Gartner describes it as analytics over “existing, discoverable and inferenced metadata assets” to support the “design, deployment and utilization” of data across local, edge and data center environments.

A data fabric continuously identifies, connects, cleanses and enriches real-time data from different applications to discover relationships between data points. For example, a data fabric might monitor various data pipelines (the sets of actions that ingest raw data from a source and move it to a destination) to suggest better alternatives before automating the most repeatable tasks. A data fabric might also “heal” failed data integration jobs, handle more complicated data management aspects such as creating and profiling datasets, and offer ways to govern and secure data by limiting who can access what data and infrastructure.
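To make the pipeline idea concrete, here is a minimal Python sketch of an extract, transform and load job that retries after a failure. The retry loop is only a simple stand-in for the “healing” described above, not how any particular data fabric product works, and the names run_pipeline, flaky_extract and destination are hypothetical.

```python
def run_pipeline(extract, transform, load, max_attempts=3):
    """Run extract -> transform -> load, retrying the whole job on failure."""
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            rows = extract()                       # ingest raw data from the source
            load([transform(r) for r in rows])     # clean each row, then deliver it
            return {"attempts": attempt, "errors": errors}
        except Exception as exc:                   # record the failure and try again
            errors.append(str(exc))
    raise RuntimeError(f"pipeline failed after {max_attempts} attempts: {errors}")


# Hypothetical source that is unavailable on the first call only.
calls = {"n": 0}

def flaky_extract():
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("source unavailable")
    return [{"name": " Ada "}, {"name": "Grace"}]


destination = []
result = run_pipeline(
    flaky_extract,
    lambda row: {"name": row["name"].strip()},     # trivial cleansing step
    destination.extend,
)
```

In this sketch the first attempt fails, the second succeeds, and the cleaned rows end up in destination. A real fabric would presumably diagnose the failure rather than blindly rerun the job.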

To uncover the relationships between data, a data fabric builds a graph that stores interlinked descriptions of data such as objects, events, situations and concepts. Algorithms can use this graph for different business analytics purposes, such as making predictions and surfacing previously hard-to-find data stores.
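A graph of this kind can be sketched in a few lines of Python. The example below is an illustration under assumptions, not a vendor implementation: the class MetadataGraph, the dataset names and the “concept:” labels are all made up, and a breadth-first search stands in for the algorithms that surface related data stores.

```python
from collections import defaultdict, deque


class MetadataGraph:
    """Toy undirected graph linking data assets and business concepts."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, a, b):
        # Record a relationship in both directions.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def related(self, start, max_hops=2):
        """Return {node: hop distance} for nodes within max_hops of start."""
        seen = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if seen[node] == max_hops:
                continue                      # do not expand past the hop limit
            for neighbor in self._edges[node]:
                if neighbor not in seen:
                    seen[neighbor] = seen[node] + 1
                    queue.append(neighbor)
        del seen[start]                       # the start node is not "related" to itself
        return seen


graph = MetadataGraph()
graph.link("crm.customers", "concept:customer")
graph.link("billing.invoices", "concept:customer")
graph.link("billing.invoices", "concept:revenue")
graph.link("warehouse.shipments", "concept:order")

related = graph.related("crm.customers")
```

Here the invoices dataset is surfaced as related to the customer table because both describe the same concept, while the unconnected shipments dataset is not.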

K2 View, a data fabric solutions vendor, explains: “The data fabric consistently provisions … data based on a 360-view of business entities, such as a certain segment of customers, a line of company products or all retail outlets in a specific geography. Using this data, data scientists create and refine machine learning models, while data analysts use business intelligence to analyze trends, segment customers, and perform root-cause analysis. The refined machine learning model is deployed into the data fabric, to be executed in real time for an individual entity (customer, product, location, etc.), thus ‘operationalizing’ the machine learning algorithm. The data fabric executes the machine learning model on demand, in real time, feeding it the individual entity’s complete and current data. The machine learning output is immediately returned to the requesting application and persisted in the data fabric, as part of the entity, for future analysis.”
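The flow in that description (run a model on demand for one entity, return the result, keep the result with the entity) can be sketched roughly as follows. This is a hypothetical illustration only: EntityFabric, churn_risk and the field names are invented for the example, the “model” is a hard-coded rule rather than a trained one, and nothing here reflects K2 View’s actual product.

```python
class EntityFabric:
    """Toy store that keeps one record per business entity and scores it on demand."""

    def __init__(self, model):
        self._entities = {}
        self._model = model

    def upsert(self, entity_id, **fields):
        # Merge the latest fields into the entity's record.
        self._entities.setdefault(entity_id, {}).update(fields)

    def get(self, entity_id):
        return self._entities[entity_id]

    def score(self, entity_id):
        entity = self._entities[entity_id]
        # Feed the model the entity's current data, excluding earlier outputs.
        features = {k: v for k, v in entity.items() if k != "scores"}
        output = self._model(features)
        # Keep the output with the entity for future analysis.
        entity.setdefault("scores", []).append(output)
        return output


def churn_risk(customer):
    # Hypothetical rule standing in for a trained machine learning model.
    return 0.8 if customer.get("open_tickets", 0) > 2 else 0.1


fabric = EntityFabric(churn_risk)
fabric.upsert("cust-1", plan="pro", open_tickets=3)
risk = fabric.score("cust-1")
```

After the call, the requesting code holds the score and the same value is stored in the customer’s record.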

Data fabrics often work with a range of data types, including technical, business and operational data. Ideally, they are also compatible with many different data delivery “styles” such as replication, streaming and virtualization. In addition, the best data fabric solutions provide robust visualization tools that make their technical infrastructure easy to interpret, enabling companies to monitor storage costs, performance and efficiency, plus security, regardless of where their data and applications reside.

In addition to analytics, a data fabric offers a number of benefits to organizations, including minimizing the disruption caused by switching between cloud vendors and compute resources. A data fabric also allows enterprises, and the data analytics, sales, marketing, network architecture and security teams working at them, to adapt their infrastructure to changing technology requirements, connecting infrastructure endpoints regardless of the location of the data.

In a 2020 report, Forrester found that IBM’s data fabric solution could accelerate data delivery by 60 times while increasing return on investment by 459%. But the data fabric has its downsides, chief among them the complexity of implementation. For example, a data fabric has to expose and integrate different data and systems, which often format data differently. This lack of native interoperability can add friction, such as the need to harmonize and deduplicate data.

Data mesh

Then there is the data mesh, which breaks large enterprise data architectures down into subsystems, each managed by a dedicated team. Unlike the data fabric, which relies on metadata to drive recommendations for things like data delivery, the data mesh leverages the expertise of subject matter experts who oversee the “domains” within the mesh.

“Domains” are independently deployable clusters of related microservices that communicate with users or other domains through a variety of interfaces. Microservices are small, loosely coupled services that can be deployed independently.

Domains typically include code, workflows, a team and a technical environment, and teams working within domains treat data as a product. Clean, fresh and complete data is delivered to any data consumer based on permissions and roles, while “data products” are built to be used for specific analytical and operational purposes.
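As a rough illustration of data being served as a product according to roles, consider the Python sketch below. It assumes a very simple permission model, a mapping from role to readable columns, which is an invention for this example; DataProduct, the “sales” domain and the role names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset published by one domain, with per-role column grants."""
    name: str
    owner_domain: str
    rows: list
    grants: dict = field(default_factory=dict)   # role -> set of readable columns

    def read(self, role):
        allowed = self.grants.get(role)
        if allowed is None:
            raise PermissionError(f"role {role!r} has no access to {self.name}")
        # Return only the columns this role is permitted to see.
        return [{k: v for k, v in row.items() if k in allowed} for row in self.rows]


orders = DataProduct(
    name="orders",
    owner_domain="sales",
    rows=[{"order_id": 1, "total": 40.0, "email": "a@example.com"}],
    grants={
        "analyst": {"order_id", "total"},
        "finance": {"order_id", "total", "email"},
    },
)

analyst_view = orders.read("analyst")
```

The analyst receives the order totals without the email column, and a role with no grant is refused outright. In practice the owning domain team would also be accountable for the freshness and completeness of the rows.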

To add value to a data mesh, engineers must develop a deep understanding of its datasets. They become responsible for serving data consumers and organizing around the domain, that is, testing, deploying, monitoring and maintaining it. Beyond this, they must ensure that the different domains remain connected through a layer of interoperability and consistent data governance, standards and observability.

On the plus side, data meshes promote decentralization, enabling teams to focus on a specific set of problems. They can also bolster analytics by leading with business context rather than jargon-heavy technical knowledge.

But data meshes have their drawbacks. For example, domains may inadvertently duplicate data, wasting resources. The distributed structure of a data mesh, if the mesh is not sufficiently infrastructure-agnostic, may require more technical experts to scale than centralized approaches. And technical debt may increase as domains build their own data pipelines.

Using data mesh and fabrics

When weighing the pros and cons, it is important to keep in mind that data mesh and data fabric are concepts, not technologies, and are not mutually exclusive. An organization can adopt both a data mesh and a data fabric approach in specific departments or across all of them. For James Serra, formerly a big data and data warehousing solution architect at Microsoft, the difference between the two concepts lies in where users are accessing data.

“Both data fabric and data mesh provide an architecture for accessing data across multiple technologies and platforms, but data fabric is technology-focused, while data mesh focuses on organizational change,” he writes in a blog post (via Datanami). “[A] data mesh is more about people and processes than architecture, while a data fabric is an architectural approach that tackles the complexity of data and metadata in a smart way that works well together.”

Eckerson Group analyst David Wells cautions against fixating on the differences, which he argues are far less important than the components needed to achieve business objectives. “They are architectural frameworks, not architectures,” Wells writes in a recent blog post (also via Datanami). “Until the frameworks are adapted and customized to your needs, your data, your processes and your terminology, you have no architecture.”

Suffice it to say that data fabrics and data meshes will remain equally relevant for the foreseeable future. While each involves different elements, they work toward the same goal: bringing greater analytics to an organization with a sprawling and growing data infrastructure.

