6 key trending MDM features to drive business outcomes
Trends in Master Data Management and the use-cases that will build a strong foundation for your data platform.
Is your MDM journey on the right trajectory in the AI-driven world? Are you focused on augmenting your MDM platform with Machine Learning? What about LLMs and MDM relevance? Let's put all of these questions in context and answer them all…
Master Data Management – A ‘Business Trust’ since the early 2000s
Business stakeholders demand clean, meaningful, and trustworthy data that helps them make business decisions with confidence. Master Data Management is a type of data architecture that provides a platform to govern the quality, access, and ownership of data. It helps integrate the right data entities from the master data tool into the respective targets or Lines of Business (LOBs). Targets can be internal to the organization, such as ERP, CRM, HR, Finance, data warehouse, reporting, or BI systems. They can also be external, such as syndication with a data pool in the Global Data Synchronization Network (GDSN; for example, 1WorldSync, GS1-Net, etc.), e-commerce websites, 3rd-party vendors, or trading partners like outsourced manufacturers, packagers, and bottlers.
There are multiple MDM tools and platforms available, and most of them are loaded with master data capabilities. The well-known magic quadrant leaders and niche players in this segment are TIBCO EBX, Informatica, Semarchy, Stibo, Reltio, Profisee, PiLog, and Syndigo. If you are a new buyer, you may consider evaluating TIBCO EBX as a well-rounded MDM tool, both for its capabilities and for its potential to be a single platform for all MDM needs. However, the following content is a tool-agnostic view of MDM trends.
Overall, MDM is a key component of the Data Architecture for your organizations to manage critical data assets that are Parties like B2B or B2C Customers, Suppliers, Vendors or Partners, Legal Entities, or Things like Products, Materials, Services, Assets, Locations, Plants. A good Master Data tool is also capable of managing Enterprise Reference Data like Geographies, Business Codes, Charts of Accounts, and Industry Classification codes. Reference Data is a key, essential data element that can be used with your Master Data.
Along with managing key data assets in a multi-domain aspect, MDM tools should also be multi-vector, i.e., a single tool serving various personas like Data Stewards, Data Approvers, Data Analysts, Data Admins, etc. The key is a single platform that can bring all the Data Stakeholders together so that they can have a common understanding of data, govern the lifecycle of master data, and build trust.
Forward-Looking Trends in MDM Platforms
As an Enterprise Data Architect and a long-time MDM Practitioner, I am often asked about MDM's relevance to today's business needs.
I have seen multiple waves come through: Big Data, Cloud, the decentralization of data driven by Blockchain, and now Artificial Intelligence (AI).
Through all these trends, data has always been a critical element that fuels business needs. Effectively managing the data to solve business problems with the right data architecture is, in my mind, a key to success. Trends will keep coming; some of them stay relevant and enhance the data management experience, and some eventually prove to be irrelevant. One such trend today is the various flavors of Artificial Intelligence (AI), such as Generative AI (GenAI). AI is here to stay, as it has been since the 1950s. Furthermore, I believe AI will only help manage data more effectively to serve business needs.
So, let's focus on the market outlook to determine the trending capabilities that will solidify the usage patterns of MDM in our organizations.
Before we dive into my point of view on the new trends, I want to start with the overall Global Market Outlook for MDM as determined by the ResearchAndMarkets.com report. The report says:
The master data management market size is to grow from USD 16.7 billion in 2022 to USD 34.5 billion by 2027, at a Compound Annual Growth Rate (CAGR) of 15.7% during the forecast period. Various factors such as the rising need for compliance, the increase in the use of data quality tools for data management, and the rising trend towards multi-domain master data management are expected to drive the adoption of master data management technologies and services. 1
OpenAI & Large Language Model (LLM) integration
In the past few months, the attention the tech world has given to OpenAI's GPT (generative pre-trained transformer) models has been a phenomenon, especially the shift from small-scale models to large-scale models. It is natural that software applications and tools like MDM, as well as data providers, would start taking advantage of these AI models by integrating with LLMs and enabling natural language interfaces to exchange and enrich data. Per the Gartner Hype Cycle 2, LLM-based integrations are in their early stages: only early adopters' use cases are in the discovery phase, and very few authoritative reference architectures are available. Responses can be unpredictable and may not be contextualized for a specific use. Moreover, at the time of writing this article, LLM API costs are too high to justify such an integration.
This is an area to watch as LLM integration becomes mainstream. With that context, a possible use case for modern MDM is as follows:
USE-CASE: MDM tools are known to integrate with external sources for various data enrichment needs, building the best version of the data from the information available. Examples include integration with Dun & Bradstreet to enrich B2B organization information, and integration with credit-reporting data providers like Experian, TransUnion, etc., to get accurate data on B2C consumers. Every industry vertical has such leading industry-specific data sources that help enrich master data. When LLMs become more predictable and reliable, they may either serve as additional sources for master data enrichment, or the industry-specific data providers may themselves integrate with LLMs to further evaluate, review, and expand their data for better accuracy. Either way, the capability for MDM to integrate with LLMs, directly or through industry-specific data providers, is likely to be a trend.
Augmented MDM
Buzzwords like Augmented MDM have been trending in data management for the last 2 years. The key, however, is to embed machine-learning (ML) capabilities that drive business value and automate decision-making by training the MDM tool to auto-resolve duplicates, improve data quality, and provide automated recommendations based on ML models. This is why the terms Active and Augmented are used; they are evolutionary stages in the maturity cycle of MDM tools. Using ML on top of traditional profiling and suspect resolution augments human decision-making. What makes MDM active and augmented is the process of retraining the ML model after a human or the MDM tool resolves data conflicts, so that the tool keeps learning. Another term, Driverless MDM, is similar to augmented MDM. A self-driving car can be put in self-driving mode, where the car is in control of driving decisions, and the driver may take control back when needed. The same notion applies to automated decisions in MDM. The point is that these are various steps toward the technological maturity of MDM tools and capabilities. As part of the journey of establishing a firm MDM practice in an organization, the data-architecture team is responsible for working with its MDM vendors to implement such capabilities as and when they become available.
USE-CASE: The most common use case of MDM is the consolidation of data from various sources like CRM, ERP, etc. Entities like Parties exist in a raw form in such sources, with a different representation in each system. The goal is to consolidate them into a single source of truth using MDM capabilities like Match & Merge, mostly driven through a built-in workflow model. To take advantage of automation and scaling, the matching workflow should be able to connect with ML models. This helps resolve duplicates and manage survivorship, leading toward a Golden Record in the MDM tool. Moreover, the tool should provide basic ModelOps capabilities, or be able to integrate with a ModelOps tool, to manage, maintain, and train/re-train models along with their versions, deployment life cycles, etc. This is one aspect of embedding ML models into your MDM workflow to drive business decisions.
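To make the Match & Merge flow above more concrete, here is a minimal sketch in Python. It is not any vendor's implementation. The SourceRecord fields, the 0.85 threshold, and the "most recent non-empty value wins" survivorship rule are all assumptions chosen for illustration, and match_score uses simple string similarity as a stand-in for where a trained ML match model would plug in.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class SourceRecord:
    # Hypothetical shape of a raw Party record coming from a source system.
    source: str    # originating system, e.g. "CRM" or "ERP"
    name: str
    email: str
    updated: str   # ISO date string, so plain string comparison orders it


def match_score(a: SourceRecord, b: SourceRecord) -> float:
    """Stand-in for an ML match model: returns a similarity in [0, 1]."""
    if a.email and a.email.lower() == b.email.lower():
        return 1.0
    return SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()


def cluster(records, threshold=0.85):
    """Group records whose pairwise score reaches the threshold (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if match_score(records[i], records[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(i), []).append(rec)
    return list(groups.values())


def golden_record(group):
    """Assumed survivorship rule: the most recently updated non-empty value wins."""
    newest_first = sorted(group, key=lambda r: r.updated, reverse=True)

    def pick(attr):
        return next((getattr(r, attr) for r in newest_first if getattr(r, attr)), "")

    return {
        "name": pick("name"),
        "email": pick("email"),
        "sources": sorted({r.source for r in group}),
    }
```

In a real tool, the pairwise comparison would be replaced by blocking plus a model call, and low-confidence pairs would be routed to a Data Steward instead of being merged automatically.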
The next step is the Artificial Intelligence aspect of the use case. At this point, the MDM system has an ML model deployed that learns from the decisions a human (Data Steward) or the model itself takes to resolve suspects. Based on that model, MDM should be able to recommend where similar decisions can be applied to other records, to either group or ungroup them in existing or new golden clusters. Such proactive recommendations help you understand the health of your data over time, based on how the system or humans are making decisions through re-trained ML models. The ultimate decision on each proactive recommendation is completely up to the Data Steward. The key is to know what the system is recommending, based on a set of actions or on the re-trained model that is deployed. Building a recommendation system may require an alternate database choice, like a vector database, to efficiently store the vector embeddings that ML models generate from your data 3
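As a rough illustration of the recommendation idea, the sketch below takes a record a Data Steward has just resolved and proposes other records whose embeddings are close to it. The embeddings dictionary, the record ids, and the 0.9 similarity cut-off are hypothetical, and the brute-force scan stands in for the nearest-neighbor query a vector database would provide.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_similar(resolved_id, embeddings, candidates,
                      min_similarity=0.9, top_k=5):
    """Return up to top_k candidate ids, most similar first, that are close
    enough to the resolved record to suggest applying the same decision."""
    anchor = embeddings[resolved_id]
    scored = [
        (cosine(anchor, embeddings[c]), c)
        for c in candidates
        if c != resolved_id
    ]
    scored = [(s, c) for s, c in scored if s >= min_similarity]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```

The function only produces a ranked list; in keeping with the text above, accepting or rejecting each suggestion stays with the Data Steward.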
Continuous Automation in MDM Capabilities
This is as important as the AI/ML capabilities, and AI/ML is often embedded in the automation of MDM processes. We discussed one such use case in the previous section, where the matching workflow takes advantage of AI/ML capabilities. In the same way, other MDM capabilities can also take advantage of automation driven by AI/ML.
USE-CASE: Data within the MDM tool is bound to change. The key is that the MDM capabilities must continuously automate processes so that the changed state of MDM data is brought back in line with a defined desired state. For example, if the MDM inbound jobs receive bad data that makes them fail, the job handlers should be able to take appropriate action based on the desired-state configuration, such as removing the bad data and restarting the jobs. The same applies to data-quality processes, to ensure the health of the data meets the desired-state configuration. Another important feature of MDM is classification, or the hierarchical representation of data. Changes to the state of data should trigger automated re-alignment of classifications or hierarchies.
If you are familiar with Kubernetes, here is a quick analogy: when the desired number of Pod replicas is declared in a manifest (for example, for a Deployment), the job of the controller's reconciliation loop is to keep working so that the actual number of running Pods matches the desired state.
Similarly, the innovation in MDM tools can be to let you define your data quality scores, your active data classification rules, how your data links are created, and how your data is actively consolidated as delta loads arrive from your source systems. In short, there needs to be a definition manifest where Data Stewards can specify the desired state, and the MDM tool ensures through continuous automation that the actual state matches it. The manifest definition can be driven through an approval process, so there is governance and change management over how the desired state is defined.
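The desired-state idea can be sketched as a small reconciliation pass. Everything here is an assumption for illustration: the DesiredState manifest fields, the quality score defined as the share of required fields that are populated, and the quarantine/release actions. No MDM product is known to expose exactly this interface.

```python
from dataclasses import dataclass, field


@dataclass
class DesiredState:
    """Hypothetical manifest a Data Steward would define and get approved."""
    required_fields: tuple = ("name", "country")
    min_quality_score: float = 1.0  # share of required fields that must be filled


@dataclass
class MdmState:
    active: dict = field(default_factory=dict)       # record id -> attributes
    quarantined: dict = field(default_factory=dict)  # records held back


def quality_score(record, desired):
    """Fraction of required fields that carry a non-empty value."""
    filled = sum(1 for f in desired.required_fields if record.get(f))
    return filled / len(desired.required_fields)


def reconcile(state, desired):
    """One pass of the loop: compare actual with desired state, correct the
    drift, and return the list of actions that were taken."""
    actions = []
    for rec_id, rec in list(state.active.items()):
        if quality_score(rec, desired) < desired.min_quality_score:
            state.quarantined[rec_id] = state.active.pop(rec_id)
            actions.append(("quarantine", rec_id))
    for rec_id, rec in list(state.quarantined.items()):
        if quality_score(rec, desired) >= desired.min_quality_score:
            state.active[rec_id] = state.quarantined.pop(rec_id)
            actions.append(("release", rec_id))
    return actions
```

Running reconcile on a schedule, or after every delta load, gives the same property as the Kubernetes analogy: when the actual state already matches the manifest, the pass takes no action.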
Knowledge Graphs & Multi-Domain MDM Approach
While knowledge graphs are not a new trend or technology, I have seen many MDM vendors add graph visualization capabilities to their MDM tools. However, I have not seen many organizations making great use of them. An exception is some pharma companies, which rely on knowledge graphs to analyze the relationships between critical data elements like substances and medicinal products.
Knowledge graphs are the future; in fact, in my opinion, it is a bit late to call them the future. With the amount of data organizations are managing, alternate databases like graph databases 4 will make knowledge graphs easier to implement, helping determine the right data connections and reveal patterns that have never been explored before. This will help drive business more effectively. When we discuss data-driven businesses, it is usually in the context of analytics and reports based on the data. Knowledge graphs, however, will help improve business agility, which can have a cascading effect throughout the organization.
Most MDM tools depend on either relational databases or some proprietary semantic model. Both of these approaches limit data traversal and are not optimized for visualizing data the way it is meant to be visualized. MDM vendors have already accepted that there needs to be a better way of representing data, through nodes, edges, and relationship properties instead of rows and columns.
USE-CASE: As MDM implementations expand from a single domain to multiple domains, connectivity through the graph becomes most important. Here is an example from my experience implementing a multi-domain solution for a QSR (Quick Service Restaurant) chain: the B2C Customer data model was a collection of customers, and the other domains were Location, Products, and Menu. With a complete implementation of a multi-domain MDM solution and knowledge graphs, the organization was able to connect key data points to drive digital marketing campaigns. Examples include customers linked within a household, the locations where customers prefer to dine in, and the menu items and products that are popular at different locations. These are critical insights to make available within the MDM system.
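The kind of cross-domain traversal described in this use case can be sketched with a tiny in-memory graph. This is only an illustration of nodes, edges, and relationship types; the KnowledgeGraph class and the relationship names HAS_MEMBER, PREFERS, and SELLS are made up for this example, and a real implementation would use a graph database and its query language.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph: nodes are (domain, key) pairs and every edge
    carries a relationship type."""

    def __init__(self):
        self._out = defaultdict(list)  # node -> [(relationship, target node)]

    def link(self, source, relationship, target):
        self._out[source].append((relationship, target))

    def neighbors(self, node, relationship):
        return [t for r, t in self._out[node] if r == relationship]


def household_locations(graph, household):
    """Preferred dine-in locations of every customer linked to a household."""
    locations = set()
    for customer in graph.neighbors(household, "HAS_MEMBER"):
        locations.update(graph.neighbors(customer, "PREFERS"))
    return locations


# Usage: connect the Customer, Location, and Menu domains.
graph = KnowledgeGraph()
household = ("Household", "H1")
alice, bob = ("Customer", "alice"), ("Customer", "bob")
downtown, airport = ("Location", "downtown"), ("Location", "airport")

graph.link(household, "HAS_MEMBER", alice)
graph.link(household, "HAS_MEMBER", bob)
graph.link(alice, "PREFERS", downtown)
graph.link(bob, "PREFERS", airport)
graph.link(downtown, "SELLS", ("MenuItem", "burger"))

campaign_targets = household_locations(graph, household)
```

A two-hop question like "which locations matter to this household" is a single traversal here, whereas in a row-and-column model it would typically need joins across the domain tables.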
In conclusion, there are two more points as trends for the MDM Vendors and MDM Service Providers. In fact, new buyers may appreciate this insight as well.
MDM Deployment Strategies
Even though the tech world is prioritizing transformation initiatives that reduce the on-prem data-center footprint and move workloads to the cloud, the above-mentioned ResearchAndMarkets.com report points out that many MDM implementations still prefer on-prem deployment.
I have seen MDM product companies launch SaaS MDM (single-tenant and multi-tenant SaaS); however, the adoption of enterprise SaaS MDM has not been as strong as expected. Data privacy and governance concerns are slowing down the migration of critical enterprise data assets to the cloud. The choice of MDM deployment strategy will continue to depend on each organization's preferred approach, based on its own evaluation and fit.
As a consultant, my recommendation is to choose MDM as a service (SaaS) or an MDM vendor offering managed services. This abstracts away infrastructure and application management and lets you focus on the MDM implementation, which can be a journey in itself.
MDM Adoption through Industry-Specific Solutions
Buyers looking to invest in MDM tools always appreciate pre-sales demos and proof points that are contextualized to their industry. This helps expedite the buying process. Similarly, MDM customers appreciate the ramp-up and acceleration that an MDM implementation vendor or professional services provider can offer through packaged MDM industry solutions, which help with faster value realization and ROI.
In my journey as an MDM Practitioner and Go-To-Market Strategist, my team has launched several such industry-specific packaged solutions and successfully implemented MDM with faster time to value. We have noticed that this helps with faster adoption and shortens time to deployment. My recommendation to new buyers is to evaluate the ease of implementation for their industry. MDM vendors should consider this a standard offering to expedite time to deployment.
To end, MDM will continue to be the trusted source of golden data for organizations, whether they integrate internal systems with MDM data or publish it to external trading partners. I hope this discussion helps with your journey of adopting new trends in implementing MDM, so that it drives business outcomes and continues to be a strategic platform for business users.