Secrets of Operational MDM – Part 3 : Consuming Systems

In my previous posts, Secrets of Operational MDM – Part 1 : Choosing System Behaviors and Secrets of Operational MDM – Part 2 : Contributing Systems, we categorized systems connected to our MDM environment as Consumers, Managers and/or Creators of master data. We also looked at how these systems should contribute to an MDM environment. Now let’s look at how these systems should consume data from MDM.

For consumption there is another very important system characteristic to consider: whether the consuming system can handle merging of records.  While the ability to group, “merge” and “unmerge” records is one of the key benefits of MDM, it is not necessarily a strength or need of operational systems.  The real power of an MDM tool is supporting integration on either the merged (consolidated) or the unmerged (unconsolidated) record.

So two additional categories to introduce are merge aware and non-merge aware systems.

A merge aware system can be informed that multiple entities it contains are actually duplicates and will correspondingly merge those entities in its own operational store.  Be cautious here: errors in merging are possible, so it is important to understand and design for the unmerge case as well.

For many systems it is neither practical nor appropriate to be merge aware.  Operational systems manage their own independent transactional data.  If a duplicate mastered entity ends up being created and some transactions are recorded against both entities, it may be cumbersome, and may even violate some business rules, to force these two entities to merge. So in many cases it is vital that these systems can still find their own independent records.

To best support both merge aware and non-merge aware integration, it is best to have the following IDs:

    • Consolidated ID that uniquely identifies any existing merged entity
      • Note that a given Consolidated ID will disappear if it is merged and re-appear if unmerged
    • Global ID that uniquely identifies every record that is loaded in the MDM
      • Note that a Global ID is managed by MDM, is independent of the contributing source and should never change
    • Source Key, a source-specific key that uniquely identifies records contributed to the MDM
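To make the relationship between the three IDs concrete, here is a minimal in-memory sketch. It is not how any particular MDM product works; the class name MdmHub, the "G"/"C" ID formats and the method names are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class MasterRecord:
    global_id: str      # assigned by the hub on load, never changes
    source_system: str  # the contributing system
    source_key: str     # that system's own key for the record


class MdmHub:
    """Toy hub (hypothetical): tracks which Consolidated ID each Global ID rolls up to."""

    def __init__(self):
        self._next_global = count(1)
        self._next_consolidated = count(1)
        self.records = {}          # global_id -> MasterRecord
        self.consolidated_of = {}  # global_id -> current consolidated_id
        self.retired = {}          # retired consolidated_id -> global_ids it held

    def load(self, source_system, source_key):
        """Register a contributed record; it starts as its own consolidated entity."""
        gid = f"G{next(self._next_global)}"
        self.records[gid] = MasterRecord(gid, source_system, source_key)
        self.consolidated_of[gid] = f"C{next(self._next_consolidated)}"
        return gid

    def merge(self, survivor_cid, duplicate_cid):
        """Fold one consolidated entity into another; the duplicate's ID disappears."""
        moved = [g for g, c in self.consolidated_of.items() if c == duplicate_cid]
        for gid in moved:
            self.consolidated_of[gid] = survivor_cid
        self.retired[duplicate_cid] = moved

    def unmerge(self, retired_cid):
        """Undo a merge; the retired Consolidated ID re-appears with its old members."""
        for gid in self.retired.pop(retired_cid):
            self.consolidated_of[gid] = retired_cid
```

The point of the sketch is that only the Consolidated ID moves during merge and unmerge. The Global ID and the Source Key on each record stay fixed throughout, which is what makes them safe anchors for systems that cannot follow merges.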

Consumers

Merge Aware: Systems that only consume master data and can accommodate merging of mastered entities simply need to store a Consolidated ID for each master entity and register to receive notification of any merge / unmerge.  Covering the unmerge case usually requires keeping the Global IDs as well.

Non-Merge Aware: In this case the consuming system still needs an ID, but since it will not be updated in the case of a merge, using a Global ID is the best practice.
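As a rough sketch of what a merge aware consumer has to do, the class below keeps its entities keyed by Consolidated ID and reacts to merge / unmerge notifications. The class name, the handler names and the shape of the notifications are assumptions for illustration only; real MDM tools publish these events in their own formats.

```python
class MergeAwareConsumer:
    """Hypothetical consumer that stores entities by Consolidated ID."""

    def __init__(self):
        # consolidated_id -> set of Global IDs rolled up under it
        self.entities = {}

    def upsert(self, consolidated_id, global_id):
        """Record that a Global ID currently belongs to a consolidated entity."""
        self.entities.setdefault(consolidated_id, set()).add(global_id)

    def on_merge(self, survivor_cid, retired_cid):
        """Merge notification: fold the retired entity into the survivor."""
        moved = self.entities.pop(retired_cid, set())
        self.entities.setdefault(survivor_cid, set()).update(moved)

    def on_unmerge(self, restored_cid, global_ids):
        """Unmerge notification: split the listed Global IDs back out."""
        for members in self.entities.values():
            members.difference_update(global_ids)
        self.entities[restored_cid] = set(global_ids)


consumer = MergeAwareConsumer()
consumer.upsert("C1", "G1")
consumer.upsert("C2", "G2")
consumer.on_merge("C1", "C2")       # C2 disappears, G2 now sits under C1
consumer.on_unmerge("C2", ["G2"])   # C2 re-appears with G2
```

Without the stored Global IDs the consumer would have no way to know which of its data belongs to the restored entity. A non-merge aware consumer, by contrast, would simply key its rows on the Global ID and ignore both notifications.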

Managers

Merge Aware: Systems that manage master data and can accommodate merging of mastered entities simply need to store the Consolidated ID and the Source Key for each record they contributed.  These systems will also need to receive notification of any merges / unmerges.  Since managers have their own IDs for their contributed records, these can be used to resolve unmerges from MDM.

Non-Merge Aware: In this case the managing system could just use its own Source Key.  However, it is a good idea to use a Global ID, which would be consistent across sources.  This is particularly helpful when data from these systems is brought into a data lake / warehouse.

Creators

In all cases any system that can create master data should first search the MDM environment prior to actually creating a record.  This “search before create” simply reduces the number of potential duplicates and the amount of manual entry. This is true of both merge aware and non-merge aware systems.
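A bare-bones sketch of search before create might look like the following. The function names and the dictionary standing in for the MDM search service are assumptions, and the exact match on a normalized name is only a placeholder: a real MDM search would normally use the tool’s own fuzzy matching.

```python
def normalize(name):
    """Crude match key: lower-case and collapse whitespace (placeholder for real matching)."""
    return " ".join(name.lower().split())


def search_before_create(mdm_index, name, create):
    """Look the entity up first; only call `create` when nothing matches.

    mdm_index: dict of match key -> Global ID (stand-in for an MDM search call)
    create:    callable that creates the record and returns its new Global ID
    Returns (global_id, was_created).
    """
    key = normalize(name)
    existing = mdm_index.get(key)
    if existing is not None:
        return existing, False
    global_id = create(name)
    mdm_index[key] = global_id
    return global_id, True
```

The second caller asking for the same entity gets the existing Global ID back instead of a new duplicate, which is the whole benefit of the pattern.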

Other than the addition of search before create, creators should consume data from MDM environments the same way that managing systems do.

One of the major and costly errors I’ve seen in implementations is assuming that systems should always want to integrate at the consolidated level.  If consolidated-level integration does end up being desired, make certain to validate update and unmerge behavior.

In terms of making a consuming application merge aware or non-merge aware, the guiding principle needs to be the operational benefit. Getting insight based on merging master data entities is very important. But a great MDM environment goes beyond this and also allows systems to be operationally efficient and work on unconsolidated data as desired.

Secrets of Operational MDM – Part 2 : Contributing Systems

In my previous post Secrets of Operational MDM – Part 1 : Choosing System Behaviors, we categorized systems connected to our MDM environment as Consumers, Managers and Creators of master data. We know that systems often have different sets of behaviors for different entities, and can have multiple behaviors for a single master data entity. For example, one system may consume and manage customer data but create and manage product data. These behaviors will guide the best way to integrate these systems. In particular, we will look at how each system can best contribute master data into the MDM environment. We’re not looking at the data model, but rather understanding which systems will contribute records to be mastered.

Consumers – Systems that only consume master data without the ability to create or edit master data, by design, do not contribute any records to your MDM tool. Since these systems do not provide any new master data information, they would not be used to contribute any master data entities. Bringing data into MDM that is consumed by these systems would just add to record volumes without contributing any new information. These systems often do have transactional data that is important, but that data can be integrated outside the MDM environment.

Managers – Some systems manage master data but do not actually create new master data entities. They will also consume master data, since they’d have nothing to manage otherwise :). From these systems you should bring into MDM all master data entities actually edited or augmented by these systems. If possible, avoid bringing in records that the managing system has simply consumed and not touched. This, incidentally, is an example of a system exhibiting multiple behaviors (consumption and management) for a single entity type.

Creators – Finally, there are systems that will actually create new master data entities. Usually these will also manage/edit entities, and will sometimes consume entities. As with managing systems, master data entities that are created or updated by these systems should be added to your MDM tool—but any entities simply consumed should not be contributed. For the creation of new entities in these systems, adding a search before create capability will avoid the creation of unnecessary duplicate records.
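The contribution rule across the three behaviors can be sketched as a simple filter. This assumes each system can tell, per record, whether it created, edited or merely consumed it; the Touch enum and the "touch" field are hypothetical names, and many systems will need audit columns or change logs to derive this.

```python
from enum import Enum


class Touch(Enum):
    CONSUMED = "consumed"  # received from MDM and never changed locally
    EDITED = "edited"      # edited or augmented by this system
    CREATED = "created"    # originated in this system


def records_to_contribute(records):
    """Keep only records this system created or edited; skip untouched consumed ones."""
    return [r for r in records if r["touch"] in (Touch.EDITED, Touch.CREATED)]
```

A pure consumer would return an empty list here, a manager would return only its edited records, and a creator would return both its created and its edited records.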

One additional note: editing data takes effort, so it is very unusual for operational systems to simply make “bad” edits to master data entities; data is typically only edited to support some specific need. Of course sometimes these needs are not shared, and other systems may see this as “bad” data. However, the ability to see all the operational versions of master data is almost always helpful. This is often truest for the systems that have edited/impacted master data the most, even if data from those systems isn’t “trusted”. Configuring survivorship/trust rules in MDM allows this data from managers and creators to be brought into MDM in order to get the most value from the data while preventing any undesired changes to the trusted master data.
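As an illustration of the idea, here is a very small survivorship sketch in which each source has a numeric trust score and, per attribute, the non-empty value from the most trusted source wins. The scoring scheme, the function name and the sample data are assumptions; actual MDM tools offer much richer rules (recency, per-attribute trust, validation and so on).

```python
def survive(versions, trust):
    """Build a trusted record from per-source versions.

    versions: dict of source name -> dict of attribute -> value
    trust:    dict of source name -> numeric trust score (higher wins)
    """
    golden = {}
    # Apply sources from least to most trusted so trusted values overwrite.
    for source in sorted(versions, key=lambda s: trust.get(s, 0)):
        for attr, value in versions[source].items():
            if value not in (None, ""):  # empty values never overwrite
                golden[attr] = value
    return golden


versions = {
    "CRM": {"name": "Acme Corp", "phone": ""},
    "Billing": {"name": "ACME", "phone": "555-0100"},
}
trust = {"CRM": 90, "Billing": 40}
golden = survive(versions, trust)
# golden == {"name": "Acme Corp", "phone": "555-0100"}
```

Note that the less trusted Billing version is still loaded and still supplies the phone number; it simply cannot override the name coming from the more trusted source.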

The power of MDM is that it allows each operational system to keep data fit for a specific purpose, while enabling sharing the same entities across different systems. To do this, you really want each and every version of your master data entity to be available. The best news is that if you get this right, it will also improve business intelligence at the same time. Don’t forget insights are great, but actions are better. With proper configuration of operational MDM, insightful actions become possible.

The next and final installment of Secrets of Operational MDM will look specifically at how operational systems should consume data from your MDM environment.