While ‘big data’ may appear to be the skeleton key that unlocks everything you want to know about your business, there is more than meets the eye when it comes to understanding your data. Yes, clean data will unlock incredible value for your enterprise; inaccurate records, on the other hand, are a significant burden on your team’s productivity.
This is why we all seek the “Golden Record”.
The Golden Record is the ultimate prize in the data world. It is a fundamental concept within Master Data Management (MDM), defined as the single source of truth: one record that captures all the information we need to know about a member, a resource, or an item in our catalogue, and that is assumed to be 100% accurate.
Its power is undeniable. However, where we have multiple databases, it is hard to work out how to achieve such perfection. Before we try, we must first understand the benefits of a golden record.
Why is the Golden Record Important?
Duplicated records – the bane of the Master Data Management world.
To illustrate the complexity of data duplication, we will use the example of a typical customer of a retailer – ‘John Smith’.
A shop may record the customer as ‘J Smith’ when delivering an e-receipt, so the company’s in-store CRM will show one customer with ‘J Smith’ in the name column. Separately, John registers an online account on the same company’s ecommerce website. Rather than signing up as ‘J Smith’, he is more likely to enter his name as ‘John Smith’.
When merging these two databases, ‘J Smith’ and ‘John Smith’ will, without intervention, become two discrete records, and the customer is duplicated. In reality, we know this is one person, and our systems should reflect this for accurate reporting and for more accurate marketing and sales records.
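To make the ‘John Smith’ example concrete, here is a minimal Python sketch of why an exact comparison misses the duplicate and how a simple normalised key can flag it as a candidate. The rule used (same surname and same first initial) and the function names are assumptions for illustration only; they are not how any particular MDM product matches records.

```python
def name_key(full_name: str) -> tuple[str, str]:
    """Reduce a name to (surname, first initial), lower-cased."""
    parts = full_name.replace(".", " ").split()
    if not parts:
        return ("", "")
    return (parts[-1].lower(), parts[0][0].lower())


def is_candidate_duplicate(name_a: str, name_b: str) -> bool:
    """True when two names share a surname and a first initial."""
    return name_key(name_a) == name_key(name_b)


crm_name = "J Smith"      # as recorded in the in-store CRM
web_name = "John Smith"   # as entered on the ecommerce site

# An exact comparison treats these as two different customers.
exact_match = crm_name == web_name

# The normalised key flags them as a possible duplicate.
candidate = is_candidate_duplicate(crm_name, web_name)
```

A rule this loose only produces candidates: ‘Jane Smith’ would be flagged against ‘J Smith’ too, so in practice further fields would need to agree before two records are treated as the same person.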
When building databases from disparate sources, we often run into the issue of duplication. Whether resulting from incomplete entries, changes that occur over time or some other reason, this is a significant issue for any enterprise that relies on vast volumes of information.
As you may imagine, if we were to expand the ‘John Smith’ example to include hundreds of thousands of names, the overhead of duplication grows with it, and every process drains an increasing volume of resource.
If we manage to compile a single entry, however – the “Golden Record” – every process becomes far more efficient, and we can begin to leverage the data at our fingertips.
How to Build the Golden Record?
The complexity of implementing a Master Data Management solution stems from defining the workflow that will connect our disparate data sets.
First, we have to identify every data source that feeds into the dataset. Then, for each field, we must consider which source we find to be the most reliable. Finally, we must define the criteria that will determine when the data from one source should overwrite conflicting data from a secondary source in our MDM system.
How to Merge and Match Records?
The critical question you will face in any MDM solution is how to merge and match apparently duplicate records.
In a case of duplication, we have two seemingly similar entries, so how do we create the single golden record? If the entries overlap but differ in specific fields, such as a changed postcode, a single automatic match-and-merge rule is not enough.
In such instances, the system must review the source of each field. If the first source is deemed more reliable for the postcode, whereas the second source is more reliable for the name and phone number, then we define rules that instruct the system to take each field from its preferred source.
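The field-by-field rule described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the behaviour of any specific MDM tool: the source names `source_a` and `source_b`, the precedence table, and the sample values are all hypothetical.

```python
# Hypothetical precedence table: for each field, sources are listed
# from most to least reliable.
FIELD_PRECEDENCE = {
    "postcode": ["source_a", "source_b"],
    "name": ["source_b", "source_a"],
    "phone": ["source_b", "source_a"],
}


def merge_records(records_by_source, precedence=FIELD_PRECEDENCE):
    """Build one record by taking each field from its preferred source.

    Falls back to the next source when the preferred one has no value.
    """
    golden = {}
    for field, sources in precedence.items():
        for source in sources:
            value = records_by_source.get(source, {}).get(field)
            if value:  # skip missing or empty values
                golden[field] = value
                break
    return golden


# Made-up sample data for the two matched entries.
records = {
    "source_a": {"name": "J Smith", "postcode": "AB1 2CD", "phone": ""},
    "source_b": {"name": "John Smith", "postcode": "ZZ9 9ZZ",
                 "phone": "01234 567890"},
}

golden_record = merge_records(records)
```

Keeping the precedence in a table rather than in the merge logic means the rules can be reviewed and changed per field without touching the code that applies them.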
Most MDM solutions offer effective merge functionality. So, you could define the above criteria for the system to review records and, where necessary, carry out the appropriate merge process.
Inevitably, problems with data quality still arise, particularly where we lack a reliable source system for specific fields, such as date of birth. In these cases, a workflow management toolkit can help.
The toolkit will assign inconsistent records to a data steward for human review, so they can either follow up discrepancies or use past experiences to inform their decision.
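As a rough sketch of that routing step, the Python below sends a matched pair to a review queue when a field with no trusted source conflicts, and otherwise lets it proceed to an automatic merge. The field list, the queue, and the function names are assumptions made for this example; a real workflow toolkit will have its own interfaces.

```python
from collections import deque

# Fields for which no source is trusted (an assumption for this sketch).
REVIEW_FIELDS = {"date_of_birth"}


def find_conflicts(record_a, record_b, fields=REVIEW_FIELDS):
    """List review fields where both records hold different values."""
    conflicts = []
    for field in sorted(fields):
        a, b = record_a.get(field), record_b.get(field)
        if a is not None and b is not None and a != b:
            conflicts.append(field)
    return conflicts


def route(record_a, record_b, review_queue):
    """Queue conflicting pairs for a data steward; otherwise auto-merge."""
    conflicts = find_conflicts(record_a, record_b)
    if conflicts:
        review_queue.append(
            {"records": (record_a, record_b), "conflicts": conflicts}
        )
        return "steward_review"
    return "auto_merge"


steward_queue = deque()

# Made-up sample data: the two entries disagree on date of birth.
decision = route(
    {"name": "J Smith", "date_of_birth": "1980-01-01"},
    {"name": "John Smith", "date_of_birth": "1980-10-01"},
    steward_queue,
)
```

Here the pair is held for review instead of being merged, so the steward's decision can be applied before the final merge rules run.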
Further rules can be put in place to manage the final merge of the revised information, meaning we preserve the overall integrity of our Golden Records.
For more information on managing your data, call us on +44 (0) 3333 1111 00 or email us at email@example.com.