Is MDM the answer to bad data quality? (Discussion topic 1 for upcoming TDWI DQ event)
Is MDM the answer to bad data quality?
On the 28th of February 2011, TDWI in Finland is organizing a Data Quality seminar with yours truly as the speaker (event link: http://www.tdwi.fi/tapahtumia-events/). Apart from the main presentation, we will also have two round table discussions. The idea is to spark interesting discussions based on your experience and knowledge of your data-centric business challenges. Please feel free to start the discussion by commenting on this post, based on the question above!
Best Regards
Dario Bezzina (http://www.affecto.com & http://www.betterdataquality.com)
There is currently a strong trend to invest heavily in MDM, and these projects tend to become very large, heavy systems investments that almost begin to resemble the huge SAP projects we see in the market. But will this solve the data quality problems? I would say that it depends on how the problem is attacked and where the focus lies. The good news is that it draws attention to the fact that data quality is an issue that affects a company's profitability, so that it gets on the agenda and someone takes ownership of data quality. So far, so good. The only question is whether an effective system environment solves all problems. In my opinion, too many MDM projects fail because they rely on system performance and do not realize that you have to wash the data against quality-assured reference databases for business, product and service information. Otherwise it is like sorting dirty laundry in the laundry room and believing that it is clean, although it has not been run through the washing machine.
There is no doubt that we need MDM in order to fight bad data quality.
An ever-recurring subject in data quality is whether we can establish a single version of the truth.
The most prominent example is whether an enterprise can implement and maintain a single version of the truth about business partners, be they customers, prospects, suppliers and so on. The same goes for product master data.
In the quest for establishing that (fully reachable or not) single version of the truth, we use entity resolution techniques such as data matching, and we are exploiting ever-increasing sources of external reference data.
However, I often find that, whatever is possible in aiming for that (fully reachable or not) single version of the truth, I am limited by the practical possibilities for storing it.
We need good MDM hubs with flexible models for that.
From a conceptual standpoint implementing a comprehensive MDM program should address a number of data quality issues, especially as it requires an investment in assessing the structural and semantic differences that exist within the organization and that seem to be the root cause of a lot of inconsistency and inaccuracy when comparing data sets in different lines of business. However, a lot of faith is placed in initiating small MDM projects that essentially build new (and separate) data silos to be used as a “source of truth.” These data sets are just as susceptible to errors, and in fact not addressing some of those same core issues (structural and semantic inconsistency) can lead to a false sense of security in trusting these master data silos.
So I guess my view is that trying to solve data quality problems by justifying an MDM task may not achieve what you are looking for in terms of improved data usability.