Have you ever thought it through? Have you ever tried to implement one? I was happy to be part of a team at a large retailer implementing a Data Quality Management system.
In principle it is very simple:
But the problems already start with the very first bubble, "Measure the current Data Quality". What is Data Quality? What are the KPIs (Key Performance Indicators) to measure Data Quality?
First of all, you really have to define what Data Quality means to you. And that depends on the business processes which use the product data and on the requirements those processes have.
In my case, you can probably guess that the core processes are purchasing and logistics, which were the main drivers at the retailer where I was working on the Data Quality Management system.
And thankfully, our team was not the very first one that had to implement a Data Quality Management system for a retailer. We did a little research and very quickly found the "Data Quality Framework (DQF)", which is compiled by GS1 to help companies get ready for GDSN.
The DQF is very helpful for implementing a Data Quality Management system for product information, as long as that information is mainly targeted at supporting business processes in purchasing and logistics. If your requirements go beyond that, you have to extend it.
The DQF consists of KPI definitions, a guide on how to implement a Data Quality Management System, and a self-assessment procedure.
Most interesting to us were the KPI definitions:
- Overall item accuracy: Ok, this indicator is not that surprising - it is the percentage of items that have correct attribute values for all attributes in scope.
- Generic attribute accuracy: This one is more interesting. It describes the percentage of items that have correct values for a set of more generic attributes. GS1 defines the following attributes as in scope for this KPI:
- GTIN - this is the key for identifying an item, so it really is key ;-)
- Classification Category Code - as the classification code is relevant for reporting, it is a very important attribute in the retail industry
- TradeItemDescription - to me this is a really difficult attribute in retail. At every retailer I have been at so far, the buyers insisted that item descriptions are a means to differentiate from competitors and therefore have to be handcrafted at the retailer, or at least have to comply with the retailer's rules for how a description has to be built. Just as a side note - I think that is wrong and item descriptions in no way drive revenue, but I might be wrong here. Therefore we decided to leave that attribute out of our reporting.
- Net Content - important for shelf tags and therefore one of the really important pieces of information
- Dimension and weight accuracy: Depth, Width, Height, and Gross Weight are the key attributes here. Those attributes are not only key for distribution centers but also for transport and route planning, and therefore have a very strong and immediate impact on logistics.
- Hierarchy accuracy: This is absolutely relevant, because different business processes use different units of the same item. E.g. you might order an item at pallet level, but your stores order it from your distribution center at case level or even at each level. If you do not have the packaging hierarchy correct, you are in serious trouble! (A small sketch of such a check follows this list.)
- Active / Orderable: You should not order item units which are not active at your supplier, or units which are simply not orderable. This immediately disrupts every automated, electronic process and therefore has to be avoided.
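To make the last two KPIs a bit more concrete, here is a minimal sketch of such a consistency check in Python. The item model (gtin, unit_level, child_gtin, child_quantity, is_active, is_orderable) is my own illustrative simplification, not GS1 terminology:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class ItemUnit:
    gtin: str                  # identifies this packaging unit
    unit_level: str            # e.g. "EACH", "CASE", "PALLET"
    child_gtin: Optional[str]  # GTIN of the next lower unit, None for EACH
    child_quantity: int        # number of child units contained in this unit
    is_active: bool            # unit is active at the supplier
    is_orderable: bool         # unit may be ordered

def hierarchy_findings(units: Dict[str, ItemUnit]) -> List[str]:
    """Check the hierarchy and active/orderable KPIs for one item family."""
    findings = []
    for unit in units.values():
        if unit.child_gtin is not None:
            if unit.child_gtin not in units:
                findings.append(f"{unit.gtin}: links to unknown child GTIN {unit.child_gtin}")
            elif unit.child_quantity < 1:
                findings.append(f"{unit.gtin}: implausible child quantity {unit.child_quantity}")
        if unit.is_orderable and not unit.is_active:
            findings.append(f"{unit.gtin}: orderable but not active at the supplier")
    return findings
```

An item family with no findings counts as accurate for these two KPIs; the percentage of clean families across the assortment is then the reported figure.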
So with those KPIs you cover pretty much all the requirements of the business processes in purchasing and logistics.
But the question now is: How to measure accuracy for those attributes?
A retailer can take two approaches:
- Compare your own data to the data provided by the supplier.
- Do a self-assessment: go to your DC and actually measure the physical products to gain that information.
In our project we are doing both. We have implemented a system that compares the supplier data to our data according to the above KPIs on an ongoing basis. As the supplier data provided through the data pool does not cover 100% of the business, we also calculate how much of the business is covered by this report.
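For illustration, here is a minimal sketch of how such an ongoing comparison could be calculated. It assumes both sides are available as dictionaries keyed by GTIN and that the revenue per item is known; the attribute names and the 5% tolerance for physical measurements are my own assumptions, not values taken from the GS1 DQF:

```python
from typing import Dict, List

GENERIC_ATTRS = ["classification_code", "net_content"]        # assumed names
MEASURE_ATTRS = ["depth", "width", "height", "gross_weight"]  # assumed names
TOLERANCE = 0.05  # assumed 5% relative tolerance for physical measurements

def attrs_match(ours: Dict, theirs: Dict, attrs: List[str]) -> bool:
    """Exact match for the generic attributes."""
    return all(ours.get(a) == theirs.get(a) for a in attrs)

def measures_match(ours: Dict, theirs: Dict) -> bool:
    """Dimensions and weight match within the relative tolerance."""
    for a in MEASURE_ATTRS:
        reference = theirs.get(a)
        if not reference or abs(ours.get(a, 0) - reference) / reference > TOLERANCE:
            return False
    return True

def kpi_report(our_items: Dict, supplier_items: Dict, revenue: Dict) -> Dict:
    """Accuracy KPIs over the items present on both sides, plus coverage."""
    common = our_items.keys() & supplier_items.keys()
    if not common:
        return {}
    generic_ok = sum(attrs_match(our_items[g], supplier_items[g], GENERIC_ATTRS)
                     for g in common)
    measure_ok = sum(measures_match(our_items[g], supplier_items[g])
                     for g in common)
    covered = sum(revenue.get(g, 0) for g in common)
    total = sum(revenue.values()) or 1
    return {
        "generic_attribute_accuracy": generic_ok / len(common),
        "dimension_weight_accuracy": measure_ok / len(common),
        "business_coverage": covered / total,
    }
```

In the real system the comparison is of course more involved (units of measure, rounding rules, per-attribute tolerances), but the structure of the report stays the same.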
On top of this we are doing a self-assessment. The main reason for this is to figure out what quality the supplier data itself has.
From our experience, a Data Quality Management system based on the GS1 Data Quality Framework is a solid basis to manage your MDM program. It gives you the means to document and communicate the progress your MDM program achieves.
---
Update 12.12.2011:
You made it till the end of this post? ;-)
Ok, then I have some more fun stuff for you. I just stumbled over this quite old video from GS1 Netherlands on data quality. But I think it is still to the point, and at least it is fun to watch and listen to: