
Data quality revisited

When I was at school there was lots of hype around the acronym TQM, "Total Quality Management". Quality improvement seemed to be the way to grow revenues, because the fewer failures you had in your product line, the less likely customers were to stop buying your stuff and start buying Japanese products instead. Those were the days when American cars were slow and under-engineered, while Japanese cars were less likely to leave you stranded in the middle of a junction because of some electrical malfunction.

Lots of effort and money went into Six Sigma programs, quality circles, and relearning from the Japanese the lessons they had originally received from W. E. Deming. Deming was a professor of statistics who was brought to Japan as part of Gen. Douglas MacArthur's post-WWII initiative to rebuild Japan's economy, and he was the one who taught Japanese industrialists SPC (Statistical Process Control), which the Americans later relearned from them.

Somehow, things have changed. We now live in a universe where quality does not seem to play such a central part. Short TTM (Time To Market) rules the VCs' point of view: you get the MVP (Minimum Viable Product) to customers as fast as you can in order to collect market feedback and make the required changes. Customers, in turn, are not heavily invested in an application, since pricing models are built around usage, and most have grown used to marginal quality in applications to begin with. Hardware prices are dropping exponentially, so equipment investment is essentially disposable. As for Statistical Process Control, who cares about sampling when one has Big Data, deep learning, and Hadoop-like technologies to process all of it?

Well, it's all about the ratio of data to measurement capacity. Back in the fifties, when engineers were using slide rules for calculations, it made sense to sample. Now that data is being gathered at an ever-increasing rate while Moore's law is running out of steam, we will soon be back to the old tactic of sampling the pile rather than sifting through all of it.

The basics

Data quality is, by definition, target oriented. You invest in quality only as far as it reduces negative effects, and only up to a certain cost, because of diminishing returns.

That means you have to work backwards from the target. For example, if your target is invoicing and you get 10% returned mail, you might invest considerable effort in correcting addresses when a large debt is involved (dunning up to the cost of the debt), and only a slight effort if it is "just" a regulatory requirement.
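To make that concrete, here is a minimal sketch in Python of how the correction effort for a returned invoice might be chosen according to the value at stake. The field names and thresholds are invented for illustration only.

```python
# Hypothetical sketch: choosing the correction effort for a returned invoice
# based on the value at stake. The threshold (manual_cost) and the returned
# labels are assumptions for illustration, not taken from any real system.

def correction_effort(outstanding_debt, regulatory_only=False, manual_cost=25.0):
    """Suggest how much effort to spend on fixing a returned-mail address."""
    if regulatory_only:
        return "automatic lookup only"        # slight effort for "just" compliance
    if outstanding_debt > manual_cost:
        return "manual research and dunning"  # effort is capped by the debt itself
    return "automatic lookup only"

print(correction_effort(outstanding_debt=480.0))  # worth a human's time
print(correction_effort(outstanding_debt=3.0))    # not worth more than a lookup
```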

Logical entities or subsystems

Once we have established that the DQ business is target oriented, we have to define the target. A target can be loosely defined in the same way a product is defined: a desired outcome that can be further segmented into subsystems, as in a PBS (Product Breakdown Structure). For example, if your product is a CRM product, you have to consider subsystems like contact-center automation, marketing campaigns, order dispatching, and so on.

The problem with this type of segmentation is that it is process related. You might follow a process that reduces the quality of street addresses wherever the application allows the user to enter them, but if you want to measure the effect of the quality of the listed street addresses, you really have to take the address logical entity as the target and move on from there.

So let's target the logical entities: customers, inventory, files, etc. In a way this really reduces the complexity of the analysis, since one is dealing with outcomes and, at this stage, not with the complex processes that produced those outcomes.

Profiling

Profiling is the process of analyzing the "probable" problems with the logical entities. A customer with no address might be a problem, as might a product that is not within specification. This is where sampling comes in handy; you may not need it if your data is only in the millions, but if your client base is a social network, your analysis cycle will take forever if you work on the total population. The outcome of the profiling process is a list of problems, or quality events, that are to be measured and monitored.
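As an illustrative sketch only, assuming a simple customer record with address and last-order fields, profiling over a sample could look something like this:

```python
# Minimal profiling sketch (assumed schema: customer records with 'address'
# and 'last_order' fields). It samples a large population and estimates the
# rate of "quality events" such as missing addresses.

import random

def profile(customers, sample_size=10_000, seed=42):
    """Estimate quality-event rates on a sample of the logical entity."""
    random.seed(seed)
    sample = (customers if len(customers) <= sample_size
              else random.sample(customers, sample_size))
    events = {
        "missing_address": sum(1 for c in sample if not c.get("address")),
        "no_orders": sum(1 for c in sample if not c.get("last_order")),
    }
    return {name: count / len(sample) for name, count in events.items()}

# Toy population in which roughly 1% of customers lack an address
population = [{"address": None if i % 100 == 0 else "Main St. %d" % i,
               "last_order": "2015-01-01"} for i in range(100_000)]
print(profile(population))  # e.g. {'missing_address': ~0.01, 'no_orders': 0.0}
```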

Profiling should be done on an ongoing basis and, when possible, portrayed on a dashboard that signals quality trends. This is exactly the place where the old SPC control charts are useful, to gain insight into metrics such as whether the process of client acquisition is stable in terms of data or is performing with excessive variation.
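For instance, a classic 3-sigma p-chart over one such profiling metric, say the weekly share of new customer records that lack an address, could be sketched as follows (all numbers are invented):

```python
# Hedged sketch of an SPC p-chart over a profiling metric: the weekly share
# of new customer records with a missing address. Rates and sample size are
# made up for illustration.

def p_chart_limits(rates, sample_size):
    """Classic 3-sigma control limits for a proportion (p-chart)."""
    p_bar = sum(rates) / len(rates)
    sigma = (p_bar * (1 - p_bar) / sample_size) ** 0.5
    return max(0.0, p_bar - 3 * sigma), p_bar, p_bar + 3 * sigma

weekly_missing_address_rate = [0.011, 0.012, 0.010, 0.013, 0.011, 0.031]
lcl, center, ucl = p_chart_limits(weekly_missing_address_rate[:-1], sample_size=2_000)
latest = weekly_missing_address_rate[-1]
# The 3.1% week falls above the upper control limit: excessive variation.
print("stable" if lcl <= latest <= ucl else "out of control")
```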

When dealing with logical entities, one has to remember that they are the outcomes of the physical and logical processes we so conveniently ignored. A customer record may be created and updated from several sources, and an inventory record is influenced by both production and sales. This, therefore, is the time to delve into the error sources and fault analysis.

Fault tree analysis

The years after WWII were prolific in terms of US military budgets and engineering achievements. The Nautilus SSN project brought Project Management to life as a discipline, and the Minuteman intercontinental ballistic missile program created Fault Tree Analysis. FTA is a hierarchical description of a system, breaking its failure modes down into sub-categories connected by logical gates. Dealing with data quality, we have to remind ourselves that we are measuring logical entities, not subsystems. Our root fault is a deficient customer or inventory record, and that fault may be the result of (insufficient error checking in data entry OR an error in the transaction mechanism) AND no corrective action taken by operators.

FTA can be used with probabilities as a way to assess the reliability of the data and to enhance its quality by adding logical steps that reduce the fault probability. Or it can be used to establish a data-correction mechanism by dealing with each of the leaves of the tree separately.
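Taking the root fault described above and one possible reading of its gates, a toy calculation with assumed leaf probabilities might look like this:

```python
# A sketch of a fault tree for the "deficient customer record" root fault.
# Leaf probabilities are pure assumptions, and leaves are treated as
# independent: AND gates multiply probabilities; OR gates combine them as
# 1 minus the product of the complements.

def AND(*probs):
    result = 1.0
    for p in probs:
        result *= p
    return result

def OR(*probs):
    result = 1.0
    for p in probs:
        result *= (1 - p)
    return 1 - result

p_entry_error = 0.05        # insufficient error checking in data entry
p_transaction_error = 0.01  # error in the transaction mechanism
p_no_correction = 0.30      # operators take no corrective action

# Root fault: (entry error OR transaction error) AND no corrective action
p_root = AND(OR(p_entry_error, p_transaction_error), p_no_correction)
print(round(p_root, 4))  # ~0.018, i.e. roughly 1.8% of records end up deficient
```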

Data Correction

After establishing an FTA over the logical entities and traveling the tree down to its leaves, all that is left is to enumerate the various events and provide a correction algorithm and a cost function for each. For example, a faulty address can be checked automatically against the postal service at very low cost, while various errors in charging a customer's account can only be settled by a human reading the contract. The decision whether to invest the effort is a function f(error correction cost, error frequency, error cost, contribution to the total quality of the logical entity).
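A hedged sketch of that decision function, with made-up weights and numbers, might look like this:

```python
# Hedged sketch of the decision function f(...) mentioned above. The
# weighting and the inputs are assumptions; the point is only that the
# correction effort competes against what the error actually costs.

def should_correct(correction_cost, frequency, error_cost, quality_weight=1.0):
    """Correct a fault class only when the expected saving covers the effort."""
    expected_loss = frequency * error_cost * quality_weight
    expected_effort = frequency * correction_cost
    return expected_loss > expected_effort

# Bad address: cheap automated postal check, noticeable cost per returned invoice
print(should_correct(correction_cost=0.1, frequency=5_000, error_cost=4.0))   # True
# Disputed charge: expensive manual contract review, rare occurrence
print(should_correct(correction_cost=120.0, frequency=40, error_cost=60.0))   # False
```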

Endnote

The cost of modeling a Data Quality process for a business is usually considered prohibitive because it is taken as a one-time effort, which is why most Data Quality projects are performed as part of a data migration project. Taken instead as an ongoing effort to increase data quality, those costs are offset by the increase in revenues due to better data and are spread over a longer period of the organization's or the product's life cycle.