Adaptive Data Quality with Cognitive Computing
Ajay Vaidya

There is much to say about Data Quality, and the topic has been researched thoroughly by professional practitioners globally. The key levers in the Data Quality space are business benefits, KPIs, quality metrics and measurement mechanisms. These levers trigger the necessary actions within businesses to improve Data Quality. Many established practices exist to improve Data Quality, depending on the type of data and the specific business context.

Primarily, the industry's Data Quality focus has been to correct data that has already been admitted into the boundary of the business, and to block and correct data at the entry point itself. Data Governance goes one step further and addresses the root cause of bad data quality. Various Data Quality KPIs, combined with Data Governance policies, attempt to control everything that could go wrong and result in bad data quality. These efforts have certainly shown significant results, improving Data Quality to an acceptable level.

However, it is not yet perfect. Have we ever heard that Data Quality became perfect after implementing all the Data Quality and Data Governance solutions? There is still headroom to improve the steady state of data quality.

The human aspect plays a critical role in Data Quality. Data Governance addresses some of this, but is limited to prescribing what humans should and should not do. A major portion of data correctness, however, rests purely on individual human judgment and perception. Traditional data quality approaches have no control over this aspect, nor any mechanism to account for it.

For illustration, consider a sales manager who has to enter sales forecast numbers every quarter, based on his own “judgment”. It turns out that the numbers he provides are always 20% lower than the actual outcome of each quarter, while another, more optimistic sales manager always provides numbers 25% higher than the actual outcome. This falls under the “Data Accuracy” regime.

Another example: in a bank, each customer is rated and assigned to a category based on risk and lifetime value (LTV). Certain products and rates depend on the customer category; for instance, interest rates for higher-bracket customers are lower. A customer service representative, under pressure to meet loan targets, could bump up a customer's bracket and sell a loan at a lower interest rate. This violates the “Data Accuracy” parameter.

Similarly, a CRM application may be responsible for feeding regular updates on customer demographic information to a Master Data Management (MDM) system. Bad data captured at the point of sale (POS) can flow into the CRM and from there into the customer master data as part of the regular weekly feed. One could rely on the CRM owners to correct the demographic data by giving them feedback after analyzing data quality in MDM. Even after such a corrective mechanism, however, there is no guarantee that the CRM will not feed wrong demographic data again in subsequent feed cycles. Such behavior of the CRM system (driven and maintained by humans) is noncompliance with “Data Accuracy”.

Finally, consider product size and weight information sourced from the logistics and merchandising units. The data ingested by the merchandising unit has always been wrong in the first place, and it is discovered that the data becomes accurate only very late in the value chain, at the logistics unit. The merchandising unit's initial ingestion is noncompliant with the “Data Accuracy” aspect. There are many such examples that depend on the judgment (which is subjective) of the entry-point element, whether human or machine.
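The CRM-to-MDM feedback scenario above can be sketched as a simple regression check on each feed cycle. This is a hypothetical illustration: the record keys, field names and data structures below are invented, and a real MDM platform would apply its own matching and survivorship rules.

```python
# Hypothetical sketch: detect when a source system (e.g. a CRM feed) re-sends
# values that were previously corrected downstream (e.g. in MDM).

corrections = {}  # (customer_id, field) -> (bad_value, corrected_value)

def record_correction(customer_id, field, bad_value, corrected_value):
    """Remember that a bad value from the source was corrected in MDM."""
    corrections[(customer_id, field)] = (bad_value, corrected_value)

def check_feed(feed):
    """Return records in the incoming feed that regress to a known bad value."""
    regressions = []
    for customer_id, field, value in feed:
        known = corrections.get((customer_id, field))
        if known and value == known[0]:
            regressions.append((customer_id, field, value))
    return regressions

# MDM corrected a postcode once; the next weekly CRM feed re-sends the bad value.
record_correction("C042", "postcode", "0000", "411001")
print(check_feed([("C042", "postcode", "0000"), ("C099", "city", "Pune")]))
# -> [('C042', 'postcode', '0000')]
```

Flagging such regressions gives a measurable signal of how reliable a source system is over time, rather than silently re-correcting the same record every week.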

In these examples, the solution is not to change human behavior, as that is practically impossible. Better results come from a Data Quality framework that is “adaptive” enough to adjust to human behavior. The behavior in the sales forecast example could be unintentional or intentional, and for other, more sensitive data elements one cannot rule out malicious intent. Traditional Data Quality approaches have no ability to address this human aspect. Human training, Data Governance policies and Data Quality checks can only do so much; they do not guarantee human behavior.

Adaptive Data Quality, combined with cognitive computing, would address these challenges by going one step further than current Data Quality solutions. The key aspects of Adaptive Data Quality are:

  1. Define a Reliability Index (DRI) for each entry point of data. An entry point could be a human or another IT system, including IoT devices, various applications, or user-facing front ends. The human-behavior challenge applies to applications as well, for the simple reason that applications are maintained by humans. The DRI measures the quality of the data provided over a period of time: if the provided data consistently deviates from the “correct” data, the DRI goes down over time; the closer the provided data is to the “accurate” data, the higher the DRI. In other words, the DRI reflects the reliability of an individual (human or system) with respect to a specific data element or group of data elements. (Note: in some cases a DRI cannot be defined, because no feedback mechanism can be established.)
  2. Behavioral Data Adjustment (BDA), a direct reflection of the DRI. It is a statistical adjustment of the data to move it closer to the “correct” value. For instance, in the sales forecast example, if the numbers are consistently 20% too low, the system can increase them accordingly.
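The DRI and BDA above can be sketched with a toy computation. This is a minimal illustration, assuming a history of (provided, actual) value pairs per entry point; the specific formulas (mean relative error for the DRI, mean actual-to-provided ratio for the BDA) are my own choices for the sketch, not prescribed here.

```python
# Sketch of a Reliability Index (DRI) and Behavioral Data Adjustment (BDA)
# for one entry point, given a history of (provided, actual) pairs.

def dri(history):
    """Reliability index in [0, 1]: 1.0 means a perfectly accurate history."""
    errors = [abs(provided - actual) / actual for provided, actual in history]
    mean_error = sum(errors) / len(errors)
    return max(0.0, 1.0 - mean_error)

def bda(value, history):
    """Adjust a newly provided value by the entry point's historical bias."""
    bias = sum(actual / provided for provided, actual in history) / len(history)
    return value * bias

# A sales manager who consistently forecasts 20% below actuals:
history = [(80, 100), (160, 200), (40, 50)]
print(round(dri(history), 2))      # -> 0.8  (index degraded by the bias)
print(bda(80, history))            # -> 100.0  (forecast adjusted upward)
```

In practice the bias estimate would be computed per data element and refreshed as each feedback cycle closes, so the adjustment tracks the entry point's behavior over time.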

Cognitive computing is used to build the DRI. In its simplest form, the index could be based purely on internal statistical information, such as past actual sales figures. In a more advanced form, it could be coupled with social information to identify detailed patterns. The latter would not be applicable to entry points that are applications rather than humans.
