Data Quality Is Not a Checkbox - Building a Continuous Program for Enterprise AI
I was brought in to take over a data quality program that was, by any honest assessment, failing. The program had been running for over a year. There were dashboards, status reports, and weekly meetings. But the data was still bad, the business did not trust it, and the team was demoralized.
What I found was a pattern I have seen multiple times across industries: data quality treated as a checkbox rather than a continuous, embedded discipline. If your organization is running AI models in production — or planning to — this distinction is existential. One-time profiling and quarterly clean-ups do not survive real-world drift. What works is a continuous DQ program embedded in pipelines, owned by domains, and measured in business outcomes.
Why Data Quality Is Existential in the AI Era
Data quality has always mattered. But in the AI era, the stakes have fundamentally changed. When AI agents, chatbots, and conversational AI systems query your data platform through ontology and semantic layers, every quality defect becomes a trust defect — visible to the end user in real time.
Consider what happens when a business user asks a natural language question — "What is our lease exposure in the Northeast for the next 90 days?" — and the conversational AI retrieves data from a Gold layer with duplicate properties, stale lease records, or inconsistent regional hierarchies. The answer is not just wrong. It is confidently wrong. The AI does not caveat its response with "by the way, the underlying data has quality issues." It presents bad data as truth. That is how hallucinations are born — not from the model, but from the data.
This is the critical insight most organizations miss: the fastest way to reduce AI hallucinations in enterprise settings is not better prompts or larger models. It is better data. Specifically: governed, quality-gated, DQ-certified data products that the AI can trust. When a knowledge graph serves semantically clean, freshness-guaranteed, accuracy-validated data to a conversational AI layer, the model does not need to guess or interpolate. It retrieves facts. The hallucination rate drops not because the AI improved, but because the data improved.
Every quality defect that reaches Gold is now a potential hallucination in a board-level conversation, a customer-facing chatbot, or an agent-driven operational decision. That is why data quality is no longer a hygiene exercise. It is the foundation of AI trust.
What I Found: The Anatomy of a Failing DQ Program
The problems were systemic, not technical. The tools were adequate — the approach was not.
The Turnaround: Embedded, Continuous, Business-Aligned
The turnaround required changes at every level: process, technology, and culture.
Embedding Quality in Pipelines
The first and most impactful change was moving quality checks from retrospective profiling to embedded pipeline checkpoints. Using our DQ platform, we built quality gates directly into the transformation pipelines — between Bronze and Silver, and between Silver and Gold. Checks run automatically with every pipeline execution. Failures trigger circuit breakers: hard gates block data promotion, soft gates log warnings and continue.
This was transformative. Instead of discovering bad data after the fact, we caught issues at the point of entry. Business users stopped receiving bad data because bad data no longer made it to Gold.
The gate structure was deliberate: completeness and timeliness checks at Bronze (did the data arrive, and is it fresh?); validity, accuracy, and consistency checks at Silver (do values fall within expected ranges, and do cross-field relationships hold?); consumer-contract checks at Gold (does the output match the schema, freshness, and completeness guarantees in the data contract?).
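To make the gate mechanics concrete, here is a minimal sketch of an embedded quality gate with hard and soft checks. The check names, fields, and region codes are hypothetical illustrations, not the actual rules we deployed:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class GateResult:
    check: str
    passed: bool
    hard: bool  # hard gates block promotion; soft gates only warn

def run_gate(records: list[dict],
             checks: list[tuple[str, Callable[[dict], bool], bool]]
             ) -> tuple[bool, list[GateResult]]:
    """Run one layer's checks; return (promote?, per-check results)."""
    results = []
    for name, predicate, hard in checks:
        passed = all(predicate(r) for r in records)
        results.append(GateResult(name, passed, hard))
    # Circuit breaker: any failed hard check blocks promotion.
    promote = all(r.passed for r in results if r.hard)
    return promote, results

# Hypothetical Silver-layer checks: validity and cross-field consistency.
silver_checks = [
    ("lease_amount_positive", lambda r: r["lease_amount"] > 0, True),        # hard
    ("region_known", lambda r: r["region"] in {"NE", "SE", "MW", "W"}, True), # hard
    ("end_after_start", lambda r: r["end"] >= r["start"], False),            # soft
]
```

A soft-gate failure here would still let data flow, but the logged `GateResult` feeds the warning dashboards; a hard-gate failure stops promotion outright.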
Business-Driven Rule Definition
We replaced the technical-only rule definition process with business-aligned workshops. For each domain, we sat with business stakeholders and asked: "What does good data look like for your use cases? What breaks your reports? What decisions depend on which fields?"
This sounds obvious, but it was a fundamental shift. Quality rules went from "column X must not be null" to "every active customer record must have a valid email address because our marketing campaigns depend on it." The second formulation ties quality to business outcomes and creates accountability.
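The shift from technical rules to business-anchored rules can be expressed in code as well. This sketch implements the example rule above; the field names and regex are illustrative assumptions, not our production definitions:

```python
import re

# Deliberately simple validity pattern for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def check_active_customer_email(record: dict) -> list[str]:
    """Business rule: every active customer record must have a valid
    email address, because marketing campaigns depend on it."""
    issues = []
    if record.get("status") == "active" and not EMAIL_RE.match(record.get("email") or ""):
        issues.append("active customer missing or invalid email (marketing campaign impact)")
    return issues
```

Note that the returned message names the business consequence, not just the failed column; that is what routes the incident to an owner who cares.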
Domain-Level Quality Scorecards
We built composite quality scores at three levels: individual dataset, business domain, and organization. Each score aggregates completeness, validity, accuracy, consistency, timeliness, and uniqueness into a single metric that executives can track over time. Trend lines showed improvement — which built confidence — and highlighted domains that needed additional attention.
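A minimal sketch of the aggregation is below. The dimension weights are placeholder assumptions; in practice they would be set and revisited by the governance council:

```python
# Assumed weights across the six dimensions (must sum to 1.0).
WEIGHTS = {"completeness": 0.25, "validity": 0.20, "accuracy": 0.20,
           "consistency": 0.15, "timeliness": 0.10, "uniqueness": 0.10}

def composite_score(dimension_scores: dict[str, float]) -> float:
    """Dataset-level score: weighted average of the six dimensions, 0-100."""
    return round(sum(WEIGHTS[d] * dimension_scores[d] for d in WEIGHTS), 1)

def domain_score(dataset_scores: list[float]) -> float:
    """Domain-level score: mean of its datasets' composite scores."""
    return round(sum(dataset_scores) / len(dataset_scores), 1)
```

The same roll-up repeats once more from domains to the organization, which is what makes a single executive trend line possible.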
These scorecards became a governance instrument. Domain leads were accountable for their quality scores. The bi-weekly data governance council reviewed scores, identified blockers, and allocated resources to the worst-performing domains. We published RAG (red/amber/green) dashboards per promotion gate to focus attention where it mattered most.
Ownership and SLAs
Each data product was assigned an accountable owner with response targets for DQ failures. This was not a technical assignment; it was a business accountability. When a quality issue occurred, the incident was routed to the owner by default with samples and an AI-generated analysis report, not to a generic IT queue.
AI Agents for DQ Acceleration
This is where the turnaround went from incremental improvement to step-function acceleration. We built AI agents that automated the most time-consuming aspects of data quality operations: suggesting quality rules, diagnosing incidents and drafting analysis reports, and self-healing low-risk failures such as re-running a failed ingestion.
These agents did not replace human judgment. They augmented it. Data stewards spent less time on detective work and more time on remediation and prevention. We kept autonomy levels explicit: advise (suggest rules for human approval), automate (execute within guardrails with human review), and autonomous (self-heal low-risk issues like re-running a failed ingestion). Human-in-the-loop remained essential for exceptions and policy decisions.
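The three autonomy levels can be modeled as an explicit policy table that every agent action must pass through. The action names below are hypothetical examples, grounded in the levels described above:

```python
from enum import Enum

class Autonomy(Enum):
    ADVISE = "advise"          # suggest, wait for human approval
    AUTOMATE = "automate"      # execute within guardrails, human reviews after
    AUTONOMOUS = "autonomous"  # self-heal low-risk issues without review

# Hypothetical policy table: agent action -> maximum autonomy allowed.
POLICY = {
    "suggest_new_rule": Autonomy.ADVISE,
    "quarantine_bad_rows": Autonomy.AUTOMATE,
    "rerun_failed_ingestion": Autonomy.AUTONOMOUS,
}

def requires_human(action: str) -> bool:
    """Anything below full autonomy keeps a human in the loop.
    Unknown actions default to ADVISE, the most conservative level."""
    return POLICY.get(action, Autonomy.ADVISE) is not Autonomy.AUTONOMOUS
```

Making the table explicit is the point: autonomy is granted per action by policy, never assumed by the agent.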
Scorecards Leaders Can Act On
The scorecards were designed for executive consumption, not technical deep-dives. Each scorecard showed: product-level and domain-level freshness, failure rate by check type, business impact (orders affected, dollars at risk), and trend over time. Red/amber/green by promotion gate focused attention where it mattered.
The most powerful feature was the trend line. When a domain’s quality score improved from 72% to 94% over three months, it was visible and celebrated. When a domain stalled at 81%, it was visible and addressed. Visibility created accountability, and accountability created progress.
Remediations Stewards and Data Owners Can Act On
Another critical piece of the DQ process is the remediation flow. We designed the incident lifecycle to be specific: auto-ticket on failure, route to the right owner, run AI-performed diagnostics with findings and recommendations, attach a runbook with remediation steps, and measure MTTD (mean time to detect) and MTTR (mean time to resolve). Over time, MTTD dropped from days to minutes as embedded checks caught issues earlier, and MTTR improved as runbooks became more precise.
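The two lifecycle metrics are simple to compute from incident timestamps. This sketch assumes each incident record carries `occurred`, `detected`, and `resolved` times, which is an illustrative schema rather than our actual ticket format:

```python
from datetime import timedelta

def _mean_minutes(deltas: list[timedelta]) -> float:
    return sum(d.total_seconds() for d in deltas) / len(deltas) / 60

def mttd_mttr(incidents: list[dict]) -> tuple[float, float]:
    """MTTD: occurred -> detected. MTTR: detected -> resolved. In minutes."""
    mttd = _mean_minutes([i["detected"] - i["occurred"] for i in incidents])
    mttr = _mean_minutes([i["resolved"] - i["detected"] for i in incidents])
    return round(mttd, 1), round(mttr, 1)
```

Splitting the clock at the detection timestamp matters: embedded gates shrink MTTD, while better runbooks and routing shrink MTTR, so each investment shows up in its own number.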
Lessons from the Turnaround
Four Moves You Can Make Tomorrow
Looking Ahead: DQ as the Foundation of AI Trust
Data quality becomes durable when it becomes routine. Embed it in code, make it visible, assign ownership, and let AI accelerate the boring parts. That is how trust compounds — and how AI stays in production.
But looking ahead, the importance of continuous data quality will only intensify. As enterprises deploy conversational AI, natural language query interfaces, and autonomous AI agents that interact with data on behalf of business users, every data quality failure becomes immediately visible. There is no analyst in between to catch the error, no dashboard designer to add a caveat, no data engineer to explain the anomaly. The AI surfaces whatever it finds — and if what it finds is incomplete, stale, duplicated, or semantically inconsistent, the result is a hallucination that erodes trust instantly.
The organizations that invest in continuous DQ programs today — with embedded pipeline gates, domain ownership, DQ certification badges, and AI-accelerated remediation — are building the trust infrastructure that conversational AI and agentic systems require. The ones that treat quality as a quarterly profiling exercise will spend years wondering why their AI investments are not delivering value.
The AI agent approach to data quality is, I believe, the future of the discipline — and a core capability I am building into the next generation of data platforms. In my next article, I will discuss the modern data governance stack — catalog, lineage, marketplace, and beyond — and how governance becomes a value enabler rather than a compliance burden.
#DataQuality #DataGovernance #EnterpriseAI #CDO #DataPlatform #AIAgents #Turnaround #DataStrategy
Strong point: hallucinations are often a data quality issue, not a model flaw. If the Gold layer is not clean, AI just amplifies bad data. Embedding DQ into pipelines and certifying trusted data is the real fix.
Thank you for sharing, but I would like to differ. We worked on solidifying our data pipelines and building strong enterprise data management, then launched an LLM-based AI chatbot, and it still hallucinates despite near-perfect data. The idea that AIs do not hallucinate once we fix everything else is a myth. To be honest, even the best models reach 80-90% accuracy with careful prompt engineering. I say this after six months of first-hand testing.
Great article, I will bring some of these ideas to our organization. Thanks Suresh!
Nice Suresh! Appreciate all you do at #Anblicks