Do agentic systems benefit from normalization?

I’ve been building many versions of agentic systems in various environments (Agent Builder, Copilot Studio, Microsoft Foundry, GitHub Copilot) with various front-end “sockets”, such as Microsoft Teams, Microsoft 365 Copilot and Visual Studio Code, as well as MCP and the CLI.

All this in the pursuit of giving the best experience to the users of these agentic systems.

The reason for all this experimentation across frameworks, tools and platforms is the diversity of questions, prompts and use cases that the “system” is expected to cater for.

Software development, or shall I call it systems creation, these days can no longer cuddle in the comfort of a defined front-end, responding to the finite possibilities of a “form”.

One instance of a question that I was trying to solve came from Richard Koh. In our work with partners, the prompt was:

“Top 5 partners in terms of technical capabilities for these 8 specializations [redacted] for each Area in the Asia region”

The other parts of this prompt justify the agentic part of this approach (which I might do another post on!) but the “for each Area” part is what I want to focus on in this post.

From a database design point of view, Area is best modelled as its own table in a 3NF-normalized schema. But while I was wrapping an MCP-ed sub-agent around this, I wasn’t sure if this is best for agentic systems, so of course, I asked Copilot.

The answer turned out to be a tour of the architectural differences between traditional OLTP databases and agentic systems. Here’s the conversation verbatim:


Q: Does agentic systems benefit from normalization or is it better not to normalize?

Copilot:

Ah — this is where classical database theory and agentic/LLM system design start to diverge.

Short answer:

  • Transactional agent systems benefit from normalization
  • Reasoning‑heavy agent systems often work better with intentional denormalization
  • 🚫 Fully normalized schemas are usually NOT optimal for agent orchestration layers


🧠 Traditional DB Goal vs Agentic System Goal

Normalization helps databases.

Agents operate on:

  • vector retrieval
  • context assembly
  • graph traversal
  • semantic chunking

Which behave very differently from relational algebra.


📌 Why Highly Normalized Schemas Hurt Agents

Let's say you modelled this in perfect 3NF:

Partner
--------
PartnerID
PartnerName
AreaID

Area
--------
AreaID
AreaName

Now a user asks your agent:

"Which ASEAN partners are Kubernetes specialized?" 

The agent must now:

  1. Understand AreaName = ASEAN
  2. Resolve AreaID
  3. Join Partner → Area
  4. Possibly join Specialization table

For a SQL engine? ✅ trivial

For an LLM‑grounded reasoning agent? 

You’ve just created:

  • retrieval fragmentation
  • multi-hop semantic dependency
  • join planning requirements
  • symbolic mapping tasks inside the prompt space

Which leads to:

  • hallucinated joins
  • missed lookup resolution
  • tool over-calling
  • larger context windows needed
  • grounding loss
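To make the multi-hop dependency concrete, here is a minimal Python sketch using the standard-library sqlite3 module. Only the Partner and Area tables come from the schema above; the Specialization and PartnerSpecialization tables, their column names, the sample rows and the function name are hypothetical, added purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Area (AreaID INTEGER PRIMARY KEY, AreaName TEXT NOT NULL);
    CREATE TABLE Partner (
        PartnerID INTEGER PRIMARY KEY,
        PartnerName TEXT NOT NULL,
        AreaID INTEGER NOT NULL REFERENCES Area(AreaID)
    );
    -- Hypothetical tables, not part of the schema shown in the post.
    CREATE TABLE Specialization (SpecID INTEGER PRIMARY KEY, SpecName TEXT NOT NULL);
    CREATE TABLE PartnerSpecialization (
        PartnerID INTEGER REFERENCES Partner(PartnerID),
        SpecID INTEGER REFERENCES Specialization(SpecID)
    );
    INSERT INTO Area VALUES (3, 'ASEAN'), (4, 'ANZ');
    INSERT INTO Partner VALUES (101, 'ABC', 3), (102, 'XYZ', 4);
    INSERT INTO Specialization VALUES (1, 'Kubernetes');
    INSERT INTO PartnerSpecialization VALUES (101, 1), (102, 1);
""")

def partners_in_area_with_spec(area_name, spec_name):
    """Answer 'which <area> partners are <spec> specialized?' on the 3NF schema."""
    rows = conn.execute(
        """
        SELECT p.PartnerName
        FROM Partner p
        JOIN Area a ON a.AreaID = p.AreaID                          -- resolve AreaID
        JOIN PartnerSpecialization ps ON ps.PartnerID = p.PartnerID -- bridge table
        JOIN Specialization s ON s.SpecID = ps.SpecID               -- resolve SpecID
        WHERE a.AreaName = ? AND s.SpecName = ?
        ORDER BY p.PartnerName
        """,
        (area_name, spec_name),
    ).fetchall()
    return [row[0] for row in rows]

print(partners_in_area_with_spec("ASEAN", "Kubernetes"))
```

The SQL engine handles the three joins without fuss. An agent, by contrast, has to discover each table, infer each key relationship and compose the joins itself before it can even issue this query.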

📉 The Agent Cost of Normalization

In agentic pipelines, agents reason better over self-contained entity descriptions.

Instead of:

{
  "PartnerID": 101,
  "AreaID": 3
}

Prefer:

{
  "PartnerName": "ABC",
  "Area": "ASEAN",
  "Specializations": ["AKS", "AI Platform"]
}
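One way to get from the first shape to the second is to resolve the foreign keys ahead of time. The following is a minimal Python sketch under stated assumptions: the in-memory lookups (areas, partner_specs) and the helper name to_entity_doc are hypothetical stand-ins for whatever the real source tables provide.

```python
import json

# Hypothetical sample data mirroring the normalized tables.
areas = {3: "ASEAN"}                              # AreaID -> AreaName
partners = [{"PartnerID": 101, "PartnerName": "ABC", "AreaID": 3}]
partner_specs = {101: ["AKS", "AI Platform"]}     # PartnerID -> specialization names

def to_entity_doc(partner):
    """Replace IDs with readable values so the record can be understood on its own."""
    return {
        "PartnerName": partner["PartnerName"],
        "Area": areas[partner["AreaID"]],
        "Specializations": partner_specs.get(partner["PartnerID"], []),
    }

docs = [to_entity_doc(p) for p in partners]
print(json.dumps(docs[0], indent=2))
```

The resulting document carries no keys that need a second lookup, which is the property the "Prefer" example is after.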

📌 Agentic Design Pattern Emerging in 2025–2026

Modern agentic architectures now use:

 ✅ Hybrid Modeling

🏗️ Recommended Pattern for Agent‑Ready Data

1️⃣ Keep Source-of-Truth Normalized

For:

  • updates
  • referential integrity
  • governance

For example, in Fabric Data Warehouse or Azure SQL Database, keep the Partner and Area tables separate.


2️⃣ Create an Agent Serving Layer

Materialize into:

 
PartnerAgentView
-----------------
PartnerName
AreaName

Or even, persist as:

  • JSON documents
  • Parquet entity records
  • Graph node attributes
  • Flattened semantic text blocks
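As a rough sketch of such a serving layer, the snippet below defines PartnerAgentView over the normalized tables using Python's standard-library sqlite3 module. SQLite is only a stand-in here, and a plain (non-materialized) view is used for simplicity; in a warehouse you would more likely use a materialized view or a scheduled flattened table. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Source of truth stays normalized.
    CREATE TABLE Area (AreaID INTEGER PRIMARY KEY, AreaName TEXT NOT NULL);
    CREATE TABLE Partner (
        PartnerID INTEGER PRIMARY KEY,
        PartnerName TEXT NOT NULL,
        AreaID INTEGER NOT NULL REFERENCES Area(AreaID)
    );
    INSERT INTO Area VALUES (3, 'ASEAN'), (4, 'ANZ');
    INSERT INTO Partner VALUES (101, 'ABC', 3), (102, 'XYZ', 4);

    -- Agent-facing layer: the join is precomputed and the ID is hidden.
    CREATE VIEW PartnerAgentView AS
    SELECT p.PartnerName, a.AreaName
    FROM Partner p
    JOIN Area a ON a.AreaID = p.AreaID;
""")

# The agent's tool can now filter on readable text with no join planning.
view_rows = conn.execute(
    "SELECT PartnerName, AreaName FROM PartnerAgentView WHERE AreaName = ?",
    ("ASEAN",),
).fetchall()
print(view_rows)
```

Updates still go to Partner and Area, so referential integrity is unaffected; only the read path the agent sees is flattened.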


3️⃣ Feed RAG / Toolformer Layer

Agents perform better when:

  • joins are precomputed
  • domains are text-resolved
  • lookup mappings are pre-expanded
  • entity attributes are co-located
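For the retrieval side, the same idea can be pushed one step further by rendering each entity as a short text block before it is chunked or embedded. This is a minimal Python sketch; the function name render_text_block and the sentence template are assumptions for illustration, and the embedding step itself is deliberately left out.

```python
def render_text_block(doc):
    """Turn a self-contained entity record into one retrievable text block."""
    specs = ", ".join(doc["Specializations"]) or "none listed"
    return (
        f"Partner {doc['PartnerName']} operates in the {doc['Area']} area. "
        f"Specializations: {specs}."
    )

# Hypothetical record, already flattened (names resolved, attributes co-located).
doc = {"PartnerName": "ABC", "Area": "ASEAN", "Specializations": ["AKS", "AI Platform"]}
block = render_text_block(doc)
print(block)
```

Because the area name and specializations sit in the same block as the partner name, a single retrieval hit is enough to ground an answer about that partner.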


🔥 Actual Rule of Thumb Now Used

Normalize for correctness

Denormalize for cognition

Vectorize for reasoning 


✳️ The Mental Shift

Relational modeling optimizes consistency under mutation

Agentic modeling optimizes understanding under uncertainty

Which are fundamentally different system objectives.


Bottom Line and Example

In short, when creating relational data layers for AI, we can think about designing the tables to make it easier for AI to understand.

Table (schema) design makes a difference.

For example, when I went with the full 3NF design shown above, with the Partner and Area tables separate, GPT-4.1 was not able to intuit this properly.

Naturally, a more costly model tried harder and was able to produce results.

Claude Sonnet 4.6 is 3x the cost of GPT-4.1 on GitHub Copilot though, which perfectly exemplifies the case for optimizing for the AI’s understanding, i.e. producing more accurate results at lower cost. If we had Area as text within the Partner table, no doubt GPT-4.1 would have been able to find it in the first pass.

Summary (and what happens in practice)

This has been an interesting diversion, exploration and thinking session.

So, what happened to what I was trying to solve for, i.e. the modelling around the partner ecosystem?

Well, my eventual schema is 3NF 😊

Why?

Because, well, I had already cut the data into that shape and imported it into my database.

I reckon that, between my own time and how much my agent is going to be used, with Claude Sonnet 4.6 or better models in tow, things are still better optimized that way…

And all this comes back around to my favorite principle of respecting the diversity of situations that exist in the world.

There’s always the ideal architecture or perfect whatever, but for what context?
