SQL + JSON: Expanding the Same System
What changes when you add structure—and what stays the same
A simple starting point
We started with a straightforward question:
Why do certain wine regions consistently produce higher-rated wines?
From the wine reviews dataset, it’s easy to get part of the answer. You group by region, average the ratings, and sort the results.
That gives you a clear view of what is happening. But it doesn’t explain much beyond that.
There’s no real sense of:
So the next step was to add context to the same dataset.
Bringing in AVA data
That’s where the AVA (American Viticultural Area) data comes in.
These are officially defined wine regions designed to capture the differences that matter in wine—climate, soil, elevation, and geography.
Instead of treating “Napa Valley” as just a label, you start to see it as a region with structure behind it.
The dataset comes as GeoJSON, so we stored it as-is as JSON in the same database.
At that point, we’re effectively working with:
in one system.
What changed
The system didn’t change. The database didn’t change.
The query model didn’t fundamentally change.
What changed was the set of questions we could ask.
The model
We brought together:
Everything runs through the same query path.
What we looked at
We walked through a small set of questions to see how this behaves in practice.
Coverage
The first thing that showed up was coverage.
Not all wine reviews map cleanly to AVA regions.
That becomes the baseline. If coverage is low, any joined analysis becomes harder to interpret—not because the query is wrong, but because the data is incomplete.
Adding context
Once you have a reasonable mapping, the output starts to look different.
Instead of just ranking regions by score, you can see:
The output starts to move beyond a simple ranking and toward something you can use for decisions.
JSON in the query
From there, JSON becomes part of the query itself.
Attributes that don’t naturally fit into relational tables can now be used directly in analysis.
The point isn’t the attribute—it’s that semi-structured data is now part of the same analytical path.
Structure and behavior
Looking at AVA age adds another dimension.
You start to see whether older regions behave differently from newer ones across rating, price, and value.
This shifts the conversation from what performs best to how different regions behave.
The edge cases
Then there are the outliers.
High-rated wines that don’t map to AVAs at all.
They highlight where:
In practice, they become the next set of problems to solve.
What this suggests
After running through a few of these, a pattern starts to emerge.
SQL gives you a clear view of what happened.
JSON adds the structure around it.
And how much you can trust the combined result depends on how well those two are connected.
The representation expanded. The system stayed the same.
Architectural takeaway
This isn’t a new system so much as an extension of the one you already have.
Structured tables and JSON documents are part of the same query path, each contributing a different layer of meaning.
As representations expand, the key is maintaining a single, governed path from data to decision.
Why this works
This only works when the system doesn’t force a choice between OLTP, OLAP, and JSON.
When those capabilities live together, the boundary shifts from infrastructure to query design.
It’s not SQL versus NoSQL.
It’s structured and semi-structured data working together in the same system.
Closing thought
As systems evolve, you keep adding layers—structure, meaning, context.
The system itself doesn’t get simpler.
What matters is how well those layers work together.
Question
In your environment:
Are structured and semi-structured data part of the same query path, or are they still handled separately?
This is part of an ongoing series on data systems and AI. Full version and future posts are available on substack.