The Taste Gap
The most productive users of AI are the ones who are best at rejecting its output.
The most underrated skill in the age of AI is the word “no.”
Not a fearful no. Not a reflexive no. A precise, informed, articulate no: one that can explain what is wrong with a perfectly competent piece of work, and why. This is the skill that separates someone using AI from someone wielding it. And it turns out that the fifteen years we spent developing an instinct for quality were not a detour. They were the main road.
In “The Spec Is the Product” we covered specification: the ability to describe what good looks like before anything gets built. But even the best specification does not save us from the moment when the output lands, polished and plausible, and something about it is wrong. Something we cannot quite name yet. Something that requires a different faculty entirely: judgment.
The Problem with a Hundred Options
AI changed the economics of generation. It used to take a week to produce a single architecture proposal, a single marketing brief, a single design concept. Now we get a dozen in an afternoon. This is wonderful. It is also, for anyone paying attention, a trap.
The trap is this: when generation was expensive, evaluation was implicit. We judged as we built. The architect developing a single proposal over five days was evaluating thousands of micro-decisions along the way. The slowness of creation was not a bug. It was the environment in which taste developed. Constraint forced discernment.
Remove the constraint, and we get volume without discernment. A hundred options, all competent, none right. The bottleneck has shifted from “can we produce this?” to “can we tell whether this is any good?”
This is the taste gap. And it is widening.
What Taste Actually Is
Taste, in this context, is not aesthetic preference. It is the accumulated ability to evaluate output against standards we may not be able to fully articulate.
A senior engineer reads a code review and feels something tighten. The architecture is valid. The tests pass. But something is wrong. After a moment, the engineer names it: this solution couples two systems that will need to evolve independently. She knows this not because she ran an analysis but because she has seen this exact shape of mistake before, in three different codebases, at three different companies. That is taste. Pattern recognition operating at the speed of instinct, informed by experience that cannot be shortcut.
AI has none of it. It has capability without experience, fluency without understanding. It produces output that sounds authoritative precisely because it has no uncertainty. And that confidence is what makes evaluation so critical: the gap between “sounds right” and “is right” has never been wider.
Do Not Outsource the Thinking
Dexter Horthy, who has worked with thousands of engineering teams on AI coding workflows, stood on stage recently and said something that should echo through every organization adopting AI: “Do not outsource the thinking.”
He had spent months teaching teams to let AI agents write plans and code without human review. The results were fast and the quality was poor. His team reversed course. His exact words: “I said don’t read the code. I was wrong.” The teams that started reviewing the output again, that brought their judgment back into the process, got better results than the ones running on autopilot. Not marginally better. Fundamentally better.
The lesson is not “slow down.” The lesson is that AI needs a thinking partner, not a rubber stamp. The agent produces the work. The human evaluates whether the work is right. Not whether it compiles, not whether it runs, but whether it solves the actual problem in a way that will hold up when it matters.
Klarna as Case Study
If you want to see what the taste gap looks like at corporate scale, look at Klarna.
In 2024, Klarna replaced roughly 700 customer service workers with AI. The CEO declared publicly that “AI can already do all of the jobs that we, as humans, do.” Resolution times dropped. Costs dropped. The board was delighted.
Then customer satisfaction collapsed. The AI gave generic answers. It could not handle anything requiring nuance, empathy, or judgment. Customers complained. Quality eroded. By early 2025, the CEO admitted publicly: “We went too far. We focused too much on efficiency and cost. The result was lower quality.”
Klarna started rehiring humans. Not because AI failed. Because AI succeeded at the wrong thing. It optimized for metrics that were easy to measure (speed, cost) while ignoring the ones that actually mattered (trust, resolution quality, the sense that someone understood your problem). There was nobody in the loop with the judgment to say: this is fast, but it is not good.
The Klarna reversal is now the story every board asks about when someone proposes AI-driven headcount changes. It is also the clearest illustration of the taste gap operating at scale: abundant capability, absent judgment, expensive consequences.
The same pattern plays out smaller every day. A dashboard measures AI adoption by activity metrics. A team reshapes its workflow to make the dashboard look good. The dashboard improves. The work does not. The dashboard was supposed to reflect value. Instead, value got reshaped to fit the dashboard. This is what happens when measurement replaces judgment: we optimize for what is easy to count and lose sight of what is hard to see but actually matters.
The Bottleneck Moved
The popular narrative frames AI as an abundance story: more capability, more output, more everything. It is better framed as a bottleneck story. AI capability is growing, but value only materializes if someone with judgment directs it.
The bottleneck used to be execution. It is now evaluation. And the people who thrive in a bottleneck economy are not the ones who generate the most. They are the ones who can tell the difference between output that looks right and output that is right.
Large enterprise AI rollouts consistently show the same pattern: excitement peaks in the first few weeks, then most users quietly stop. The ones who stay are not the most technical. They are the ones who had already developed the instinct to reject bad output and the language to explain why. The tool amplified what they already knew. It did not supply the judgment they lacked.
The most productive users of a generation tool are the ones who are best at not using its output. That is paradoxical enough to be important.
The Fifteen-Year Asset
Here is the inversion most people have not yet absorbed: the experience that feels like it is becoming obsolete is actually the thing that just became most valuable.
Consider the senior professional who has spent fifteen years in a domain. She knows which approaches fail under pressure. She recognizes when a proposal solves the stated problem while creating three unstated ones. For two years, she has been quietly anxious, watching junior colleagues with AI tools produce output that looks indistinguishable from hers.
It has never mattered more that it is not. The junior colleague does not yet know which outputs to reject. They accept output that looks right because they have not developed the sensitivity to distinguish “looks right” from “is right.” That sensitivity is fifteen years of experience compressed into instinct, and no tool on the market produces it.
The experience did not depreciate. What it is useful for changed. It used to mean “I know how to build this.” Now it means “I know whether this is good.” Evaluation is the scarce resource in a world of abundant generation.
Extraordinary people in organizations often operate at twenty-five percent of their capacity, because most of their time goes to coordination overhead. AI is reducing much of that overhead. The expert’s judgment suddenly gets applied to four times the volume of work. Upskilling people with accumulated taste is almost always a better move than hiring someone who can prompt well but cannot evaluate the result.
Every Rejection Is a Document
There is a practical dimension most organizations are missing. Every time an experienced person rejects AI output and explains why, they create institutional knowledge. Not the kind that sits in a wiki nobody reads. The operational kind: a documented standard for what “good” means in this specific context.
Over time, these rejections accumulate into a living record of the organization’s taste. What do we accept? What do we refuse? Why? This record is precisely the kind of specification that makes future AI output better. Judgment improves specification, which improves generation, which demands more refined judgment. It is a compounding loop.
The Taste Gap Is the K-Shape
The K-shaped economy runs on this gap. On one arm, people generate output and accept whatever comes back. They ship fast. And increasingly, what they ship does not survive contact with anyone who understands the domain. On the other arm, people generate the same volume and then evaluate it with the accumulated judgment of years. What they ship holds up.
The difference is not technical skill. It is taste: the ability to reject competent work that is not good enough, and to explain why.
The reassuring part is that taste is not something we need to acquire from scratch. Most experienced professionals already have it. It is the thing that makes them experienced. The shift is in recognizing that this judgment, the instinct that once felt like a soft skill, is now the hard skill.
AI has capability without taste. We have taste that took years to build. That is a partnership, not a competition. The question is not whether our judgment still matters. It is whether we have the confidence to use it: to look at a polished, perfectly structured piece of AI output and say, no, not this, here is why.
That “here is why” is worth more than we think.
This is part of “The Hitchhiker’s Guide to the K-Shaped Economy.” Previous: “The Spec Is the Product” on specification. Next: “Think in Pieces” on decomposition.