The Review Queue is Not the Problem

Hi there,

Last time I wrote that AI made it trivial to produce code while doing nothing to reduce the risk of running it. Amazon then confirmed it publicly: two outages in three days traced to AI-assisted changes, and a mandated 90-day reset across its most critical systems.

But Amazon is an extreme. The more telling story is the one happening quietly in engineering organizations everywhere.


The Absorption Limit

When Cursor acquired code review startup Graphite in December, it paid well over Graphite's last private valuation of $290 million. The strategic rationale was straightforward. Cursor CEO Michael Truell said the review process had "remained largely unchanged" even as AI dramatically accelerated code writing. That gap was becoming the new constraint on shipping software.

The Cursor-Graphite deal is essentially a market signal translated into a balance sheet entry. The industry has absorbed the lesson that AI writes code faster. It is now placing expensive bets on what comes next: the humans in the review queue.

That framing is understandable. It is also wrong. 

The review queue is not the bottleneck. It is the symptom: the slowdown means the system has hit its absorption limit, with more code, generated faster, arriving at a checkpoint that was never designed to absorb this volume. On teams with high AI adoption, developers complete 21% more tasks and merge 98% more pull requests, but PR review time increases by 91%.

The instinct is to treat that 91% as the problem to solve. Speed up review, automate it, get it out of the way.

But the review queue is not producing risk. It is simply revealing risk.


Measuring the Risk

The body of knowledge around AI-assisted code is growing. Veracode tested 80 coding tasks across more than 100 large language models and found that 45% introduced OWASP Top 10 vulnerabilities, with Java hitting a 72% security failure rate. A separate study found that AI-written code surfaces 1.7 times more issues than human-written code, and nearly half of developers say debugging AI output takes longer than fixing code written by people.

This is the actual problem. It is not that review is slow. It is that what arrives at review is harder to trust, at a volume human reviewers were never built to absorb.

When Cursor's CEO describes review as a bottleneck "to moving even faster," he is diagnosing the friction correctly and drawing the wrong conclusion. The friction is not the problem. It is the only thing standing between accelerated code generation and accelerated production failures.

To be fair to Truell, the problem he is trying to solve is real and hard. Review has always depended on something that does not exist in any repository: the accumulated judgment of everyone who has ever debugged the system at three in the morning, made the call that saved a production incident, or quietly backed out a change because something felt wrong they couldn't quite articulate. That knowledge lives in people. It does not live in diffs. Graphite is a genuine attempt to make review faster and smarter. The problem is that faster and smarter review of code that is harder to trust, at higher volume, in systems that are increasingly difficult to fully comprehend, is not the same thing as safer software. It is a throughput improvement applied to a comprehension problem.

According to the 2026 State of Software Delivery report, main branch success rates have dropped to 70.8%, a five-year low. That means nearly three in ten merges now fail. That number is not a review problem. It is a risk absorption problem: more code, higher failure rates, and growing pressure to move the human checkpoint out of the way.


Course Correction

The correct response is not to eliminate the checkpoint. It is to increase the system’s capacity to absorb change. That is a different problem entirely.

It is not solved by adding reviewers or speeding up approvals. It requires building systems where safety scales with output. Test coverage that evolves alongside code generation. Deployment pipelines that route changes by risk, not convenience. Observability that explains why a change was considered safe, not just what failed. Reliability treated as a property of the architecture itself, not a process applied at the end.
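One of those ideas, routing changes by risk rather than convenience, can be made concrete. Here is a minimal sketch in Ruby of what such routing might look like. Everything in it is hypothetical: the class names, the risk signals, and the thresholds are illustrative choices, not a real pipeline API.

```ruby
# Hypothetical sketch of risk-based change routing.
# All names and thresholds here are illustrative assumptions.

# A change is summarized by a few cheap, observable signals.
ChangeRisk = Struct.new(:lines_changed, :touches_critical_path,
                        :test_coverage_delta, keyword_init: true)

class ChangeRouter
  # Score a change: bigger diffs, critical-path edits, and falling
  # test coverage each add to the risk score.
  def score(change)
    total = 0
    total += 2 if change.lines_changed > 300
    total += 3 if change.touches_critical_path
    total += 2 if change.test_coverage_delta.negative?
    total
  end

  # Route the change down a path proportional to its risk,
  # instead of sending everything through the same queue.
  def route(change)
    case score(change)
    when 0..1 then :auto_merge_with_canary
    when 2..4 then :single_reviewer
    else           :senior_review_and_staged_rollout
    end
  end
end
```

The point of the sketch is the shape, not the specifics: low-risk changes flow through with automated safeguards, so human attention concentrates on the changes that actually need it. That is how safety scales with output instead of being a fixed-width checkpoint.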

This is where the industry begins to split.

One path treats review as friction to remove. Those teams will move faster in the short term. They will clear their queues, increase throughput, and push more change into production. They will also discover, repeatedly, that they have exceeded their absorption limit. The result will not be a single catastrophic failure, but a steady accumulation of instability that eventually forces a correction.

The other path treats the review queue as a signal. Those teams will look slower at first. They will merge less, question more, and invest in the unglamorous work of increasing their system’s capacity for safe change. Over time, they will be the only ones able to increase velocity without increasing risk.

The review queue is not the problem. It is the system telling you that you are already at the limit. What happens next depends on whether you choose to ignore it, or build beyond it.


If this was useful, you can subscribe to Essential Complexity here.

Happy coding,

Joe Leo Founder, Def Method

We modernize Rails systems when breaking things is expensive.


We see a similar pattern on the operations side. Nothing looks broken, but things start backing up, and the instinct is to push it through. In reality it’s usually a system taking on more change than it can handle. If you force it, the risk just shows up later somewhere more expensive. The real work is making sure the system can absorb the change without drifting.


When the pace of input (AI, ideas, change) outstrips a team’s ability to process with clarity and confidence, the system doesn’t break loudly; it slows quietly. Things stack. Decisions feel heavier. People hesitate just a little more. And the instinct is always the same: go faster. But speed without absorption just moves the risk downstream. I see this with leadership teams all the time, including mine. More initiatives, more ideas. I am sometimes exhausted before the real work has even begun. The real shift is exactly what you said, Joe Leo: not “How do we move faster?” but “Can we actually handle the rate of change we’re creating?”

Good perspective, Joe Leo. It's tempting to get too focused on a few aspects of using AI, when the real play is to scale and improve all aspects of your business.

Joe, your points are well taken. From my perspective, reducing legal and reputational risk is paramount. While AI is here to stay, how it's implemented will make all the difference. It's no secret that courts have sanctioned attorneys who don't corroborate case law found by AI. My motto is Verify, Then Verify Again, Then Trust.

Joe Leo Yes. And AI can produce more errors, or hide them behind confident-looking output. So the real drag is not just volume. It is trust. Reviews get slower because people are not just checking code, they are checking whether the code can be believed.

