The Interface Became The API
Codex did not just get better at coding.
It crossed a category line.
On April 16, OpenAI said Codex could operate a computer alongside you: see, click, type, use apps, generate images, remember preferences, and take on ongoing work. The most important part was not the plugin count or the browser. It was the computer-use line: Codex can now use apps on your computer by seeing, clicking, and typing with its own cursor, while multiple agents work in parallel on a Mac without interrupting your work.
That sounds like a product update.
It is not.
It changes the automation surface.
For the last decade, enterprise automation made one quiet assumption: the system had to expose a clean interface before machines could help. API first. Connector first. Workflow first. If the vendor did not cooperate, the work stayed human. If the internal tool was too old, too weird, or too politically owned, the work stayed human. If the process lived across a dashboard, a spreadsheet, a PDF, a ticket, and an approval screen, the work stayed human.
Codex points at a different bargain:
If the software has a screen, an agent may be able to operate it.
That is the reason this matters for The Dark Factory. Dark factories do not appear because every system becomes elegant. They appear when enough ugly work becomes executable.
The old automation stack was built around APIs, scripts, RPA, connectors, and data pipelines. All of that still matters. But the work that drains an operations team rarely sits in one clean endpoint. It sits in the long tail: vendor portals, exception queues, admin consoles, legacy ERPs, browser tabs, internal CRUD apps, Slack threads, email chains, spreadsheets, and human judgment calls that never made it into a workflow diagram.
That long tail is where automation has been weakest.
Now the interface itself is becoming executable.
OpenAI's GPT-5.4 release makes the model-level case. OpenAI says GPT-5.4 is its first general-purpose model with native computer-use capabilities, and reports 75.0% on OSWorld-Verified, above the reported 72.4% human baseline. We should treat that carefully because it is OpenAI's own benchmark framing, but the direction is clear: visual computer control is no longer a demo lane. It is becoming a product lane.
The video that kicked off this edition made a useful distinction: the model is the brain, but the product needs a body. That framing is right.
Claude and Codex are not just two coding tools racing on a leaderboard.
They are two different theories of the agent body.
Claude's body is more structured. Anthropic has Cowork, Claude Code, connectors, MCP, role-based controls, and a steadily expanding enterprise surface. Claude Cowork became generally available on macOS and Windows through Claude Desktop on April 9, 2026. Anthropic also introduced computer use in Cowork and Claude Code as a research preview in March, letting Claude open files, run developer tools, point, click, and navigate on screen.
The MCP thesis is powerful because it gives agents clean rails. MCP describes itself as an open-source standard for connecting AI applications to external systems, tools, data sources, and workflows. Anthropic's remote connector docs make the enterprise shape clear: Claude can connect to external tools and data sources through remote MCP, but those servers have to be reachable, configured, governed, maintained, and trusted.
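To make the "clean rails" idea concrete, here is a minimal sketch of what an MCP-style connector looks like on the server side, assuming the FastMCP helper from the official Python SDK. The expense-approval tool, its arguments, and the stubbed data are hypothetical placeholders for an internal finance system, not anything Anthropic ships.

```python
# Minimal sketch of an MCP tool server, assuming the official `mcp` Python SDK.
# The tool name, arguments, and backing data are illustrative placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("expense-portal")  # hypothetical internal connector name

@mcp.tool()
def get_pending_approvals(cost_center: str) -> list[dict]:
    """Return pending expense approvals for a cost center (stubbed here)."""
    # A real connector would call the finance system's API and enforce permissions.
    return [{"id": "EXP-1042", "cost_center": cost_center, "amount": 1250.00}]

if __name__ == "__main__":
    # Runs over stdio by default; remote connectors typically expose HTTP transports instead.
    mcp.run()
```

The point of the sketch is the bargain it implies: someone has to write, host, secure, and maintain that server before the agent gets its clean rail.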
That is a good architecture when the world cooperates.
But the enterprise world does not always cooperate.
Most companies are not short of elegant target architectures. They are short of operating leverage inside systems that were never designed for agents. They have one vendor portal from 2014. One finance approval tool everyone hates. One internal dashboard that only three people understand. One customer operations workflow that moves through five screens and two spreadsheets because the source systems disagree.
That is where Codex's body gets interesting.
OpenAI bought Software Applications Incorporated, maker of Sky, in October 2025. OpenAI described Sky as a Mac interface that understands what is on your screen and can take action in your apps. Nick Turley said Sky's deep macOS integration would help ChatGPT "get things done." Ari Weinstein described Sky as an AI experience that floats over the desktop to help people think and create.
That acquisition now reads less like a talent deal and more like a strategic clue. OpenAI is not waiting for every application to publish an agent interface. It is teaching the agent to use the human interface.
This is not a reason to declare one vendor the winner.
It is a reason to be more precise about where each pattern wins.
If the work is structured, permissioned, repeatable, and supported by good connectors, Claude plus MCP-style architecture can be cleaner. It gives teams better boundaries. It gives security teams objects to inspect. It gives platform teams something closer to normal software engineering.
If the work lives across messy screens and weak integrations, Codex-style computer use has the reach advantage. It can work where no API exists. It can drive the same surface the operator drives. It can move across the long tail without waiting for the vendor roadmap.
That is the operating question leaders should ask:
Does this workflow need a cleaner integration, or does it need a better operator?
In Edition 3, we argued that skills were becoming the new infrastructure because small, portable bundles of procedure were starting to replace tribal knowledge. In Edition 6, we argued that agents need protocols before they can be trusted with real work. In Edition 7, we argued that agents are a data-shaped problem. This edition sits on top of all three.
Computer use gives agents reach. Skills give them procedure. Protocols give them control. Data gives them truth.
You need all four.
Governance sidebar
Chronicle, memory, and screen-aware context are not just convenience features. They are governance events.
If an agent can see the screen, learn the workflow, remember context, and operate apps, then we need to answer a harder set of questions. What can it see? What can it store? What can it act on? Which actions need confirmation? Which apps are blocked? Which memories are inspectable? Which workflows are logged? Which failures are reversible?
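One way to turn those questions into something enforceable is to write them down as a policy object the agent runtime checks before every action. This is a hedged sketch under assumed field names and rules, not any vendor's API.

```python
# Hypothetical governance policy for a screen-operating agent.
# Field names, defaults, and rules are illustrative, not tied to any product.
from dataclasses import dataclass, field

@dataclass
class AgentGovernancePolicy:
    blocked_apps: set[str] = field(default_factory=lambda: {"1Password", "Banking"})
    redact_on_screen: set[str] = field(default_factory=lambda: {"ssn", "card_number"})
    memory_retention_days: int = 30
    confirm_actions: set[str] = field(
        default_factory=lambda: {"send_email", "submit_payment", "delete_record"}
    )
    log_every_action: bool = True

    def requires_confirmation(self, action: str) -> bool:
        """Actions touching money, customers, or irreversible state need a human click."""
        return action in self.confirm_actions

    def may_open(self, app_name: str) -> bool:
        """Block listed apps entirely; everything else is allowed but logged."""
        return app_name not in self.blocked_apps


policy = AgentGovernancePolicy()
assert policy.requires_confirmation("submit_payment")
assert not policy.may_open("1Password")
```

The specifics will differ by company; what matters is that the answers exist as configuration someone owns, not as assumptions nobody wrote down.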
Gartner's warning is useful here. Gartner predicts more than 40% of agentic AI projects will be canceled by the end of 2027 because of cost, unclear business value, or inadequate risk controls. That is not a reason to stop. It is a reason to stop confusing access with readiness.
The Mumbai dabbawala analogy fits better than a software diagram. The magic is not just that lunch moves across the city. The magic is the routing, handoff, error correction, local knowledge, and accountability. Without that system, you just have people carrying boxes.
Without governance, screen-operating agents are just cursor movement at scale.
Gartner also predicts 40% of enterprise applications will include task-specific AI agents by the end of 2026, up from less than 5% in 2025. Anushree Verma at Gartner said AI agents will move from task and application-specific agents toward agentic ecosystems. That forecast matters because the enterprise will not adopt one agent pattern. It will adopt many.
Some agents will live inside apps. Some will live in connectors. Some will live in the browser. Some will operate the desktop. Some will wake up on events. Some will work from memories. Some will be tightly scoped. Some will be dangerously broad unless we design the operating model around them.
The dark factory is not one tool.
It is the orchestration of all those bodies around real work.
Anthropic's own Claude Code momentum proves the demand is real. In its Bun acquisition announcement, Anthropic said Claude Code reached $1 billion in run-rate revenue in six months. Mike Krieger described Bun as the kind of technical excellence Anthropic wants to bring into its infrastructure. That is not a side market. That is the market telling us developer and operator workflows are becoming agentic faster than traditional enterprise planning cycles can process.
So the question is not whether Codex or Claude is smarter.
That is the shallow debate.
The better question is: which body can reach the work, and what controls are wrapped around it?
If the work is already structured, use structured rails. If the work lives in the long tail, test computer use. If the work touches money, customers, regulated data, or reputation, build confirmation and audit before scale. If the workflow cannot be measured, do not automate it yet.
The interface became the API.
Now we have to decide which parts of the enterprise are ready to be operated by something other than a person.
The practical test I would run first: pick one workflow that touches 3 or more screens, has low regulatory exposure, has a clear success metric, and currently burns operator time. That is where screen-operating agents become real fastest.
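If you want to run that test systematically rather than by gut feel, a simple triage filter over a workflow inventory is enough. The sketch below mirrors the criteria above; the thresholds, field names, and example workflows are assumptions for illustration, not a standard.

```python
# Simple triage filter for picking the first screen-operating-agent pilot.
# Thresholds, fields, and example workflows are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Workflow:
    name: str
    screens_touched: int
    regulatory_exposure: str      # "low" | "medium" | "high"
    has_success_metric: bool
    operator_hours_per_week: float

def is_good_first_pilot(w: Workflow) -> bool:
    """Mirror the practical test: 3+ screens, low risk, measurable, and costly today."""
    return (
        w.screens_touched >= 3
        and w.regulatory_exposure == "low"
        and w.has_success_metric
        and w.operator_hours_per_week >= 5  # assumed threshold for "burns operator time"
    )

inventory = [
    Workflow("vendor portal reconciliation", 5, "low", True, 12.0),
    Workflow("customer refund approvals", 4, "high", True, 8.0),
]
pilots = [w.name for w in inventory if is_good_first_pilot(w)]
print(pilots)  # ['vendor portal reconciliation']
```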