Claude Code, Codex, and Cursor Are Merging into an AI Coding Stack Nobody Planned

Cursor, Claude Code, and Codex are merging into one AI coding stack nobody planned.

OpenAI just shipped an official plugin that runs inside Anthropic's Claude Code. Not a workaround. Not a community hack. An Apache 2.0-licensed plugin from OpenAI, installed directly into a competitor's terminal. The same week, Cursor 3 launched a rebuilt interface that treats the code editor as secondary: the default view is now an Agents Window for managing fleets of coding agents across repos and environments. Google's Antigravity reached the same conclusion with its Manager Surface.

I wrote about what this means for developers: https://lnkd.in/g8QgDDhh

Three layers are forming. Orchestration on top, where you manage and route agents. Execution in the middle, where coding agents write, test, and commit code. Review at the bottom, where a different model from a different provider challenges the code the first one wrote.

The interesting part is the review layer. When Claude writes code and Codex reviews it, you get independent scrutiny: different training data, different blind spots. You are no longer asking someone to grade their own homework.

Nobody designed this stack. Developers assembled it because no single tool covers everything. Claude for precision on complex refactors. Codex for throughput on parallel tasks. Cursor as the control plane on top.

We went through the same thing with infrastructure. Terraform, Docker, Kubernetes. Not one tool to rule them all. Composable layers that got better together.

Are you already running multiple coding agents in the same workflow, or still picking one and hoping it covers everything?

#AIcoding #DevTools #CodingAgents #SoftwareEngineering
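To make the review layer concrete, here is a minimal Python sketch of the pattern, assuming the official anthropic and openai SDKs with API keys in the environment. The model names, prompts, and task string are illustrative placeholders, not anything specified in the post or by either vendor.

```python
# Review layer sketch: one provider drafts a patch, a different provider
# critiques it. Model names below are placeholders; swap in whatever
# models your accounts actually expose.
import anthropic
from openai import OpenAI

anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
openai_client = OpenAI()                  # reads OPENAI_API_KEY


def write_code(task: str) -> str:
    """Execution layer: ask one vendor's model to draft a patch."""
    msg = anthropic_client.messages.create(
        model="claude-sonnet-4-5",        # placeholder model name
        max_tokens=2000,
        messages=[{"role": "user", "content": f"Write a patch for: {task}"}],
    )
    return msg.content[0].text


def review_code(patch: str) -> str:
    """Review layer: ask a different vendor's model to challenge the patch."""
    resp = openai_client.chat.completions.create(
        model="gpt-5-codex",              # placeholder model name
        messages=[{
            "role": "user",
            "content": "Review this patch for bugs, missed edge cases, "
                       "and risky assumptions:\n\n" + patch,
        }],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    patch = write_code("add retry logic to the HTTP client")  # hypothetical task
    print(review_code(patch))
```

The point of the design is simply that the reviewer never shares weights or training data with the writer, so its blind spots are less likely to overlap.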


Redundancy and adversarial review seem an interesting way to approach the reliability issue, Janakiram MSV, but at what cost? If multiple agents (or is it LLMs? Please do clarify) run within the same review layer, and the idea is "adversarial redundancy", with the output of one agent reviewed by another:

1. In any workflow execution driven by the orchestration layer, are all agents required to invoke the same single LLM throughout that workflow, or is there a choice?
2. If the output of Agent-A is reviewed by Agent-B and found "not satisfactory" (depending on the validations and guardrails), and Agent-C (from a different vendor) is available, do we ask Agent-C to review the output again, or does Agent-C get a chance to generate output afresh?
3. What about token usage, limits, and costs? This is by far the biggest factor deciding agent configuration and orchestration, putting a premium on how the workflow is broken up and on the extent of LLM utilisation.


The infrastructure parallel is perfect. Nobody "chose" Terraform + Docker + Kubernetes either; engineers assembled it because each tool solved a specific pain point better than any all-in-one solution.

My daily stack proves the same pattern: Claude Code for deep architectural refactors (it reads my CLAUDE.md files and respects module boundaries), Cursor for real-time IDE work where speed matters more than depth, and Codex for background parallel tasks. Three tools, three different cognitive modes.

The missing layer that ties it together: context engineering. CLAUDE.md files at the project root give every tool the same architectural context. Without that shared context layer, you're just running three disconnected AI tools. With it, they feel like one coherent system.

The next platform play will be whoever builds the "Kubernetes for AI coding": the orchestration layer that connects these tools with shared state.
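For anyone who hasn't used the convention: CLAUDE.md is an ordinary Markdown file at the repository root that Claude Code picks up as project context, and other tools can be pointed at the same file. A hypothetical minimal example follows; the module names and rules are made up for illustration, not taken from the commenter's project.

```markdown
# Project context (read by coding agents)

## Architecture
- `api/`   - HTTP service layer; handlers stay thin, business logic lives in `core/`
- `core/`  - pure domain logic; no framework imports allowed here
- `infra/` - Terraform and Dockerfiles; agents must not edit these without human review

## Conventions
- Run the test suite before proposing any commit
- Never add a new dependency without flagging it explicitly in the change description
```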


