⚠️ Addictive tech warning for developers. Once you add a 🦆rubber duck to your AI agent pipeline, you’ll start feeling uncomfortable without it.

This is exactly what happened to me. I no longer want to rely on a single model’s opinion for important technical decisions, and I definitely don’t want extra manual steps just to get a second perspective. That’s where “Rubber Duck”, an experimental feature in the GitHub Copilot CLI, really worked for me:

- Enable it with "copilot --experimental" (Rubber Duck is the 1000th reason for you to switch to terminal-first development)
- Watch one LLM actively criticise another’s decisions right at the moments where it matters most, pushing towards a better solution
- Everything happens automatically: no extra friction, no context switching

It is a targeted reviewer that steps in at high-value moments, such as after drafting a plan, after a complex implementation, and after writing tests but before running them. That feels like a very practical way to reduce compounding errors early, especially in long-running or multi-file tasks.

So having AI challenge AI has quietly become part of how I build now.

Would you trust critical technical decisions to a single model, or is multi-model critique the new baseline for serious AI-assisted development?

Ready to try Rubber Duck? I warned you :)

More details: https://msft.it/6044Q4Zs2

Morten Stange Bye, Haakon Hasli, Christian Tryti, Else Tefre, Francesco Manni, Jaime De Mora, Martin Woodward, Lee Stott, Christoffer Noring, Daniel Meppiel, Joel Norman, Ömür Sert, Adil I., Sebastien Le Calvez, 🥑 Aaron Powell, Nick McKenna, Burke Holland, Cornelia Bjørke-Hill

#GitHubCopilot #GitHubCopilotCLI #CopilotCLI #DeveloperTools #AIAgents #CopilotRubberDuck #msftadvocate
I would trust critical technical decisions to experienced senior developers, perhaps with the help of AI suggestions. AI makes so many simple, basic mistakes and opens so many loopholes and security concerns that I wouldn't let it make decisions. I would only let it suggest, and trusted developers should then decide.
GitHub Copilot PR reviews are the best; it's a stickler and is always throwing a smackdown at the author (Claude or Codex) 😅
Brilliant use of the adversarial agent pattern! Having one model automatically critique another is exactly like a senior engineer reviewing a PR before it ever reaches production.
It appends every prompt for me now. No going back!
A good starting point, but I am curious whether "rubber duck" pulls a lot of the codebase into its context before critiquing the coding model's solution. If the project is a complicated mess (as is the case most of the time in the real world), it is very probable that it won't catch the subtle issues and will just consume more tokens (because you have less of a human-in-the-loop and more of AI arguing with AI). I believe this is a nice idea, though, but for it to work properly, agents must be able to access a complete, compact, and "digestible" database of understanding for the codebase. If you know of any existing solutions that actually work, or have any ideas on this, let me know. Thanks for the tip!