For the past two years, the AI-in-code narrative has been about creation: auto-complete, copilots, and agents that promise to ship apps in minutes. For the first time, the story is expanding to include repair. This week, Google DeepMind launched CodeMender, an autonomous AI agent that hunts down vulnerabilities, drafts patches, tests them, critiques itself, and submits fixes to open-source repos. In its early phase, CodeMender has upstreamed 72 security fixes - some in codebases spanning millions of lines, the kind of work that would take human teams months. In other words: we’re teaching machines not just to write, but to atone.

Historically - by which I mean, like, last year - cybersecurity was a human sport: a contest of builders and breakers, patchers and penetrators. Now, both sides are automating.

- Attackers fine-tune LLMs to find zero-days, turning them into exploit copilots.
- Defenders deploy repair agents to find and fix them.

The result is an arms race between autonomous systems, unfolding at speeds far beyond human review cycles. Imagine the future: bugs and fixes flying past each other in the night, too fast for any human to follow. Security as algorithmic speed chess.

And that sets up the deeper question CodeMender raises: what happens when software starts fixing itself? If an AI can autonomously detect and patch vulnerabilities, we edge toward self-healing infrastructure. But autonomy introduces new fragilities:

▪️ Adversarial corruption. An attacker could poison the model’s feedback loop, tricking its “critique agents” into approving malicious code. The line between “defender” and “attack surface” is one bad update away.
▪️ Human deskilling. Overreliance breeds amnesia: “It’s fine, CodeMender will fix it” is a dangerous cultural default.
▪️ Accountability black holes. If an AI-generated patch breaks production or causes a breach, who holds the bag - the developer, the model, or Google? Your Chief Risk Officer wants to know.

And yet, doing nothing isn’t safer. We are already drowning in insecure code - much of it written by humans on deadlines and LLMs on vibes. The attack surface has outgrown human capacity to defend it.

CodeMender represents more than automated patching. It’s a prototype for reflexive software - systems that monitor and adapt their own health. It works in two ways:

→ Reactively, patching known vulnerabilities before they’re exploited.
→ Proactively, refactoring brittle code to eliminate entire classes of vulnerabilities before they occur.

That’s not just “AI for cybersecurity.” That’s AI as immune system - a distributed intelligence layer quietly testing, healing, and hardening the world’s codebase. Autonomy in generation led us to creation at scale. Autonomy in repair might just lead us to resilience at scale. In an age where more software is written by models than by people, self-healing becomes survival - the only way to keep the lights on in a digital world built faster than it can be understood.
Trends in Code Review and Bug Fixing
Explore top LinkedIn content from expert professionals.
Summary
Trends in code review and bug fixing show how new tools, especially AI-driven software, are reshaping the way programmers find and fix errors in code. Code review is the process where developers check each other's work to spot bugs before software is released, while bug fixing is the act of correcting those errors. Recent advances show that AI is speeding up both tasks, but also introducing new challenges around code quality, security, and accountability.
- Prioritize human oversight: Always include human review and judgment in the workflow, since AI-generated fixes can miss subtle errors or introduce new bugs.
- Adopt continuous testing: Use automated tests and regular scanning to catch logic errors and vulnerabilities early, especially as AI speeds up code changes.
- Build review culture: Encourage developers to share knowledge and stay alert for issues in AI-generated code by maintaining strong team communication and clear review responsibilities.
-
Your AI agent just pushed 47 security patches. How many did you actually review?

Google DeepMind launched CodeMender last month. OpenAI followed with Aardvark. Both promise to identify and fix vulnerabilities autonomously, but there are key architectural differences between the two.

CodeMender combines static analysis, fuzzing, SMT solvers, and LLM reasoning. It validates fixes through differential testing before any human sees them (see the sketch after this post). DeepMind reports 72 accepted patches across open-source projects.

Aardvark takes a different path. It's LLM-first. The agent threat-models your repo, scans commits, validates exploitability in a sandbox, then generates patches. OpenAI claims 92% recall on test repos and 10 disclosed CVEs.

Both sound great until you think about what they're actually doing. These agents write code probabilistically. They generate fixes based on learned patterns, not deterministic logic. You get speed. You get coverage. But you also get vibe coding at scale.

Anyone who's ever vibe-coded knows that when you use AI to fix errors, new bugs often emerge and previously fixed bugs magically reappear. And they aren't always obvious. It's subtle logic errors that pass your CI because the agent wrote tests that match its own flawed assumptions. It's the gap between "this looks right" and "this is provably right."

Program analysis can verify properties. Fuzzing can stress edge cases. But an LLM? It's guessing with high confidence. CodeMender layers validation on top of generation. That's better. But both tools still rely on probabilistic code synthesis, and both require human review as the last line of defense. Humans can't keep pace with autonomous agents. Not at scale.

You want deterministic verification for code that patches security vulnerabilities. Anything less adds more security debt to the pile.

The question isn't whether these tools are useful. They are. The question is whether your organization has the testing rigor to catch what they miss.

Do you trust probabilistic code generation to patch your production vulnerabilities?

👉 Follow for more AI and cybersecurity insights with the occasional rant

#AIgovernance #cybersecurity #AppSec #VibeCoding
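The post above credits CodeMender with validating fixes through differential testing before a human sees them. DeepMind hasn't published its harness here, but the core idea is simple enough to sketch. A minimal, hypothetical Python version - `original`, `patched`, and the bug at n == 0 are all made up for illustration:

```python
"""Differential testing, minimally sketched (not DeepMind's actual harness).

Toy setup: `original` has a known bug at n == 0; `patched` claims to fix
exactly that case. The test checks the patch changes behavior ONLY where
the bug lives.
"""

def original(n: int) -> int:
    if n == 0:
        return 1   # bug: should return 0
    return n * 2

def patched(n: int) -> int:
    if n == 0:
        return 0   # proposed fix
    return n * 2

BUGGY_INPUTS = {0}  # inputs where behavior is *expected* to change

def differential_test() -> None:
    """Compare original and patched code across a sweep of inputs."""
    for n in range(-1_000, 1_001):
        before, after = original(n), patched(n)
        if n in BUGGY_INPUTS:
            assert before != after, f"patch had no effect on buggy input {n}"
        else:
            assert before == after, f"patch changed behavior at {n}: {before} -> {after}"
    print("differential test passed: fix is surgical")

if __name__ == "__main__":
    differential_test()
```

In a real pipeline the "known buggy inputs" would come from a reproducing exploit or fuzzer findings rather than a hard-coded set, but the pass/fail logic is the same.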
-
Pull requests. Over the years I've discussed this topic more times than I can count: on LinkedIn, in workshops, in coaching sessions, in heated debates over coffee and Slack threads. Each time, I found myself reaching for the same studies, the same quotes, the same data points scattered across dozens of bookmarks, articles, and conference notes. Deming on inspection. The Microsoft Research papers. The DORA data. Martin Fowler, Dave Farley, Dragan Stepanović, Charity Majors.

It was time to put it all together in one place. So I did.

My new article on Substack (link in the first comment) brings together peer-reviewed academic research, large-scale industry data from DORA (36,000+ professionals), and a growing practitioner consensus to make the case that asynchronous pull request workflows (code sitting in a queue for hours or days waiting for review) are an antipattern for private software teams.

The evidence is consistent: code review catches far fewer bugs than we think (less than 15% of review comments relate to actual defects), waiting time constitutes 86-99% of a change's total lead time, and the process incentivises exactly the large batches and context switching that destroy flow.

The article traces the origin of pull requests back to open source, where they were designed to accept contributions from untrusted strangers. When teams of trusted colleagues adopt the same model, they import a trust assumption that doesn't match their reality. As Deming put it over 40 years ago: you cannot inspect quality into a product. You must build it in.

The article proposes an alternative I call T*D: the union of Test-Driven Development, Trunk-Based Development, and Team-Focused Development (pairing and ensemble programming). Together, these practices build quality in at every stage rather than inspecting it out at the end. And the transition doesn't have to be a leap of faith: it's a gradual journey from optimising your PRs, to Ship/Show/Ask, to pairing, to full continuous integration with continuous review.

Getting rid of Pull Requests is often mistaken for being reckless. The teams that ship the fastest, with the fewest defects, are not the ones with the most elaborate gating processes. They are the ones that invested in the practices, skills, and trust that make gating unnecessary.

#PullRequests #CodeReviews #SoftwareEngineering #SoftwareDevelopment
-
𝐀𝐈 𝐖𝐫𝐢𝐭𝐞𝐬 𝐌𝐨𝐫𝐞 𝐂𝐨𝐝𝐞 – 𝐀𝐧𝐝 𝐌𝐨𝐫𝐞 𝐁𝐮𝐠𝐬: 𝐖𝐡𝐚𝐭 𝐭𝐡𝐞 𝐃𝐚𝐭𝐚 𝐀𝐜𝐭𝐮𝐚𝐥𝐥𝐲 𝐒𝐡𝐨𝐰𝐬

AI-generated code is accelerating software delivery but also shipping significantly more defects than human-written code, especially around logic, security, and performance. It shifts developer focus from typing code to reviewing, testing, and governing AI output. As teams rush to adopt AI coding assistants, a new #CodeRabbit report highlights a clear trade-off: more code and faster drafts, but also more issues, deeper security risks, and heavier review loads.

🔹 𝐊𝐞𝐲 𝐟𝐢𝐧𝐝𝐢𝐧𝐠𝐬

👉 𝐈𝐬𝐬𝐮𝐞 𝐯𝐨𝐥𝐮𝐦𝐞
▪ AI-generated pull requests average 10.83 issues vs 6.45 for human PRs (around 1.7x more).
▪ AI-authored PRs also include 1.4x more critical issues and 1.7x more major issues.

👉 𝐃𝐞𝐟𝐞𝐜𝐭 𝐜𝐚𝐭𝐞𝐠𝐨𝐫𝐢𝐞𝐬
▪ Logic and correctness errors appear about 1.75x more often in AI-generated code.
▪ Code quality and maintainability issues are 1.64x higher, with readability problems increasing more than 3x in some analyses.

👉 𝐒𝐞𝐜𝐮𝐫𝐢𝐭𝐲 𝐫𝐢𝐬𝐤𝐬
▪ Security vulnerabilities rise roughly 1.5–1.57x in AI-generated code.
▪ Common issues include improper password handling, insecure object references, XSS vulnerabilities, and insecure deserialization.

👉 𝐏𝐞𝐫𝐟𝐨𝐫𝐦𝐚𝐧𝐜𝐞 𝐚𝐧𝐝 𝐫𝐞𝐥𝐢𝐚𝐛𝐢𝐥𝐢𝐭𝐲
▪ Performance-related issues are around 1.42x more common, including inefficient I/O and suboptimal resource usage.
▪ These issues lengthen reviews and increase the chance that serious bugs slip into production.

👉 𝐖𝐡𝐞𝐫𝐞 𝐀𝐈 𝐡𝐞𝐥𝐩𝐬
▪ AI-generated code shows 1.76x fewer spelling errors and 1.32x fewer testability issues, improving surface-level polish.
▪ AI dramatically increases output volume, shifting human effort toward review, risk assessment, and higher-order design.

🔹 𝐓𝐚𝐤𝐞𝐚𝐰𝐚𝐲𝐬
▪ Treat AI as a force multiplier, not an autopilot: pair AI coding tools with strong code review culture, threat modeling, and CI/CD gates.
▪ Invest in governance: enforce linters, formatters, security scanners, and explicit AI usage policies to catch AI-specific failure modes early (a minimal gate sketch follows this post).
▪ Upskill teams: train developers to recognize typical AI mistakes in logic, security, and performance, and to design prompts that incorporate business rules and architectural constraints.

AI coding tools are here to stay, but this research is a reminder that speed without guardrails quickly turns into risk. The competitive advantage will belong to teams that combine AI-assisted generation with disciplined practices, rigorous review, and a security-first mindset from day one.

𝐒𝐨𝐮𝐫𝐜𝐞/𝐂𝐫𝐞𝐝𝐢𝐭: https://lnkd.in/g9ctpXDf https://lnkd.in/g7AUt2Kq

#AI #AgenticAI #DigitalTransformation #GenerativeAI #GenAI #Innovation #ArtificialIntelligence #ML #ThoughtLeadership #NiteshRastogiInsights
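One way to act on the "invest in governance" takeaway: wire the scanners into a hard gate rather than a dashboard. A minimal sketch, assuming ruff as the linter and bandit as the security scanner with code under src/ (swap in whatever tools your team has standardized on):

```python
"""Minimal CI gate: fail the build if lint or security findings appear.

A sketch only: assumes `ruff` (linter) and `bandit` (security scanner)
are installed, and that source code lives under src/.
"""
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],          # style and correctness lints
    ["bandit", "-r", "src", "-q"],   # common security issues
]

def main() -> int:
    failed = False
    for cmd in CHECKS:
        # Each tool exits non-zero when it finds problems.
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"gate failed: {' '.join(cmd)}", file=sys.stderr)
            failed = True
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(main())
```

Run it as a required CI step so AI-generated PRs hit the same bar as human ones before a reviewer spends any time on them.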
-
Line-by-line code review is finally dead.

It's been a zombie for years - everyone's always hated doing it - but AI coding tools have exploded the number of PRs and finally made it impossible. Every org we work with bumps into this really quickly (see the attached chart). They start doing more AI coding and it breaks their processes, because suddenly each dev is creating a few PRs each day but no-one wants to review them. (Also the PRs are often bigger, but that's a topic for another day.)

No-one signs up to be a coder because they love reading someone else's code, so no-one's motivated to spend half their day reviewing. So the review backlog grows and grows. This is a problem, but the solution isn't to shout at devs to review more. Rethink code review.

The point of review is:
1) find bugs
2) make sure the code is consistent with the rest of the repo
3) check decisions with future implications
4) communicate the changes to other devs
5) have two sets of eyes so ISO27001/SOC2 auditors are happy

(1) and (2) are now best done NOT by humans, but by automated tools including LLMs, probably multiple of them (see the sketch after this post). (3) and (4) are important and should 100% still be done. (5) is still valid.

Here's the coding/review process we see working best right now:
* developer plans with the agent
* developer reviews the plan
* agent implements
* developer reviews all the code, especially the tests/acceptance criteria (sometimes get the AI to write the tests first to make this easier/parallelisable)
* that review can happen locally or in a draft PR - best is usually in a PR, because then CI can be running in parallel
* the agent watches CI for failures and watches automated review feedback, triaging and fixing eagerly
* then finally “Ready for Review”
* only at this point does another developer act as reviewer, but they don’t review line-by-line, because that’s been done by multiple agents
* the developer needs to understand the goal, then review schema or infra changes, and review the new tests at least at principle level
* the most important thing they need to review: decisions made in this PR which might have ramifications
* what you’re looking for here is things that might have security implications, scalability implications, non-functional-requirements implications, etc.

Your tools should be surfacing these decisions so that a developer can assess their implications. Yes, that's what we're building with Cadence - checking the code and the session log in parallel to understand and surface decisions - but use something else if that's better for you.

But fundamentally it's time to rethink code review. Your devs will thank you for it!
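For points (1) and (2), an automated first pass can be as small as piping the diff to a model before any human looks. A rough sketch - not Cadence's implementation; the OpenAI client, model name, and prompt are illustrative assumptions:

```python
"""First-pass automated review: send the branch diff to an LLM.

Sketch under assumptions: uses the OpenAI Python client with the
OPENAI_API_KEY environment variable set; the model name and prompt
are placeholders, not recommendations.
"""
import subprocess
from openai import OpenAI

client = OpenAI()

def first_pass_review(base: str = "main") -> str:
    # Diff the current branch against the base branch.
    diff = subprocess.run(
        ["git", "diff", base, "--", "."],
        capture_output=True, text=True, check=True,
    ).stdout
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {"role": "system", "content": (
                "You are a code reviewer. Flag likely bugs and code that is "
                "inconsistent with the rest of the repo. Be terse; one finding per line."
            )},
            {"role": "user", "content": diff},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(first_pass_review())
```

In practice you'd run several such checkers (different models, different prompts) and post the findings to the draft PR so the agent can triage them before "Ready for Review".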
-
Interesting experiment I’ve been running lately that’s paying off way more than I expected on a few projects.

Whenever you notice a recurring “pattern” bug, the kind that keeps getting reintroduced in slightly different ways, don’t just fix it and move on. Turn that bug into a Claude Code skill whose only job is to watch for that exact failure mode showing up again.

So the workflow is basically:
1. Bug shows up (again)
2. You identify the shape of it (“this happens when X and Y drift out of sync”)
3. You write a small Claude Code skill that reviews changes and flags anything that looks like that pattern being introduced
4. You run it as a backstop (PR review, pre-commit, CI… wherever it fits; see the sketch after this post)

A concrete example: we hit an issue where we had a mismatched validator between Convex and a Vite front end. Page goes blank with a super unhelpful failure mode. It’s the exact kind of bug that slips through because it’s not a “syntax error” or obvious type error, it’s a contract drift problem.

Obviously, you can (and should) have Claude Code skills baked into the build process to keep things aligned in general. But what’s been awesome is creating these specific “bug-pattern” skills as an extra layer. It’s like taking the pain you just experienced and converting it into a permanent guardrail.

The part I like most: it scales your learning. Every time your team hits one of these weird edge cases, you’re not just solving it, you’re encoding it. Over time you end up with this small library of “things that used to bite us” checks, and regressions drop off fast.

It’s also a great way to stop relying on tribal knowledge like: “Oh yeah, don’t forget that one Convex validator thing or the UI will silently explode.” Now it’s automated. No heroics required.

Curious if anyone else is doing this?
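Claude Code skills themselves are markdown instruction files, but the same "encode the bug shape" idea also works as a plain pre-commit backstop. A hypothetical sketch: the regex and the shared/types path are stand-ins for your own drift pattern, not the author's actual Convex check:

```python
"""Pre-commit backstop that flags a known bug pattern in staged changes.

Sketch only: the pattern below is a placeholder standing in for whatever
failure shape your team keeps reintroducing, e.g. a validator field added
on the backend without a matching change to shared frontend types.
"""
import re
import subprocess
import sys

# Hypothetical bug shape: an added line introducing a validator field.
VALIDATOR_LINE = re.compile(r"^\+.*\bv\.(string|number|boolean|object)\(")

def staged_diff() -> str:
    """Return the diff of what is about to be committed."""
    return subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout

def main() -> int:
    diff = staged_diff()
    touched_validators = any(VALIDATOR_LINE.match(line) for line in diff.splitlines())
    touched_shared_types = "shared/types" in diff  # placeholder path
    if touched_validators and not touched_shared_types:
        print(
            "backstop: validator changed without touching shared/types - "
            "check for contract drift before committing",
            file=sys.stderr,
        )
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Each new "thing that used to bite us" becomes another small check like this, so the library of guardrails grows with the team's scar tissue.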
-
We’ve seen customers experience this pattern: teams ask an AI agent to fix a bug, and the agent refactors three helper functions, adds defensive null checks everywhere, and rewrites code that worked fine. The core problem is that devs and the agent aren't working with the same boundary between what to fix and what to leave alone.

We built Kiro's bug-fixing workflow around something we call property-aware code evolution. Every bug fix has dual intent: fix the buggy behavior surgically, preserve everything else. But how does this work in practice? How does Kiro know which is which?

Kiro first proposes a bug condition—the scenarios it believes trigger the bug—and the postcondition—what should happen instead if we didn’t have a bug. Based on this, Kiro creates two testable properties: the fix property, which checks if the fixed code works correctly on buggy inputs, and the preservation property, which ensures behavior is preserved everywhere else. You can iterate with Kiro over both properties until you’re comfortable with the agent’s hypothesis.

Once that’s in place, Kiro first tests both properties against the unfixed code. Fix-property tests should fail, reproducing the bug exactly where predicted. Preservation tests should pass, capturing baseline behavior for the non-buggy scenarios. After gathering these results on the unfixed code, Kiro applies a fix and retests both properties. If the fix worked, both kinds of property tests should now pass, letting us know that we fixed the bug without changing anything else.

Because all this is backed by property-based tests, Kiro generates and tests hundreds of variations that cover many edge cases to narrow down the problem and test the fix comprehensively. This approach gives teams the confidence to let Kiro work more autonomously without sacrificing understanding of what it’s doing to solve the problem.

Our team dives into property-aware code evolution in this blog. Learn how to use agents to fix complex bugs more reliably with Kiro ➡️ https://lnkd.in/gWZkBcVX
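The fix/preservation split maps directly onto ordinary property-based testing. A toy sketch with Python's hypothesis library - not Kiro's internals - where the bug condition is n < 0 and the postcondition is "negatives return 0":

```python
"""Fix vs. preservation properties, sketched with hypothesis.

Toy setup (not Kiro's machinery): `buggy` mishandles negative inputs;
`fixed` is the proposed patch. Bug condition: n < 0.
"""
from hypothesis import given, strategies as st

def buggy(n: int) -> int:
    return abs(n) + 1        # bug: negatives should map to 0

def fixed(n: int) -> int:
    return 0 if n < 0 else n + 1

# Fix property: on inputs satisfying the bug condition, the patched
# code meets the postcondition (negatives return 0).
@given(st.integers(max_value=-1))
def test_fix_property(n: int) -> None:
    assert fixed(n) == 0

# Preservation property: everywhere the bug condition does NOT hold,
# the patch agrees exactly with the original code.
@given(st.integers(min_value=0))
def test_preservation_property(n: int) -> None:
    assert fixed(n) == buggy(n)
```

Run the fix property against `buggy` first and it fails, reproducing the bug; run both properties against `fixed` and they pass - exactly the before/after protocol the post describes, with hypothesis generating the hundreds of input variations.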
-
We ran a retrospective specifically on our PR review process. Asked the team one question: "What part of code review do you find least valuable?"

Every single person said some version of the same thing: chasing down AI findings that turned out to be nothing. Or worse - spending an hour proving a real bug was real, only to fix it in 10 minutes. The time ratio was backwards. More time proving than fixing.

We mapped it out. The pattern was consistent: AI flags issue → developer cannot reproduce → developer deprioritizes → finding sits → sometimes a bug ships. Nobody on the team was being lazy. The incentives were just wrong. If verification is expensive and uncertain, rational developers save it for when they have slack. Which is never.

CodeAnt AI just launched Steps of Reproduction inside PRs. The finding comes with the trigger conditions, the execution path, the proof. Verification goes from "30-minute investigation" to "2-minute confirmation."

The retrospective basically wrote the product roadmap for them. Every complaint pointed to the same gap. They closed it.
-
In the age of AI-native development, code review is growing to be the most important part of generative AI workflows. As models improve, the workload we hand off to agents is increasing, but this trend has us losing valuable context in our codebases.

The real power of AI code review isn't just catching bugs faster; it's preserving institutional knowledge that would otherwise vanish. When an agent generates code, the review process becomes your team's opportunity to understand why decisions were made, what patterns are emerging, and how different parts of your system connect. Without this review step, you're essentially accepting black-box contributions.

My approach to AI code review treats these interactions as knowledge transfer opportunities. Whether the code comes from your teammates or from agents, the review workflow ensures context flows both ways: the AI learns your standards, and your team learns what the AI is building. This is especially critical as we move toward more autonomous coding agents that can handle entire features.

The shift from "AI writes code faster" to "AI helps us review and understand code at scale" might be the most important evolution in developer tooling right now.

#AICodeReview #DeveloperExperience #OpenSource
-
Open source repos are giving us a preview of what's coming for enterprise teams, and most engineering leaders aren't watching.

Jazzband - a Python collective that ran for over 10 years - shut down this year. The lead maintainer cited the unsustainable volume of AI-generated spam PRs as the primary driver.

curl canceled its bug bounty program in January. Daniel Stenberg received 20 AI-generated security reports in the first 21 days of 2026. None identified an actual vulnerability! His response: the slop reports "hamper our will to live."

The core problem isn't that AI writes bad code. It's throughput asymmetry. Coding agents can generate 5-6 pull requests per developer per day, but code review hasn't gotten faster - at least not nearly as much. And code review is more important than ever. CodeRabbit analyzed 470 open-source PRs and found 1.7x more issues in AI-co-authored code than in human-written code. Agoda found experienced developers were actually 19% slower with AI tools, due to "comprehension debt" - understanding less of their own codebases over time as AI-generated code accumulates.

The open source community absorbs this first because anyone can submit a PR. Enterprise teams have more control, but here's what they don't have: visibility. An agent-generated PR from someone who can't explain the code looks identical in your review queue to a carefully crafted change from a senior engineer.

The teams getting this right have moved validation earlier in the process. Not more AI-assisted review on top of broken code - validation baked into the development loop itself. Stripe's internal harness does this automatically across 1,300 PRs per week. That infrastructure investment is what separates the teams that actually benefit from coding agents from the ones that end up burning out the engineers tasked with reviewing agent output.

Trust me, I have seen the burnout firsthand.