Agent-Driven Development Needs Agent-Ready Standards
I believe process standards will need to be rethought for agent-driven development. Every organization that develops software follows some kind of process standard. Whether that is ISO 13485 in medical devices, ISO 26262 in automotive, SOC 2 in cloud services, or an internal quality framework, the assumption is the same: humans do the work, and the process is designed around how humans coordinate, review, and document. When AI agents take over a growing share of that work, the process standards do not become irrelevant. But many of the activities they prescribe change form in ways that the standards themselves do not yet account for. These are my thoughts on a topic that I believe needs to be addressed and for which, to my knowledge, no established answers exist yet.
In Post 3, Risk Controls Agent Autonomy, I proposed a systematic approach to defining agent autonomy at the component level. That post addresses how agents work within a project. This one looks at the organization around it.
Process Standards Were Built for Human Tempo
Process standards govern how organizations develop products. In medical devices, ISO 13485 defines the quality management system. As of February 2026, the FDA has harmonized 21 CFR 820 with ISO 13485 through the new QMSR, with a small number of additional FDA-specific requirements. Other industries have their equivalents, from ISO 9001 as a general baseline to industry-specific frameworks in automotive, aerospace, and financial services. All of them share a common assumption: humans perform the work at human speed. Document control, design reviews, corrective actions, supplier management, and dozens of other processes are built around that assumption.
When an agent framework handles the bulk of development, many of these processes do not disappear. They change form. Document control becomes a property of the framework itself. Every artifact is versioned, traceable, and stored in structured formats. The framework does not need a separate document control procedure because it cannot produce uncontrolled documents. CAPA cycles, traditionally measured in weeks, compress into hours when review agents identify issues and remediation agents address them in the same pipeline run. Design reviews shift from scheduled meetings to continuous validation by specialized agents running after every change. Software supplier management looks different when code is a regenerable artifact and the relevant "supplier" is the agent framework and the model provider behind it.
What remains, and becomes more important, is the set of decisions that no agent should make. Defining the intended use of a product is a human responsibility. Determining acceptable risk is a human judgment. Clinical evaluation requires domain expertise that agents support but do not replace. Post-market surveillance and vigilance remain organizational obligations that require human accountability. The validation of the agent framework itself becomes a critical activity. How do you qualify the tool that builds everything else? This is a meta-question that current standards do not appear to address yet.
I believe a norm for an agent-driven organization would be significantly shorter than current process standards. Much of what they prescribe becomes inherent to the framework. What the norm would need to require is the qualification and ongoing validation of the agent framework, explicit definition of where human decisions are mandatory, traceability from risk assessment through to verification, and evidence that agents operated within their defined autonomy boundaries. The core of a future norm could reduce to three requirements: qualify your framework, define the risk boundaries, and demonstrate that the output meets the specifications.
This leads to a thought that I find compelling. If the norm defines what an agent framework for product development must contain, then the norm itself could be published in an agent-readable format. A machine-readable standard that an agent framework can be validated against, and that a review agent can use to verify compliance. The framework description becomes the process documentation, and both development agents and review agents operate against the same machine-readable set of requirements. I am not suggesting this replaces human judgment in defining what standards should require. But once those requirements are defined, encoding them in a format that agents can process directly would close the loop between regulation, development, and review. That is a step further than where we are today, but the direction would be consistent with what is starting to happen on both sides.
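To make the idea concrete, here is a minimal sketch of what an agent-readable norm clause and a compliance check against it could look like. All field names, IDs, and evidence kinds are invented for illustration; no published standard defines this schema today.

```python
# Hypothetical sketch: norm requirements encoded as structured data,
# plus a trivial check a review agent could run against the evidence
# a framework provides. All names here are invented for illustration.

NORM_REQUIREMENTS = [
    {
        "id": "NORM-001",
        "text": "The agent framework must be qualified before use.",
        "evidence_kind": "framework_qualification_report",
    },
    {
        "id": "NORM-002",
        "text": "Human decision points must be explicitly defined.",
        "evidence_kind": "human_decision_register",
    },
]

def check_compliance(norm, framework_evidence):
    """Return the IDs of norm requirements that lack matching evidence."""
    provided = {e["kind"] for e in framework_evidence}
    return [r["id"] for r in norm if r["evidence_kind"] not in provided]

evidence = [{"kind": "framework_qualification_report", "uri": "reports/qual.yaml"}]
print(check_compliance(NORM_REQUIREMENTS, evidence))  # NORM-002 still lacks evidence
```

The point is not the ten lines of Python. It is that once the norm is structured data rather than narrative text, both the development framework and the review agent can execute the same check.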
When the Reviewer Is an Agent Too
This is not limited to the development side. The review side is changing as well. The FDA launched Project ELSA in 2025, a network of autonomous AI agents that analyze submissions, summarize adverse events, identify inspection targets, and accelerate clinical protocol reviews. By 2026, ELSA is being institutionalized across the agency. Dr. Andreas Purde from TÜV SÜD described in a recent Johner Institute podcast how notified bodies are adopting AI for their review processes, calling it an "epochal disruption." i-GENTIC AI launched context-aware MedTech agents in February 2026 that review 510(k) submissions for internal consistency, positioning them as a "24/7 digital FDA reviewer." On the compliance side, Zühlke and Confinis launched Beyond, a platform using agentic AI workflows for technical file verification, internal audits, and post-market surveillance.
When the reviewer on the other side is an agent, the submission needs to be agent-readable. A PDF with embedded tables and free-text descriptions is not what an AI review agent processes efficiently. Structured data, machine-readable schemas, and explicit traceability links are what review agents need to traverse and verify.
Three Levels for External Review
This is where my SVAD framework comes in. SVAD stands for Spec-Verified Agent Development, an agent-driven development framework where specifications are the durable assets, code is regenerable, and verification is performed by specialized AI agents against structured schemas. I have been building and refining it over the past months, and its architecture is designed to be ready for exactly this scenario. The framework operates on three levels that map to the two dimensions of regulatory review: process review and product review.
The first level is the framework description, written for agents. It explains how SVAD works, which agents exist and what their responsibilities are, which schemas define the artifact types, how the development process flows, and what quality gates are in place. This is not project documentation. It is a machine-readable explanation of the methodology itself, designed so that an external review agent can understand the rules before looking at any specific product. A human reviewer would ask their agent to explain it. The agent reads it directly.
The second level covers the project-specific configuration. How the framework was adapted for this particular product: which risk matrix applies, how autonomy levels were derived, which validators were activated, and any project-specific adjustments to the standard process. An external review agent reads this level to understand how the general framework was applied in this specific context.
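As a sketch of what this second level could contain, consider deriving a component's autonomy level from a project risk matrix. The level names and matrix entries below are invented for illustration; a real project configuration would carry the actual classifications from the risk analysis.

```python
# Hypothetical sketch of a project-level configuration: a risk matrix
# mapping (severity, probability) to an autonomy level per component.
# Level names and thresholds are invented for illustration.

RISK_MATRIX = {
    ("high", "high"): "human_approval_required",
    ("high", "low"): "agent_proposes_human_reviews",
    ("low", "high"): "agent_proposes_human_reviews",
    ("low", "low"): "agent_autonomous",
}

def autonomy_level(severity: str, probability: str) -> str:
    """Derive a component's autonomy level from its risk classification."""
    return RISK_MATRIX[(severity, probability)]

print(autonomy_level("high", "low"))  # agent_proposes_human_reviews
```

Because the derivation is explicit data rather than a paragraph in a plan, an external review agent can re-run it and confirm that every component's autonomy level actually follows from the declared matrix.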
The third level is the product data: requirements, architecture, test cases, risk analysis, and review findings, all stored in structured YAML schemas with explicit traceability links. The traceability graph connects product features through requirements to architecture components and tests. Each component carries an AI Support Capability attribute with its derivation from the risk matrix. Individual review findings are YAML artifacts with lifecycle tracking, and the review system also documents positive observations, confirmed strengths that show not just what was found but also what was verified to work well. An external review agent can traverse this graph and verify consistency, completeness, and compliance.
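A minimal sketch of the kind of traversal an external review agent could run over such a graph, assuming artifact shapes far simpler than SVAD's actual schemas:

```python
# Minimal sketch of a traceability check over structured artifacts.
# The artifact shapes and IDs are invented for illustration; real
# schemas would carry far more fields.

artifacts = {
    "FEAT-1": {"type": "feature", "traces_to": ["REQ-1"]},
    "REQ-1":  {"type": "requirement", "traces_to": ["ARCH-1", "TEST-1"]},
    "ARCH-1": {"type": "component", "traces_to": []},
    "TEST-1": {"type": "test", "traces_to": []},
}

def untested_requirements(graph):
    """Find requirements with no downstream test artifact."""
    return [
        aid for aid, a in graph.items()
        if a["type"] == "requirement"
        and not any(graph[t]["type"] == "test" for t in a["traces_to"])
    ]

print(untested_requirements(artifacts))  # [] -- every requirement is covered
```

A human reviewer samples a traceability matrix; an agent can check every edge of the graph on every run.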
The first two levels together serve the process review: did the team follow a valid, qualified development process? The third level serves the product review: does the product meet its specifications and regulatory requirements?
In practice, the entry point for an external review agent is a file I call process.md. It is a dedicated description of the development methodology, written specifically for agents, not a project file repurposed for this role. It tells the review agent where the framework description lives, where the project configuration lives, and where the product data lives. The review agent does not need to understand the full complexity of the product. It needs to know the structure, the rules, and where to look.
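To illustrate the idea of a single entry point, here is a hedged sketch of how a review agent could resolve it. The keys and paths are invented for illustration and are not SVAD's actual file layout:

```python
# Hypothetical sketch: the entry-point file tells a review agent where
# each of the three levels lives. Keys and paths are invented examples.

PROCESS_MD = """\
# Development Process Entry Point
framework_description: framework/svad.md
project_configuration: config/project.yaml
product_data: artifacts/
"""

def entry_points(text):
    """Parse 'key: value' lines into a location map for the review agent."""
    locations = {}
    for line in text.splitlines():
        if ": " in line and not line.startswith("#"):
            key, _, value = line.partition(": ")
            locations[key.strip()] = value.strip()
    return locations

print(entry_points(PROCESS_MD)["product_data"])  # artifacts/
```

The review agent starts here, follows the pointers, and never needs to guess where the methodology, the configuration, or the product data lives.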
The convergence of agent-driven development and agent-driven review creates a new kind of interface between manufacturers and regulators. Both sides benefit from structured, machine-readable artifacts. Both sides benefit from explicit traceability. Both sides benefit from frameworks that encode their rules in formats that agents can process. I believe the manufacturers who build this infrastructure now will have a significant advantage when regulatory review becomes agent-driven at scale.
This is a transitional observation, like the AI Support Capability classification in Post 3. Today, submissions are still reviewed by humans with AI assistance. The shift toward fully agent-driven review will be gradual. But the direction seems clear to me, and the infrastructure requirements would be the same regardless of how fast the transition happens. Structured specifications, machine-readable schemas, and explicit traceability are valuable whether the reviewer is a human, an agent, or both.
There is a related question that I plan to explore separately. When an agent framework handles the bulk of execution, development teams change in composition, and the methodology changes with them. Two-week sprints and the rituals around them were designed for human coordination at human speed. Agent-driven development compresses timelines and shifts the work toward specification, risk decisions, and verification. That is a different angle from the process and regulatory perspective covered here, but equally important.
The next post returns to the main series: how eleven parallel AI validators implement something that looks a lot like the Swiss Cheese Model from aviation safety.
SVAD Series: Spec-Verified Agent Development
Part 1: Code Is a Disposable Artifact
Part 3: Risk Controls Agent Autonomy
Part 4: this article