Spec-driven development workflow - how to get production-ready code from AI

  • If agentic coders don't work for you the way you want them to.
  • Or if you wonder how it's possible that people claim to no longer write code, while what you see is far from production-ready.
  • Or if you wonder what code review should look like when the code was written by AI.

The Spec-Driven Development workflow might be the approach that helps. I built a plugin for Claude Code that I'd like to present, and I'd also like to share the Spec-Driven Development concept itself, which I find very useful for improving the quality of agentic engineering.

After 1.5 years building fully autonomous AI agents, I've learned that once you've sorted out tooling and architecture, context is the only lever left. With AI coders, context is the only thing we manage — and that's exactly what SDD helps improve.


Why agentic coders underperform

Let's name the three root causes of poor agentic coder performance:

  • Request is too short — when a prompt is short and ambiguous, it leaves room for LLM creativity that might not match your expectations
  • Request is too hard — when your request is too complex to deliver in one shot, the model gets confused as it might lack the required tooling or context to deliver it
  • Context is too big and messy — when context grows, model performance drops (https://claude.com/blog/1m-context-ga)

For small, simple tasks in small codebases, running plan mode, verifying the plan, and writing the code with AI might be sufficient. This doesn't scale to harder tasks and bigger codebases. The idea of Spec-Driven Development is to write a detailed spec for the agent that describes what we expect in full detail. Research supports this: detailed, executable specs reduce AI code errors by up to 50% (https://arxiv.org/abs/2602.00180) and security defects by 73% (https://arxiv.org/abs/2602.02584).


sddw: spec-driven development workflow for Claude Code

The idea of my plugin (sddw) is to use the agentic coder itself to help you write the spec, split the coding flow into multiple steps, and decompose a feature into atomic coding tasks.

If you've been building ML workflows, this should look familiar — think Argo Workflows or Kubeflow: you have a task, you split it into atomic steps in a workflow, and the steps read and write artifacts. That's exactly what sddw does, just for agentic coding with Claude Code.


The workflow

sddw consists of 4 steps:

  1. Write requirements — define user stories, functional requirements, acceptance criteria, constraints
  2. Analyse existing codebase (optional) — extract patterns, interfaces, conventions from the target codebase
  3. Design solution — decompose the feature into self-contained tasks with architecture, contracts, and data models
  4. Implement tasks — implement one task at a time following TDD
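
In Claude Code terms, a full run looks something like the session below. Only /sddw:requirements is named explicitly later in this article; the other command names are my shorthand inferred from the step names, so treat them as illustrative:

```
/sddw:requirements     # interactive Q&A -> requirements spec (artifact 1)
/clear                 # drop the accumulated context
/sddw:code-analysis    # optional: extract patterns from the codebase (artifact 2)
/clear
/sddw:tasks            # design + decomposition into atomic tasks (artifact 3)
/clear
/sddw:implement        # TDD implementation, one task per session
```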

(Workflow diagram: requirements → code-analysis → tasks → implement)

After every step you clear the context or start a fresh session.

After every step an artifact is created.

That artifact becomes the input to the next step.

This artifact-passing mechanism gives us modularity that is otherwise not possible with Claude Code today.
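
As a sketch, the artifacts can be thought of as plain files that each step reads and writes (the file names here are hypothetical, not the plugin's actual layout):

```
specs/
  requirements.md     # step 1 output: stories, requirements, acceptance criteria
  code-analysis.md    # step 2 output (optional): patterns, interfaces, conventions
  tasks.md            # step 3 output: architecture, contracts, atomic task list
```

Each step starts from a clean context plus only the artifacts it needs, which is what makes the steps swappable.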


How it works in practice

The role of sddw is not only to code a solution for you but to help you define the spec — what the solution should look like in the first place. sddw navigates you through a predefined flow, one step at a time:

  1. Asks you a question that helps build the spec
  2. Does research and analysis
  3. Proposes a solution

You are in control — approving, declining, or modifying the suggestions. You can ask for help answering a question, ask to analyse state-of-the-art solutions, or do research. At the end of this interactive process, a spec is born. Clear context, start a new session, and move to the next step — which follows the same pattern, unless it's the final implementation step.


The spec is the artifact for peer review, not AI-generated code

The spec contains valuable compressed information — a result of your inputs and AI research, approved by you. It has structure: acceptance criteria, TDD approach, architecture decisions with rationale, rejected alternatives. This is what you review and iterate on. The code that follows is a verified implementation of an already-approved spec.
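
To make that concrete, a spec reviewed this way might be a markdown document along these lines (the section names are assembled from the structure described above, not copied from the plugin's actual template):

```
# Spec: <feature>

## User stories
## Functional requirements
## Acceptance criteria
## Constraints
## Architecture decisions    (each with its rationale)
## Rejected alternatives     (and why they were rejected)
## TDD approach              (which tests to write before which code)
```

A reviewer commenting on "Rejected alternatives" changes what gets built; a reviewer commenting on a generated diff usually doesn't.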

There is not much point in reviewing AI-generated code on its own. If the review doesn't change the agentic coder's context — its instructions, skills, or system prompt — the same problems will appear in the next pull request. The coder doesn't learn from your comments.

Reviewing the spec is more valuable. The spec is the most important part of the model's context — it clarifies expectations and provides relevant information. It's also more interesting to review: it was written with human supervision and captures high-level design decisions, not implementation details.


How sddw addresses the root causes

Let's get back to the issues we defined:

  • Request is too short → the generated spec is bigger, better structured, and more thorough — it contains acceptance criteria, TDD approach, functional requirements, constraints
  • Request is too hard → two levels of decomposition make it simpler. We progress step by step through the workflow: defining requirements, analysing code, splitting the solution into tasks, implementing tasks one by one
  • Context is too big and messy → we clear context after every step. We keep only what's relevant for a single step or a single coding task


How sddw is built

I used commands, not skills, because commands are namespaced while skills are not. This allows using /sddw:requirements instead of just /requirements. Commands are thin entry points — they don't contain instructions inline but reference them via @ file includes. The structure of every command is identical:

  • Reference to instruction (process rules)
  • Reference to dialog flow (questionnaire)
  • Reference to output spec template
  • References to available input specs (delivered by previous steps)
  • Service metadata: name, description, previous step, next step

This keeps the structure modular and clear. It's easy to add or remove a component.
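
For illustration, a command file in this style might look roughly like the sketch below. Claude Code slash commands are markdown files with YAML frontmatter; the specific paths and field names here are my guesses, not the plugin's actual source:

```
---
description: Design the solution and decompose it into atomic tasks
---

@instructions/tasks.md      <- process rules for this step
@dialogs/tasks.md           <- the questionnaire to walk through
@templates/tasks-spec.md    <- template for the output artifact
@specs/requirements.md      <- input spec from the previous step
@specs/code-analysis.md     <- optional input, if that step was run

Step: tasks (previous: code-analysis, next: implement)
```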


What's next

I think Claude Code would benefit from a workflows abstraction. sddw is a first step towards understanding what such workflows might look like. I'm exploring whether these abstractions — steps, specs, questionnaires, instructions — are sufficient to build agentic workflows in general:

  • Define steps in a workflow
  • Use agent assistance to build the required subcomponents for every step
  • Automatically wire references to previous artifacts and adjacent steps
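
Purely as speculation (this is not an existing Claude Code feature), such a workflow abstraction might reduce to a manifest from which the artifact wiring in the last bullet could be generated:

```
workflow: sddw
steps:
  - name: requirements
    produces: specs/requirements.md
  - name: code-analysis
    optional: true
    consumes: [specs/requirements.md]
    produces: specs/code-analysis.md
  - name: tasks
    consumes: [specs/requirements.md, specs/code-analysis.md]
    produces: specs/tasks.md
  - name: implement
    consumes: [specs/tasks.md]
```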


Try it

Link to sddw: https://github.com/sermakarevich/sddw

I'm very curious about your experience with agentic coding — it's basically greenfield and everyone is figuring things out. Have you used or heard of SDD? If you try sddw and find it useful, please consider commenting, sharing, or starring the repo.

If you are an organisation planning to use sddw, please consider sponsoring the project on GitHub.

Comments

I've landed somewhere in between. I start with a lightweight spec, build the smallest working version, then iterate the spec alongside the code. Full upfront specs assume you know the problem well enough, which is rarely true at the start. The spec grows and improves as feedback and features come in. Claude Code handles the iteration speed; you handle the judgment calls. So the input for any new feature is:

  • the current spec
  • the feature description and acceptance criteria

This goes through planning mode. Output: the implementation and the updated spec, where relevant.

The architectural shift you are describing goes deeper than most people realize. The hard problem is not building the elicitation interface. It is teaching the agent when to ask. Most current human-in-the-loop implementations are either too conservative (interrupt on everything, defeating the purpose of automation) or too aggressive (only surface critical failures after they have propagated). The elicitation pattern essentially requires the agent to maintain a calibrated uncertainty estimate over its own decision boundary. It needs to distinguish between 'I am 70% confident and should proceed' versus 'I am 70% confident but the cost of being wrong here is catastrophic, so I should ask.' That is a fundamentally different capability than just generating good outputs. It requires the agent to reason about its own epistemic state relative to the stakes of the decision. The interesting follow-up question is whether this calibration can be learned from interaction history, or whether it needs to be architecturally specified per workflow.

waiting for the plugin demo - curious if the spec becomes the new "unit test" for code quality
