From AI Coding Chaos to Clean Code: A Spec-First Case Study

A practical case study of spec-first development in AI-assisted software projects

The Problem: AI-Coding Chaos

This is a common scenario in AI-assisted development. You begin implementing a feature with AI tools, generate code, then realize the AI made assumptions about your data models, business logic, or API contracts. Suddenly you're in an endless cycle of regenerating code, fixing inconsistencies, and refactoring what should have been correct from the start.

I experienced this firsthand when I encountered a codebase with a large volume of AI-generated code. The inconsistencies and hidden assumptions made it clear that the approach had to change fundamentally.

The root cause: starting with AI-generated implementation instead of clear specifications the AI can follow. When that happens, the model's training data and built-in assumptions override your specific business context.

I recently developed a small Anki LLM Assistant project to study spec-first development principles. Even for a modest codebase, this project shows why spec-first development creates significant value.

What is Spec-First Development?

Spec-first development involves:

  1. Writing specifications first (schemas, invariants, acceptance criteria)
  2. Providing these specifications to AI tools (ensuring consistent, accurate code generation)
  3. Implementing against those specifications (no hidden logic, no AI assumptions)
  4. Generating boilerplate from specifications (DTOs, validators, tests)
  5. Validating everything against the original specifications

This approach is analogous to construction: blueprints are required before foundation work begins.

A Simple Example: Anki LLM Assistant

This project is a simple AI assistant that enables users to browse their Anki flashcards through natural language. It's a straightforward application of the LLM ReAct pattern with LangChain. You can explore the complete codebase at https://github.com/lunochkin/anki-llm-assistant.

Quick ReAct Overview: The LLM acts as a reasoning engine that can use tools to answer user queries. When a user asks "Show me my French vocabulary cards," the LLM decides which tools to call, executes them, and then formulates a natural language response. This pattern is particularly effective for AI agents that need to interact with external systems or databases, as it provides a clear separation between reasoning and data retrieval.
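LangChain wires this loop up for you, but the core idea fits in a few lines of plain Python. The sketch below stubs the LLM's reasoning step with a keyword check, and the tool name and its fake return value are illustrative, not the project's actual API:

```python
# Minimal ReAct-style loop: "decide" which tool to use, act by calling it,
# then turn the observation into a natural-language answer.
# The registry contents and decide() heuristic are stand-ins for a real LLM.

TOOLS = {
    "anki_list_decks": lambda: [{"name": "French", "note_count": 42}],
}

def decide(query: str) -> str:
    # A real agent asks the LLM which tool to call; here we stub that step.
    return "anki_list_decks" if "cards" in query or "decks" in query else ""

def answer(query: str) -> str:
    tool = decide(query)                           # Reason: pick a tool
    observation = TOOLS[tool]() if tool else None  # Act: execute the tool
    if observation is None:
        return "I couldn't find a relevant tool."
    # Respond: summarize the observation for the user
    return f"You have {len(observation)} deck(s): " + ", ".join(
        d["name"] for d in observation
    )

print(answer("Show me my French vocabulary cards"))
```

The separation between `decide` (reasoning) and the tool call (data retrieval) is the essence of the pattern; everything else is plumbing.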

The development workflow combines two phases: rapid AI prototyping to produce a proof of concept, followed by spec-first development to solidify it and make it production-ready.

1. Data Schemas (Tier-1)

# specs/schemas/deck.schema.yaml
type: object
properties:
  name: { type: string, description: "Deck name" }
  note_count: { type: integer, minimum: 0 }
required: ["name", "note_count"]        
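Because this is plain JSON Schema written in YAML, it can be enforced directly at runtime. A sketch using the PyYAML and jsonschema packages (both assumed to be installed), with the schema inlined so the example is self-contained:

```python
import yaml
import jsonschema

# The deck schema from specs/schemas/deck.schema.yaml, inlined for the example.
schema = yaml.safe_load("""
type: object
properties:
  name: { type: string, description: "Deck name" }
  note_count: { type: integer, minimum: 0 }
required: ["name", "note_count"]
""")

# A well-formed deck passes silently.
jsonschema.validate({"name": "French", "note_count": 42}, schema)

# A deck that violates the spec is rejected with a clear error.
try:
    jsonschema.validate({"name": "French", "note_count": -1}, schema)
except jsonschema.ValidationError as e:
    print("rejected:", e.message)
```

The same YAML file now drives documentation, validation, and (as shown later) generated code.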

2. Business Rules as Invariants

# specs/invariants.yaml
inv:
  INV-READ-1: { desc: "Never show more than N decks.", N: 10 }
  INV-READ-2: { desc: "Never show more than M cards per deck.", M: 10 }        

3. Tool Specifications

# specs/tools.yaml
tools:
  anki_list_decks:
    purpose: "Retrieve available Anki decks with metadata"
    input: { schema: "deck_list_input" }        
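A cheap way to keep implementation and spec aligned is a conformance check: every tool registered in code must appear in tools.yaml, and vice versa. A sketch, where the in-code registry dict is hypothetical:

```python
import yaml

# Hypothetical in-code registry mapping tool names to callables.
REGISTRY = {"anki_list_decks": lambda inp: []}

# The tool spec from specs/tools.yaml, inlined for the example.
spec = yaml.safe_load("""
tools:
  anki_list_decks:
    purpose: "Retrieve available Anki decks with metadata"
    input: { schema: "deck_list_input" }
""")

# Fail fast on drift in either direction.
spec_names = set(spec["tools"])
impl_names = set(REGISTRY)
assert spec_names == impl_names, f"spec/impl drift: {spec_names ^ impl_names}"
```

Run as part of the test suite, this turns "the spec is out of date" from a silent bug into a failing build.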

The Three-Tier Architecture

This project implements a strict separation of concerns:

  1. Tier-1 (Specifications): /specs/ - Schemas, invariants, configuration
  2. Tier-2 (Core Logic): /src/core/ - Business logic that implements specifications
  3. Tier-3 (Generated Code): /src/gen/ - Boilerplate generated from the Tier-1 specifications
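To make Tier-3 concrete, here is a toy generator that turns the deck schema into dataclass source code. This is a sketch of the idea, not the project's actual generator:

```python
import yaml

# Map JSON Schema primitive types to Python annotations.
TYPE_MAP = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}

def generate_dto(name: str, schema: dict) -> str:
    # Emit a dataclass with one typed field per schema property.
    lines = ["from dataclasses import dataclass", "", "@dataclass", f"class {name}:"]
    for prop, prop_spec in schema["properties"].items():
        lines.append(f"    {prop}: {TYPE_MAP[prop_spec['type']]}")
    return "\n".join(lines) + "\n"

# The deck schema from specs/schemas/deck.schema.yaml, inlined for the example.
deck_schema = yaml.safe_load("""
type: object
properties:
  name: { type: string }
  note_count: { type: integer, minimum: 0 }
required: ["name", "note_count"]
""")

print(generate_dto("Deck", deck_schema))
```

Because the DTO is derived mechanically from Tier-1, changing the schema regenerates the type, and nothing in /src/gen/ can silently drift from the spec.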

How It Works in Practice

The code directly references spec rules, ensuring no hidden logic:

# src/core/validators/invariant_checker.py
import yaml

class InvariantChecker:
    def __init__(self, spec_path: str = "specs/invariants.yaml"):
        self.invariants = yaml.safe_load(open(spec_path))["inv"]

    def ensure_deck_limit(self, decks: list):
        # Business rule: INV-READ-1 (the limit lives in the spec, not the code)
        limit = self.invariants["INV-READ-1"]["N"]
        if len(decks) > limit:
            raise InvariantViolation(f"INV-READ-1: Never show more than {limit} decks")

Key observation: Every piece of business logic originates from a specification.

The Benefits Discovered

1. Crystal Clear Requirements

When the card limit needed to be modified from 10 to 15, only invariants.yaml required updating. The code automatically enforced the new rule.

2. Self-Documenting Code

Every business rule is explicit. New developers can read the specifications and understand exactly what the system does.

3. Bulletproof Testing

The golden tests validate against the specifications, not implementation details:

# tests/test_golden.py
import yaml

# Read the limit from the spec, so the test tracks spec changes automatically.
limit = yaml.safe_load(open("specs/invariants.yaml"))["inv"]["INV-READ-1"]["N"]
deck_result = deps.decks_tool.list_decks(DeckListInput(limit=limit))
assert len(deck_result.decks) <= limit  # Enforces INV-READ-1

4. Easier Debugging

Debugging AI-generated code becomes much simpler: instead of untangling complex framework rules or hidden assumptions, you check whether the generated code matches the specification.

Why This Approach Works

Before (AI-Code-First)

  • Business rules distributed throughout the codebase
  • AI makes assumptions about data models and business logic
  • Changes require searching through multiple files and regenerating AI code
  • Tests fail when implementation changes

After (Spec-First)

  • Single source of truth for all business logic
  • AI follows explicit specifications, eliminating assumptions
  • Changes are explicit and traceable
  • Tests validate against specifications, not implementation

Important note: You don't need to achieve 100% spec-driven logic from the beginning. Start with the core business rules and critical interfaces, then gradually increase the percentage of spec-driven code as you iterate. The goal is continuous improvement, not perfection from day one.

What About Kiro?

You might be thinking: "This sounds like Kiro!" And you're correct: Kiro is an excellent tool that actively supports spec-first development. In fact, its existence demonstrates that the industry is recognizing the value of this approach for AI-assisted development.

However, the purpose of this project was to practice spec-first development in its fundamental form and understand the core principles directly, before evaluating tooling solutions.

Kiro is excellent, but it's also opinionated and may not suit every use case or team workflow. The principles are more important than the tools. Begin with what you have, then evolve. The key is ensuring your AI coding tools have clear specifications to follow.

The Results

Even for this small project, spec-first development created a noticeable difference:

  • Development was more focused because the requirements were clearly defined
  • Fewer bugs because business rules were explicit and validated
  • Easier to understand because specifications provided clear documentation
  • Safer to modify because changes could be validated against specifications

The Bottom Line

Spec-first development has been a valuable practice in software development for years, but AI coding tools have simplified and accelerated the approach significantly. While AI-assisted coding has brought significant productivity gains, it has also introduced new challenges. It's now more important than ever to ensure business logic is consistent and complete.

Instead of asking "How should I implement this?"

Begin by asking "What should this system do?"

With AI coding tools, clear specifications become your "prompts" that ensure consistent, accurate implementation. The code generation will follow naturally, and you'll have a system that's easier to understand, maintain, and evolve.

For larger projects, this approach becomes especially critical, because the cost of inconsistent business logic and misaligned implementations grows rapidly with team size and system complexity.

Try It Yourself

Examine the Anki LLM Assistant project to observe spec-first development in action. It's a small, focused AI-assisted codebase that demonstrates how these principles function in practice.

What's your experience with spec-first development? Have you implemented it in your projects? What challenges have you encountered? I'd appreciate hearing your thoughts and experiences in the comments.

Additional Resources

Spec-First Development Resources

  1. Spec-First Development: The Missing Manual for Building with AI
  2. The Rise of Spec-Driven Development
  3. Domain-Driven Design - Business logic as specifications
  4. ReAct: Synergizing Reasoning and Acting in Language Models

Why This Approach Matters

  • Maintainability: Code that follows explicit specifications is easier to maintain
  • Quality: Explicit requirements lead to fewer bugs and better testing
  • Collaboration: Specs provide a common language for developers and stakeholders
  • TDD Synergy: Works perfectly with Test-Driven Development - your specs become your test cases, and your tests validate your specifications

Industry Examples

  • API Development: OpenAPI/Swagger specifications drive consistent API design and client generation
  • Database Design: Schema-first approaches with tools like Prisma and Drizzle ensure type safety and data integrity
  • Modern Testing: Contract testing and API mocking based on specifications ensure reliable integration
  • AI/ML Systems: Model cards and data cards provide specifications for AI system behavior and data requirements

Ready to improve how you build software? Start with one small feature, write the specification first, and observe the difference it makes—even on modest projects.


#SoftwareDevelopment #SpecFirst #CleanCode #Architecture #BestPractices #AI #OpenSource

