Reinventing SDLC with an Agentic Harness - TCode Framework Evolution

How a portable SDLC framework — born from the real practice of building with AI — fills the gap that config files, memory tools, and orchestration frameworks each leave open.

https://github.com/archgenai/tcode-framework

---

The Conversation Has Arrived

Something crystallized in the first quarter of 2026: the AI developer community stopped arguing about which model is best and started talking about what sits around the model.

Birgitta Böckeler's April 2026 piece on [martinfowler.com](https://martinfowler.com/articles/harness-engineering.html) gave it a name: harness engineering. Philipp Schmid at Google DeepMind put the OS analogy in circulation: Model = CPU, Context Window = RAM, Agent Harness = Operating System. Anthropic's own engineering blog published [a breakdown of long-running agent harness patterns](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents). Aakash Gupta called it flat out: "2025 was agents. 2026 is agent harnesses."

The canonical formula is now widely quoted:

    Agent = Model + Harness

The diagnosis is broadly agreed. One effort has been building — organically, from practice — almost in parallel with this conversation. It is called TCode, and it sits in a specific gap that most of the current harness discussion has not yet addressed.

---

Where TCode Agrees With the Other Agent-Harness Efforts

The diagnosis is shared. The community has correctly identified three failure modes that models alone cannot fix:

1. Session amnesia. Every new conversation starts from nothing. The agent forgets which database you chose, which patterns you banned, and why you explicitly rejected that approach three sessions ago. Böckeler calls config files "manual memory prosthetics." She is right.

2. SDLC position loss. Brett Luelling (March 2026) named this precisely: during long projects, agents lose track of where they are in the SDLC. They start gold-plating, drift from scope, propose architecture that conflicts with decisions already baked into the codebase.

3. Benchmark gap. Harness quality — not model quality — is increasingly what separates reliable AI-assisted development from expensive chaos. Schmid makes this argument well: the benchmarks collapse after 50+ tool calls. Better models don't fix harness problems.

All three are real. TCode addresses all three.

---

Where the Existing Approaches Stop Short

There are three categories of solution in the current landscape, each excellent at what it does:

Config-file harnesses (CLAUDE.md, AGENTS.md, .cursorrules) tell the agent how to behave in this session. They are static files maintained by developers. They do not self-update, do not have a session lifecycle protocol, and do not prescribe a ritual for the agent to read context at the start and write it back at the end. The agent follows instructions, but does not carry knowledge forward.

Memory layers (tools like Memorix, Engram, ContextPool, AWS AgentCore) give agents cross-session recall. These are genuinely useful, and they are closing the amnesia gap. But they are additive memory — they do not define workflow structure. They give the agent recall without obligation. There is no formal session end protocol, no three-tier hierarchy of stable facts vs. active goals vs. episodic log, no SDLC phase discipline.

Multi-agent orchestration frameworks (LangGraph, CrewAI, AutoGen) solve agent-to-agent coordination within a task. They are not building the cross-session developer memory that survives across months of development on a project. Devin's sessions are isolated VMs. None of these carry a concept of "task plan" or "workspace facts" as first-class persistent artifacts.

Each approach occupies part of the space. None of them defines the harness as a session lifecycle ritual.

---

TCode's Specific Thesis

A harness is not complete until it tells the agent what to do at the boundaries — before the work starts and after it ends.

TCode is built around three things that other approaches leave implicit:

1. A Three-Tier Memory Architecture

Not one memory layer. Three distinct layers with different properties:

[Figure: Three-Tier Memory Architecture supporting SDLC in Agentic Software Development]

Stable facts and active goals are different things. Conflating them into one memory store means the agent cannot distinguish "I am building a Python API" (stable) from "Phase 2 is in progress" (active). TCode separates them.
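The separation can be sketched as three distinct types rather than one store. This is an illustrative Python sketch, not the framework's actual schema: the field names are assumptions, but the tier-to-file mapping (MEMORY.md, task_plan.md, session logs) follows the article.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class StableFacts:          # MEMORY.md: rarely changes, never expires
    language: str
    database: str

@dataclass
class ActiveGoals:          # task_plan.md: mutates every session
    phase: int
    open_tasks: list[str] = field(default_factory=list)

@dataclass
class EpisodicLog:          # session logs: append-only history
    entries: list[str] = field(default_factory=list)

    def append(self, entry: str) -> None:
        self.entries.append(entry)

facts = StableFacts(language="Python", database="SQLite")
goals = ActiveGoals(phase=2, open_tasks=["implement auth endpoints"])
# facts.database = "PostgreSQL"  # would raise: stable facts are immutable
```

Making the stable tier frozen is the point of the sketch: "I am building a Python API" cannot be silently overwritten by a session that is only supposed to be updating Phase 2's task list.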

2. A Mandatory Session Lifecycle Protocol

The session bootstrap is not a suggestion. At every session start, the agent reads MEMORY.md, task_plan.md, and the most recent session log before touching any code. At every session end, it writes back.

This is the part that closes the loop. Config files set standards; the session protocol makes those standards survive. Without the end-of-session write, every session is still a tourist on arrival.
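The bootstrap/write-back ritual can be sketched in a few lines. The file names (MEMORY.md, task_plan.md, a sessions/ directory of logs) come from the article; the helper functions themselves are illustrative assumptions, not TCode's implementation.

```python
from datetime import date
from pathlib import Path

def session_start(root: Path) -> str:
    """Bootstrap: read all memory tiers before touching any code."""
    parts = []
    for name in ("MEMORY.md", "task_plan.md"):
        f = root / name
        if f.exists():
            parts.append(f.read_text())
    sessions = root / "sessions"
    logs = sorted(sessions.glob("*.md")) if sessions.is_dir() else []
    if logs:
        parts.append(logs[-1].read_text())  # most recent session log
    return "\n\n".join(parts)

def session_end(root: Path, summary: str) -> Path:
    """Write-back: the session is not complete until the log exists."""
    sessions = root / "sessions"
    sessions.mkdir(exist_ok=True)
    out = sessions / f"{date.today().isoformat()}.md"
    out.write_text(summary)
    return out
```

The asymmetry matters: `session_start` is read-only, `session_end` is the only writer. A session that skips the final write leaves nothing for the next session to bootstrap from — which is exactly the amnesia the protocol exists to prevent.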

3. Decision Records as First-Class SDLC Output

decisions.md is append-only. Every non-obvious architectural choice — what was decided, why, which alternatives were rejected — is written as a permanent record. Future sessions consult it before re-solving problems that are already solved.
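A minimal append-only writer for decisions.md might look like the sketch below. The ADR entry shape (title, decision, why, rejected alternatives) mirrors the article's description, but the exact format and sequential-ID scheme are assumptions.

```python
from pathlib import Path

ADR_TEMPLATE = """\
## ADR-{num:03d}: {title}

- Decision: {decision}
- Why: {why}
- Rejected alternatives: {rejected}

"""

def record_decision(path: Path, title: str, decision: str,
                    why: str, rejected: str) -> str:
    existing = path.read_text() if path.exists() else ""
    num = existing.count("## ADR-") + 1        # next sequential ID
    entry = ADR_TEMPLATE.format(num=num, title=title, decision=decision,
                                why=why, rejected=rejected)
    with path.open("a") as f:                  # append-only: never rewrite
        f.write(entry)
    return f"ADR-{num:03d}"
```

Opening the file in append mode is the design choice worth noting: earlier records are never edited or deleted, so a future session can trust that ADR-001 still says what it said when it was written.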

[Figure: Recording Architectural Decision Records]

Without this, here is what happens:

- Session 3: Agent picks SQLite. Correct — fast local dev, no infra.

- Session 7: New agent instance. No memory. Proposes PostgreSQL. You explain again.

- Session 12: Suggests MongoDB. You explain it a third time.

- Session 18: Proposes an ORM pattern you rejected in Session 4 because of lazy-loading bugs.

Each re-derivation costs tokens and time, and risks producing a slightly different answer that breaks something baked in earlier. The idea is to start from the recorded decision and explain, or branch, only when a new requirement genuinely warrants a different solution.

With decisions.md, Session 7 opens: "I see from ADR-001 that SQLite was chosen for zero-infra local development. Applying that now." No re-explanation. No drift.

Over 20+ sessions, this realistically saves hours of re-explanation and hundreds of thousands of tokens. Over a workspace with multiple projects sharing workspace-level ADRs, the savings compound further.

---

What TCode Is Not Trying to Be

TCode does not compete with LangGraph, CrewAI, or AutoGen. Those solve multi-agent coordination. TCode solves developer-project-agent coherence across sessions.

TCode does not compete with Memorix or Engram. Those give agents memory. TCode gives the memory structure and the obligation to update it.

TCode is not a code generator. It does not generate boilerplate, scaffold APIs, or write components. It is a discipline layer — the scaffold that keeps an AI-assisted development workflow from becoming an expensive, fast-moving mess.

Böckeler's framework distinguishes guides (feedforward) from sensors (feedback). TCode's memory system is both: the session bootstrap is a guide (context injected at start), and the session end write is a sensor (the agent reflects on what changed). The decision log is a guide that grows from feedback over time.

---

Agent-Agnostic by Design

The AI coding agent landscape changes fast, like all things AI. Claude Code ships major features regularly. Cursor reinvents itself. OpenAI Codex competes with new capabilities. Gemini provides developers with new options.

A developer who builds their SDLC discipline into a specific agent's configuration format is betting on that agent's longevity. TCode bets on the framework, not the agent.

FRAMEWORK.md    ← Canonical SDLC spec. Agent-agnostic. Never changes 
                   when you switch agents.
CLAUDE.md       ← Claude Code adapter. Reads FRAMEWORK.md first.

.cursorrules    ← Cursor adapter. Same framework, different syntax.

AGENTS.md       ← Codex CLI adapter. Same framework again.        

[Figure: TCode with Adapters]

The adapter is intentionally thin. It translates the framework's universal intentions into whichever syntax the agent expects, but holds no business logic, no project state, no decisions. Those live in the framework and memory layers — where any agent can read them.

This is where TCode most clearly diverges from CLAUDE.md-centric workflows: the framework is not inside the adapter. The adapter reads the framework. Changing agents is a one-file swap. Everything else stays intact.
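The "one-file swap" claim can be made concrete with a small sketch. The adapter file names (CLAUDE.md, .cursorrules, AGENTS.md) come from the article; the placeholder adapter body and the swap helper are illustrative assumptions.

```python
from pathlib import Path

ADAPTERS = {
    "claude": "CLAUDE.md",
    "cursor": ".cursorrules",
    "codex": "AGENTS.md",
}

def swap_adapter(root: Path, agent: str) -> Path:
    """Replace whichever adapter is present with the one for `agent`.

    FRAMEWORK.md and the memory files are never touched.
    """
    for name in ADAPTERS.values():
        (root / name).unlink(missing_ok=True)
    path = root / ADAPTERS[agent]
    path.write_text("Read FRAMEWORK.md first. All rules live there.\n")
    return path
```

Because the adapter carries no state, deleting and rewriting it is safe; everything the project actually knows survives the swap untouched.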

---

TCode Is Organic — and That Is the Point

TCode was not designed in a whiteboard session. It grew out of the real, daily practice of building software with AI agents across a multi-language portfolio: Python APIs, TypeScript web platforms, GPU kernels, cross-platform desktop apps, intelligence platforms, digital products.

Every rule in FRAMEWORK.md exists because something went wrong without it. Every memory file exists because an agent lost context at the worst moment. Every ADR pattern exists because a decision was re-derived — expensively — in a later session.

This is what makes it different from prescriptive frameworks.

It is proof that agentic software development can absorb a lot of individualism.

The framework is not a straitjacket; it is the minimum shared contract that keeps agents coherent across time. Your language choices, your domain conventions, your preferred patterns all layer on top without friction.

If your team writes Go instead of Python, TCode works. If your organization uses GitLab instead of GitHub, TCode works. If you swap Claude for Cursor mid-project, TCode works.

---

How a Project Starts: Kickoff Files, Not Forms

The most natural entry point is a kickoff file in Project_Kickoffs/. This is intentionally loose:

      - A PDF brief from a client
      - A markdown doc written on your phone during a commute
      - A voice-memo transcript
      - A one-page idea dump with bullet points and hand-wavy architecture

The format does not matter. The agent reads it, generates a phased REQUIREMENTS.md, and scaffolds the project structure. From that point it has everything it needs to start Phase 1.

The agent does not get to skip phases. Gold-plating is explicitly prohibited. Without this constraint, agents drift — adding authentication nobody asked for, building UI for API-only projects, abstracting code into flexibility that confuses the next session's context.
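The kickoff-to-scaffold step described above can be sketched as follows. The directory and file names (Project_Kickoffs/, REQUIREMENTS.md, the memory files) follow the article; the explicit phase list here is a hypothetical stand-in for whatever the agent actually extracts from the kickoff document.

```python
from pathlib import Path

def scaffold_project(root: Path, kickoff_name: str,
                     kickoff_text: str, phases: list[str]) -> None:
    """Store the kickoff, emit a phased REQUIREMENTS.md, create memory files."""
    kickoffs = root / "Project_Kickoffs"
    kickoffs.mkdir(parents=True, exist_ok=True)
    (kickoffs / kickoff_name).write_text(kickoff_text)

    # Phased requirements: the agent may not skip ahead of Phase 1.
    lines = ["# REQUIREMENTS", ""]
    for i, phase in enumerate(phases, start=1):
        lines += [f"## Phase {i}: {phase}", ""]
    (root / "REQUIREMENTS.md").write_text("\n".join(lines))

    # Empty memory files so the session lifecycle has somewhere to write.
    for name in ("MEMORY.md", "task_plan.md", "decisions.md"):
        (root / name).touch()
```

From here the session protocol takes over: the agent's first bootstrap reads the scaffolded files and begins Phase 1, with nothing to gold-plate because only Phase 1 exists as an active goal.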

[Figure: TCode Lifecycle: Stepwise progression from Specs to Production]

Where This Fits in the 2026 Conversation

Böckeler's guide/sensor taxonomy is the right mental model for understanding what a harness component does. Anthropic Engineering's Initializer + Coding Agent split is the right model for how to structure long-running work. The memory layer tools (Memorix, Engram, ContextPool) are filling the right gap in the toolchain.

TCode is not competing with any of these. It sits one level above them: a workflow discipline that tells the agent what to do at the session boundary, so that all the other pieces — config files, memory tools, orchestration — operate on a foundation of structured, consistently maintained context.

The harness conversation in 2026 has correctly identified the problem. TCode's thesis is that the solution requires a ritual, not just tooling.

---

TCode is a living framework. It will evolve as the agent landscape evolves. The core idea — separating the framework from the adapter, making the session boundary a first-class obligation, and treating decisions as compounding assets — will not.

Interested in the framework or the approach? Drop a comment below or connect directly.

---

#AgenticAI #AIEngineering #SoftwareDevelopment #ClaudeCode #DeveloperProductivity #AIML #HarnessEngineering
