Bootstrapping a Multi-Agent Coding Project: From Empty Directory to Working System

The Cold Start Problem

Setting up a multi-agent coding project presents a bootstrapping challenge that single-agent workflows do not face. In a single-agent session, you open a terminal, describe the task, and begin. In a multi-agent system, multiple agents must be initialized, given their roles, connected through a communication protocol, and sequenced so that agents with upstream dependencies do not start working before their inputs exist. The system must also be configured for safety — agents with write access to a shared codebase can interfere with each other or damage source material if permissions are not established before work begins.

This article describes a concrete procedure for bootstrapping a multi-agent coding project, from an empty directory to a functioning system with agents actively producing, testing, and committing code. The approach is grounded in a file-based communication protocol, a strict startup sequence, and layered safety mechanisms.

Prerequisites: What Exists Before the Agents Start

Before any agent session is opened, the project directory must contain one thing: the ground truth. This is a dedicated folder — ground_truth/ — containing all source material that defines what the project should produce. In a rewrite project, this includes the original implementation and its documentation. In a new project, it might contain papers, pseudocode, equations, draft specifications, or design documents. The ground truth folder is the single repository for all external context that the agents will draw upon. It is treated as read-only for the entire duration of the project.

No agent prompts need to be prepared in advance. The Coordinator will generate them during the setup phase. No safety configuration needs to exist yet — that too is established during bootstrapping.

The Communication Protocol

Before describing the startup sequence, it is necessary to understand how agents communicate, since the entire bootstrapping process operates through this protocol.

All agents communicate through markdown files in an agents/ directory. Each agent maintains two files:

status.md contains the agent's current state: what it is working on, what it has produced, what is blocking it, and any questions for the Coordinator. This file is written by the agent and read by the Coordinator and, when relevant, by other agents.

handoff.md is the agent's persistent memory. It contains everything the agent would need to resume its work from zero context: completed work with file paths, in-progress tasks with exact stopping points, decisions made with rationale, and queued work. This file exists to survive compaction — the lossy context summarization that occurs when an agent's context window fills. An agent that loses its context reads its handoff file and continues as if nothing happened.

The Coordinator's status.md has a special role: it contains directives for every other agent. Each agent reads the Coordinator's status file on every cycle to learn its current assignment. This creates a hub-and-spoke communication pattern — agents do not communicate directly with each other. All coordination flows through the Coordinator.

This protocol requires no framework, no message queue, and no orchestration layer. It is files on disk, read and written by agents on their own cycle. The entire project state is human-readable at any point.
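The protocol can be sketched in a few lines of Python. The directory layout matches the article (agents/<name>/status.md and handoff.md); the helper names themselves are illustrative, not part of the protocol.

```python
from pathlib import Path

AGENTS_DIR = Path("agents")

def write_status(agent: str, body: str) -> None:
    """Write an agent's status.md; the Coordinator reads this each cycle."""
    path = AGENTS_DIR / agent / "status.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body)

def read_directives() -> str:
    """Every agent reads the Coordinator's status.md for its current assignment."""
    return (AGENTS_DIR / "coordinator" / "status.md").read_text()

# Example: the Coordinator posts a directive; any agent can pick it up.
write_status("coordinator", "## Directives\n- developer: implement module A\n")
assert "developer: implement module A" in read_directives()
```

Because the state is just files on disk, a human (or a post-compaction agent) can inspect the full project state with nothing more than a text editor.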


The Startup Sequence

The order in which agents are started matters. Dependencies between agents mean that starting them simultaneously leads to agents waiting for inputs that do not yet exist, or worse, agents producing output based on incomplete upstream work.

The bootstrapping process unfolds in six stages: project definition with the Coordinator, permission configuration, repository initialization with the Recorder, context understanding with the Scientist, activation of the Developer and Tester, and incremental activation of the remaining agents as the project demands them.

Stage 1: Project Definition with the Coordinator

The Coordinator is always the first agent started. Unlike all other agents, the Coordinator receives a comprehensive system prompt that defines its authority, the communication protocol, and the project standards it will enforce.

The Coordinator's initial task is not to read the ground truth — that is the Scientist's responsibility. The Coordinator operates at the project management level: it understands the goals, scope, phases, and success criteria of the project, but it does not need to understand the domain-specific content of the source material. A Coordinator managing a quantum chemistry codebase does not need to understand quantum chemistry. It needs to understand the project structure, the agent roles, and the workflow that will produce a correct, tested, documented codebase.

The first phase of bootstrapping is therefore a conversation between the human and the Coordinator. This conversation establishes:

Project scope and goals. What is being built, what does success look like, what are the constraints.

Phase plan. How the project will be decomposed into sequential, testable increments. For a scientific codebase, this might be: Phase 1 implements core equations and produces correct results; Phase 2 optimizes for performance; Phase 3 scales for HPC.

Agent roles. The Coordinator produces the .md files that will define each agent's identity, responsibilities, and file permissions. These files are the agent prompts — they are generated during this stage, not prepared in advance. The human reviews them, provides feedback, and iterates with the Coordinator until the role definitions are satisfactory.

Policies. Coding standards, testing requirements (minimum coverage thresholds, TDD protocol), commit policies, documentation conventions.

This stage involves back-and-forth iteration between the human and the Coordinator. It is the most interactive phase of the entire project. The investment here pays off throughout the remaining stages, because every subsequent decision the Coordinator makes is grounded in the agreements established during this conversation.

Stage 2: Permission Configuration

Before any other agent is started, the permission model must be established. This is a critical step that is easy to defer and dangerous to skip.

The Coordinator and the human define which operations each agent is permitted to perform. The key principle is that dangerous commands should be available only to the Coordinator. Other agents that need destructive operations — file deletion, file moves, arbitrary code execution — request them through the Coordinator, which evaluates the request and, if warranted, either executes the action itself or escalates to the human for approval.

The permission model is implemented through the .claude/settings.json file, which defines three tiers: deny rules that block operations unconditionally (such as any writes to the ground_truth/ folder), ask rules that require human approval before execution (destructive commands like rm, mv, chmod, and arbitrary code execution via python -c), and allow rules that grant autonomous access to safe operations (reading files, running tests, git commands, standard development tools).
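As a sketch, a .claude/settings.json implementing these three tiers might look like the following. The rule patterns follow the general shape of Claude Code permission rules, but the specific entries are illustrative assumptions, not a verbatim production configuration:

```json
{
  "permissions": {
    "deny": [
      "Write(ground_truth/**)",
      "Edit(ground_truth/**)"
    ],
    "ask": [
      "Bash(rm:*)",
      "Bash(mv:*)",
      "Bash(chmod:*)",
      "Bash(python -c:*)"
    ],
    "allow": [
      "Read(**)",
      "Bash(git:*)",
      "Bash(pytest:*)"
    ]
  }
}
```

Deny rules are checked first, so even a broad allow rule cannot reopen access to the ground truth folder.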

A CLAUDE.md file in the project root reinforces the ground truth protection at the prompt level, and the ground_truth/ directory is made read-only at the operating system level (chmod -R a-w) as a final safeguard.
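The operating-system safeguard takes one command; the verification step below is an optional sanity check (the directory is created here only to make the snippet self-contained):

```shell
# Create the folder for this self-contained demo; in a real project it
# already holds the source material.
mkdir -p ground_truth

# Remove write permission for all users, recursively.
chmod -R a-w ground_truth/

# Verify: an attempted write should now fail for a normal user.
if touch ground_truth/probe 2>/dev/null; then
    echo "WARNING: ground_truth is still writable"
else
    echo "ground_truth is protected"
fi
```

This layer catches mistakes that slip past both the prompt-level instruction and the settings file, because the kernel enforces it regardless of what any agent attempts.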

Stage 3: Start the Recorder

With the project defined and permissions configured, the first worker agent is started. This is the Recorder — the agent responsible for all git operations.

The human opens a new session, and the initial prompt is minimal: "You are the Recorder agent. Read agents/recorder/status.md for your role definition and current assignment." One or two sentences. The Recorder reads its role file (generated by the Coordinator in Stage 1), reads the Coordinator's directives, and initializes the repository: git init, .gitignore creation, initial directory scaffolding, and the first commit. It reports completion through its status.md.
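The Recorder's initialization amounts to a handful of commands. The .gitignore contents below are a typical Python default, and the demo/ wrapper directory and identity flags exist only to keep the snippet self-contained; neither is mandated by the workflow:

```shell
# Scaffold a repository in a fresh directory (demo/ is illustrative).
mkdir -p demo
git -C demo init -q

# A minimal Python-flavored .gitignore (assumed, not prescribed).
cat > demo/.gitignore <<'EOF'
__pycache__/
*.pyc
.venv/
EOF

# Initial directory scaffolding for agents and code.
mkdir -p demo/project/src demo/project/tests demo/agents

# First commit; the -c flags supply an identity for this demo only.
git -C demo add .gitignore
git -C demo -c user.name="Recorder" -c user.email="recorder@example.com" \
    commit -q -m "Initial commit: repository scaffolding"
```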

The repository must exist before any other agent produces output, because all work products will eventually be committed.

Stage 4: Start the Scientist

The Scientist is the domain expert. It is the only agent that reads and interprets the contents of the ground_truth/ folder — the papers, algorithms, old implementations, equations, and specifications.

Again, the initial prompt is minimal: "You are the Scientist agent. Read agents/scientist/status.md for your role definition and current assignment." The Scientist reads its role file and begins its primary task: understanding the context. It reads the ground truth material thoroughly, produces a specification document (module inventory, dependency graph, API surface, critical numerical values, known behaviors), and proposes an implementation plan.
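A specification document along these lines might be skeletoned as follows; the section names track the inventory listed above, while the entries are placeholders, not content from any real project:

```
# Specification: <project name>

## Module inventory
- integrals/: one-electron integrals (source: ground_truth/<legacy file>)

## Dependency graph
integrals -> scf -> properties

## API surface
- overlap(basis_a, basis_b) -> ndarray

## Critical numerical values
- reference energy for test system: <value from legacy output>

## Known behaviors / quirks
- <edge cases, tolerances, legacy workarounds worth preserving>
```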

The Scientist is one of only two agents permitted to communicate directly with the human. If it encounters ambiguity in the source material — an equation that seems inconsistent, a design decision whose rationale is unclear, a piece of legacy code whose intent is uncertain — it asks the human directly. This is the domain clarification channel. The Coordinator handles project questions; the Scientist handles scientific questions.

When the Scientist completes its analysis, the Coordinator reviews the specification and implementation plan, and the project moves to active development.

Stage 5: Start the Developer and Tester

With the ground truth analyzed and the implementation plan established, the Developer and Tester are started. Both receive the same minimal initial prompt pattern: who they are and where to find their role file.

The Developer's first task is to set up the development environment — project structure, pyproject.toml, dependencies, package scaffolding. Once the environment is ready, it begins implementing the algorithm as specified by the Scientist.

The Tester writes tests before the Developer writes source code, following a test-driven development protocol. It reads the Scientist's specification, writes tests against the defined API and expected behaviors, and reports results.
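A spec-driven test might look like the sketch below. The module name, function, and formula are hypothetical stand-ins; in the real workflow the import of the Developer's module would fail until it is implemented (the "red" phase of TDD), so a reference implementation is inlined here only to make the example self-contained:

```python
import math

# Stand-in for the Developer's future project/src/overlap.py.
# Hypothetical spec from the Scientist: overlap(a, b) = exp(-(a-b)^2 / 2).
def overlap(a: float, b: float) -> float:
    """Gaussian overlap of two unit-width centers (illustrative spec)."""
    return math.exp(-((a - b) ** 2) / 2.0)

# Tests the Tester derives from the specification, before any source exists:
def test_self_overlap_is_one():
    # Spec: overlap of a center with itself is exactly 1.
    assert overlap(0.7, 0.7) == 1.0

def test_overlap_is_symmetric():
    # Spec: overlap is symmetric in its arguments.
    assert overlap(0.0, 1.5) == overlap(1.5, 0.0)

def test_overlap_decays_with_distance():
    # Spec: overlap decreases monotonically with separation.
    assert overlap(0.0, 3.0) < overlap(0.0, 1.0)

# Run directly for this demo; normally pytest would collect these.
test_self_overlap_is_one()
test_overlap_is_symmetric()
test_overlap_decays_with_distance()
print("all spec tests pass")
```

Because the tests encode behaviors rather than implementation details, the Developer is free to implement the module any way that satisfies them.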

At this stage, three worker agents are active: Scientist (available for domain questions), Developer, and Tester. Together with the Coordinator, this is four concurrent agents — within the practical token budget ceiling.

Stage 6: Start Additional Agents as Needed

The remaining agents — Reviewer, Documenter, Plotter, Reporter — are not started at project initialization. They are started when the Coordinator determines they are needed. When the first module is complete and tested, the Coordinator asks the human to start a Reviewer session. When documentation tasks accumulate, the Documenter is brought online. When numerical output exists, the Plotter and Reporter activate.

The Coordinator requests these activations through its status file or by directly asking the human. Each new agent receives the same minimal prompt and finds its role definition waiting in its .md file.


The Human's Role

The human monitors exactly two sessions: the Coordinator and the Scientist. This is a deliberate design decision.

The Coordinator may have questions about project goals, policies, phase transitions, or architectural decisions that require human judgment. The Scientist may have questions about equations, physical models, algorithmic intent, or the meaning of legacy code. These are the two channels through which the project requires human input.

All other agents communicate exclusively through the Coordinator. If the Developer encounters an ambiguity in the specification, it writes the question to its status file. The Coordinator reads it and either resolves it from project context, forwards it to the Scientist for domain clarification, or — if neither can resolve it — escalates to the human. The human never needs to read the Developer's, Tester's, or Reviewer's status files directly.

This model minimizes human involvement without eliminating human authority. The human retains full control through the Coordinator, but is not burdened with monitoring seven or eight agent sessions. The goal of the multi-agent workflow is to reduce the human bottleneck in the development process, and requiring the human to continuously monitor markdown files would reintroduce exactly the bottleneck the system is designed to eliminate.


The Development Cycle

Once bootstrapping is complete and agents are active, the system operates in a repeating cycle. Each agent, on each cycle:

  1. Reads its own handoff.md to restore context.
  2. Reads the Coordinator's status.md to learn its current assignment.
  3. Performs its assigned work.
  4. Updates its own status.md with output, results, and any blockers.
  5. Updates its own handoff.md with everything it learned or decided.
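The five steps above can be sketched as a single function. The file names follow the protocol described earlier; perform_work stands in for whatever the specific agent actually does:

```python
from pathlib import Path

def run_cycle(agent_dir: Path, coordinator_dir: Path, perform_work) -> None:
    """One iteration of the per-agent cycle."""
    # 1. Restore context from persistent memory.
    handoff = (agent_dir / "handoff.md").read_text()
    # 2. Learn the current assignment from the Coordinator.
    directives = (coordinator_dir / "status.md").read_text()
    # 3. Do the assigned work (agent-specific; injected here).
    result = perform_work(handoff, directives)
    # 4. Report output, results, and blockers.
    (agent_dir / "status.md").write_text(result["status"])
    # 5. Externalize everything learned or decided.
    (agent_dir / "handoff.md").write_text(result["handoff"])

# Demo: a trivial agent that just acknowledges its directive.
import tempfile
base = Path(tempfile.mkdtemp())
agent = base / "developer"; agent.mkdir()
coord = base / "coordinator"; coord.mkdir()
(agent / "handoff.md").write_text("fresh start")
(coord / "status.md").write_text("- developer: build module A")
run_cycle(agent, coord,
          lambda h, d: {"status": "done: module A",
                        "handoff": "module A complete"})
print((agent / "status.md").read_text())
```

Step 5 is what makes the cycle compaction-safe: if the agent loses its context between iterations, step 1 of the next cycle restores it.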

The Coordinator's cycle is broader: it reads all agents' status files, assesses progress, detects blockers or repeated failures, makes decisions, updates its directives, and records its reasoning in its own handoff file.

The Commit Chain

A specific cycle deserves detailed attention: the path from completed code to a committed, version-controlled artifact.

The Developer completes a module and updates its status to report completion. The Coordinator reads this, verifies that the Tester has reported passing tests for that module, and then activates the Reviewer. The Reviewer inspects the code — quality, consistency, security, adherence to project standards — and writes its findings to its status file. If issues are found, the Coordinator directs the Developer to address them, and the cycle repeats. Once the Reviewer approves, the Coordinator directs the Recorder to commit.

The Recorder's commit gate is therefore: tests pass AND Reviewer approves. The Recorder never commits based on its own judgment. It commits when the Coordinator, having verified both conditions, gives the directive.

This chain — Developer → Tester → Reviewer → Recorder — mirrors a standard pull request workflow, with the Coordinator serving as the merge authority.
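The gate reduces to a conjunction the Coordinator evaluates before directing the Recorder. The status-string matching below is a deliberately naive illustration of that check, not a parsing scheme the article prescribes:

```python
def recorder_may_commit(tester_status: str, reviewer_status: str) -> bool:
    """Commit gate: tests pass AND the Reviewer approves."""
    tests_pass = "all tests passing" in tester_status.lower()
    approved = "approved" in reviewer_status.lower()
    return tests_pass and approved

# Both conditions hold: the Coordinator may direct the Recorder to commit.
assert recorder_may_commit("All tests passing (42/42)", "Approved: meets standards")
# Reviewer requested changes: no commit, regardless of test results.
assert not recorder_may_commit("All tests passing (42/42)", "Changes requested: naming")
```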

Compaction Resilience

The handoff file protocol described above is not optional. It is the mechanism that makes long-running multi-agent projects viable.

Compaction — the lossy summarization of older context — will occur during any session that runs long enough. When it occurs, the agent loses detailed reasoning, file-level understanding, and accumulated decisions. Without a handoff file, the agent after compaction may re-read files it already analyzed, reverse decisions it already made, or repeat work it already completed.

The handoff file prevents this by externalizing the agent's working memory to disk. The discipline is straightforward: update the handoff file after every action, before ending any cycle. The file must contain not just what was done, but why. A decision recorded as "we chose dataclasses" is useless after compaction. A decision recorded as "we chose dataclasses because the codebase has no validation requirements and dataclasses introduce zero dependencies" survives a memory wipe and prevents the post-compaction agent from revisiting the question.

Safety Configuration

Multi-agent systems amplify both productivity and risk. Multiple agents with write access to a shared filesystem can interfere with each other, and a single confused agent can delete files, overwrite source material, or commit untested code.

The safety model operates in three layers:

Layer 1: AI agent settings. The settings file defines three permission tiers. The deny tier blocks destructive operations on the ground truth folder — no writes, edits, or file manipulations targeting the source material. The ask tier places dangerous commands behind a human approval prompt. The allow tier grants autonomous access to safe operations: reading files, running tests, git commands, standard development tools. Dangerous commands are available only to the Coordinator; other agents that require such operations route the request through the Coordinator.

Layer 2: Sandbox restrictions. Operating system-level sandbox mechanisms enforce write restrictions at the kernel level, independent of what the agent attempts through its tools. On Linux, this is bubblewrap; on macOS, Seatbelt.

Layer 3: Operating system permissions. Making the ground truth folder read-only via chmod provides a final safeguard that no software-level configuration can override.

Beyond file protection, each agent's role file defines explicit boundaries: the Developer writes to project/src/ and nowhere else; the Tester writes to project/tests/ and nowhere else; the Reviewer reads everything but writes only to its own status files.

From Bootstrap to Production

The bootstrapping procedure described here — define the project with the Coordinator, configure permissions, initialize the repository, build context with the Scientist, activate development agents incrementally — establishes a system that can operate with minimal human intervention for extended periods.

The human monitors two sessions: the Coordinator for project-level questions and the Scientist for domain-level questions. All other agents are managed by the Coordinator and do not require human attention. The commit chain ensures that only tested, reviewed code enters the repository. The handoff files provide continuity across compaction events.

The result is not full automation. It is a structured collaboration between a human with domain authority and a team of specialized agents, each operating within defined boundaries, communicating through a simple protocol, and producing a codebase that is tested, reviewed, documented, and version-controlled from its first commit.


Seyyed Mehdi Hosseini Jenab — Senior Research Scientist at OTI Lumionics, working at the intersection of HPC, quantum chemistry, and AI-assisted scientific software development.

#AgenticAI #MultiAgent #SoftwareEngineering #ComputationalScience #TDD
