Claude Code - Beyond Basic Usage: A Mental Model of the Deeper Stack

Rohit Sharma

Published May 3, 2026

I have been using Claude Code for a while now. And I can say - it is the most capable agentic coding tool I have worked with. The way it reads your codebase, reasons across files, runs commands, verifies its own work - when it clicks, it feels like a genuine shift in how software gets built.

It's not complex to start with - you install it, type claude, and it works. But it's complex to understand deeply.

The complexity shows up later, when you try to understand the full extension surface and how to wield it properly - beyond the basic usage.

The Extension Sprawl

Depending on how you count, the documentation exposes more than a dozen surfaces: CLAUDE.md, rules, memory, skills, commands, MCP, subagents, agent teams, hooks, plugins, channels, scheduled tasks, and more.

Individually, the documentation for each of these makes sense. But the moment I tried to hold the full picture - when do I use a skill vs a rule? How is a subagent different from an agent team? Where does a hook fit vs a skill? - the whole thing collapsed into a flat list of features that felt like they were competing with each other.

It took real time building with the tool before the confusion cleared. And what I realized was this: these are not twelve competing features. They are five layers solving five different problems. None of them are mandatory. You add them one at a time, only when a specific friction point shows up. The friction itself tells you which layer to reach for.

I am still learning this tool deeply. But the mental model I have built so far has been solid enough that I wanted to share it. This is not a documentation summary - it is how I actually think about the Claude Code extension stack after working with it.

First, what are you extending?

Understanding this prevents the "I need to configure everything before I can be productive" anxiety. You do not.

Claude Code is an agentic loop with built-in tools. You give it a task, and it figures out how to do it using file access, bash, search, and web access. That baseline handles most coding work out of the box. Everything below is the extension layer you add on top when specific friction shows up.

The five layers

After working through the docs and building with the tool, here is how I organize the extension stack in my head. Five layers, each one triggered by a different wall you hit.

Notice something about the cost column. It goes from "every turn" at the top to "zero" near the bottom. That is not accidental. The system is designed so the most expensive extensions are the ones you set up first and keep small, while the powerful features lower down cost almost nothing until the moment of need.

Let me walk through each layer - not what it is theoretically, but when you hit the wall that makes you need it, and what actually happens at runtime when it kicks in.

Layer 1: Always-on context

You start a new session. Claude does not know your build command is pnpm test. It does not know you use 2-space indentation. It does not know never to commit directly to main. You tell it. Next session, same conversation. Third time, you realize you need to write this down somewhere Claude will read every time.

That somewhere is CLAUDE.md. It is plain markdown that lives at your project root, or at ~/.claude/CLAUDE.md for personal preferences across all projects. Claude reads it at session start, walking up your directory tree and collecting every CLAUDE.md file it finds.
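
One way to picture that directory walk is the Python sketch below. It is only a model of the behavior described above, not Claude Code's actual loader. The function name `collect_claude_md` and the ordering of the results (personal file first, then outermost directory to innermost) are assumptions made for illustration.

```python
from pathlib import Path


def collect_claude_md(start: Path, home: Path) -> list[Path]:
    """Return the CLAUDE.md files visible from `start`, personal file first.

    A model of the lookup described in the text, not the real loader.
    """
    found: list[Path] = []

    # Personal preferences that apply to every project.
    personal = home / ".claude" / "CLAUDE.md"
    if personal.is_file():
        found.append(personal)

    # Visit the filesystem root first and `start` last, so broader
    # instructions precede more specific ones (this ordering is assumed).
    for directory in reversed([start, *start.parents]):
        candidate = directory / "CLAUDE.md"
        if candidate.is_file():
            found.append(candidate)

    return found
```

Calling `collect_claude_md(Path.cwd(), Path.home())` from a nested source directory would list the personal file, then the project-root file, then any file in the nested directory itself.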

But here is the problem I ran into. My CLAUDE.md grew to 400 lines: API conventions, deployment steps, database schema notes. Claude started following instructions less reliably because the context window was getting noisy. That is where .claude/rules/ comes in: a directory of smaller markdown files, so instructions can be split by topic instead of piling up in one file.

The practical takeaway for Layer 1: everything here pays context cost on every turn. CLAUDE.md content, unconditional rules, auto memory - they are all sitting in your context window the entire session. The path-scoped rules feature is the escape hatch. A rule with paths: ["src/api/**/*.ts"] only enters context when Claude opens a matching file. Your API conventions do not consume tokens when Claude is editing a Dockerfile. This is how you scale instructions without drowning the context.
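
The path-scoping idea can be modeled in a few lines of Python. This is a sketch of the behavior as described above, not how Claude Code implements it. The `Rule` class, the rule names, and the small glob translator are all hypothetical. The translator handles only `**/` and `*`, which is enough for the pattern quoted in the text.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    name: str
    paths: Optional[list[str]] = None  # None means the rule is unconditional


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a small glob subset: `**/` spans directories, `*` stays in one."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directory levels
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")     # anything except a path separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def rules_in_context(rules: list[Rule], opened_files: list[str]) -> list[str]:
    """Names of the rules that would occupy context given the files opened so far."""
    active = []
    for rule in rules:
        if rule.paths is None:
            active.append(rule.name)  # unconditional: always loaded
        elif any(glob_to_regex(p).match(f) for p in rule.paths for f in opened_files):
            active.append(rule.name)  # scoped: loaded once a file matches
    return active


rules = [
    Rule("code-style"),
    Rule("api-conventions", paths=["src/api/**/*.ts"]),
]

print(rules_in_context(rules, ["Dockerfile"]))
print(rules_in_context(rules, ["Dockerfile", "src/api/users/list.ts"]))
```

In this model the API conventions are absent while only the Dockerfile is open, and appear as soon as a file under src/api/ is touched.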

Layer 2: On-demand knowledge

This is the layer I wish I understood earlier. I had been stuffing everything into CLAUDE.md - my deployment checklist, my API style guide, my database schema docs. The moment I understood the two-phase loading mechanism of skills, I moved half of it out and everything got more reliable.

A skill is a SKILL.md file in a directory under .claude/skills/. The directory name becomes a slash command. But the key behavior is this: at session start, only the skill's description (one line from its frontmatter) enters your context. The full content loads only when invoked - either by you typing /deploy or by Claude automatically recognizing a matching task.
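
The two-phase loading is easier to reason about with a toy model. The Python sketch below is an illustration under stated assumptions, not Claude Code's implementation. The `Skill` class, the skill names, and the filler bodies are hypothetical, and character counts stand in for token cost.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str         # directory name, doubles as the slash command
    description: str  # the short line that is always visible
    body: str         # full SKILL.md content, loaded on demand


def session_start_context(skills: list[Skill]) -> str:
    """Phase 1: only names and descriptions are placed in context."""
    return "\n".join(f"/{s.name}: {s.description}" for s in skills)


def invoke(skills: list[Skill], command: str) -> str:
    """Phase 2: the full body is loaded for the one skill that was invoked."""
    for s in skills:
        if command == f"/{s.name}":
            return s.body
    raise KeyError(f"no skill named {command}")


skills = [
    Skill("deploy", "Deploy checklist for staging and production", "step\n" * 200),
    Skill("api-style", "REST conventions for this codebase", "rule\n" * 300),
]

index = session_start_context(skills)
print(len(index))                      # small: two short lines
print(len(invoke(skills, "/deploy")))  # larger, but only paid when invoked
```

The point of the model is the asymmetry: adding a tenth skill grows the index by one line, while its body costs nothing until someone types the command or the task matches.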

MCP (Model Context Protocol) servers solve the other flavor of this wall. You keep copying data from Jira or your database into the conversation. MCP connects Claude to those systems directly.

One composition pattern I have found genuinely useful: MCP provides the connection to an external system, but a skill teaches Claude how to use it well. I connected an MCP server to our database. Queries were technically working but Claude was writing inefficient joins and hitting the wrong tables. Adding a skill with our data model and common query patterns fixed it. The MCP server gives Claude the ability. The skill gives Claude the judgment.

Layer 3: Delegation

This wall hit me when I asked Claude to research how authentication worked across a codebase before making changes. It read 40 files, ran extensive grep searches, produced pages of intermediate output. My conversation was buried. My context window was full. I compacted and lost context I actually needed.

The fix is subagents. Claude spawns a worker in its own isolated context window. The subagent reads those 40 files in its own space and returns only a summary to your main conversation. Your context stays clean.
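
Context isolation can be pictured with a small Python model. This is a sketch of the idea only. The function `run_subagent`, the 40 generated files, and the string search for "auth" are hypothetical stand-ins for whatever research a real subagent would do.

```python
def run_subagent(task: str, files: dict[str, str]) -> str:
    """Do the noisy work in a private context and hand back only a summary."""
    private_context: list[str] = [task]
    hits = []
    for path, text in files.items():
        private_context.append(text)  # full file content stays in here
        if "auth" in text:
            hits.append(path)
    # Only this one-line summary crosses back to the caller.
    return f"{len(hits)} of {len(files)} files touch auth: {', '.join(sorted(hits))}"


main_context: list[str] = ["user: how does authentication work here?"]

# 40 hypothetical files; every fourth one mentions auth.
files = {
    f"src/module_{i}.py": ("import auth\n" if i % 4 == 0 else "pass\n") * 50
    for i in range(40)
}

main_context.append(run_subagent("map the auth flow", files))

print(len(main_context))  # the question plus one summary entry
```

Everything the worker read is discarded when `run_subagent` returns, which is the property that keeps the main conversation small.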

Claude Code ships with three built-in subagents it uses automatically: Explore (fast Haiku model, read-only, for codebase search), Plan (research during plan mode), and General-purpose (all tools, for complex multi-step tasks). You can define your own as markdown files in .claude/agents/ or ~/.claude/agents/.

Agent teams solve a different problem. Your workers need to talk to each other, not just report to you. Each teammate is an independent Claude Code session. They share findings, challenge each other, and coordinate via a shared task list.

One constraint worth knowing upfront: subagents cannot spawn other subagents. The nesting is one level deep by design. And agent teams are experimental, disabled by default. For most workflows today, subagents are the practical choice.

Layer 4: Deterministic automation

This one is different from everything above. Every other layer involves the LLM reasoning about what to do. Hooks do not. A hook is a shell command that fires on a lifecycle event. After every file edit. Before a bash command executes. When Claude sends a notification.

The wall I hit: I added "always run prettier after editing files" to my CLAUDE.md. Claude followed it most of the time. But "most of the time" is not "every time." Some edits slipped through unformatted. The moment you need guaranteed behavior regardless of what the LLM decides, you need a hook.
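
For the prettier case, the hook command could be a small script like the Python sketch below. It assumes the hook receives a JSON event on stdin with the edited path at `tool_input.file_path`. The function names and the extension list are hypothetical. Registering the script in settings (event name, matcher) is left out here, so check the hooks reference for the exact shape.

```python
import io
import json
from typing import Callable, Optional, TextIO

# Extensions worth formatting; adjust to the project (assumed list).
FORMATTABLE = (".ts", ".tsx", ".js", ".jsx", ".json", ".css", ".md")


def prettier_command(event: dict) -> Optional[list[str]]:
    """Build the prettier command for an edit event, or None if nothing to do.

    Assumes the event JSON carries the edited path at tool_input.file_path.
    """
    path = event.get("tool_input", {}).get("file_path", "")
    if path.endswith(FORMATTABLE):
        return ["npx", "prettier", "--write", path]
    return None


def run_hook(stream: TextIO, runner: Callable[[list[str]], object]) -> bool:
    """Read one event from `stream` and format the file; True if prettier ran."""
    command = prettier_command(json.load(stream))
    if command is None:
        return False
    runner(command)
    return True


# Dry run with a fake event and a runner that only records the command.
calls: list[list[str]] = []
fake_event = io.StringIO(json.dumps({"tool_input": {"file_path": "src/app.ts"}}))
print(run_hook(fake_event, calls.append), calls)

# As a real hook script, the last line would instead be something like:
#   run_hook(sys.stdin, subprocess.run)
```

Because the script runs outside the model, it fires on every matching edit whether or not Claude remembers the instruction, which is the guarantee the CLAUDE.md line could not give.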

The lifecycle has over 25 events covering every phase of a session - from SessionStart to SessionEnd, with granular events for every tool call, permission prompt, subagent spawn, and context compaction in between. The four above are the ones I reach for most in daily work.

Layer 5: Packaging

Brief because the concept is simple. You spent time configuring skills, hooks, subagents for a project. Your second repo needs the same setup. Plugins bundle all of it into a single installable directory with a .claude-plugin/plugin.json manifest. Skills get namespaced - /my-plugin:review - so multiple plugins coexist cleanly.
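
To make the directory shape concrete, here is a Python sketch that scaffolds a minimal plugin in a temporary folder. The manifest path and the /my-plugin:review command come from the text above. The manifest fields, the skills/ layout inside the plugin, and the function name `scaffold_plugin` are assumptions to verify against the plugin documentation.

```python
import json
import tempfile
from pathlib import Path


def scaffold_plugin(root: Path, name: str, skill: str) -> Path:
    """Create a minimal plugin directory with one skill and return its path."""
    plugin = root / name

    # Manifest location is from the article; the fields are assumed.
    manifest_dir = plugin / ".claude-plugin"
    manifest_dir.mkdir(parents=True)
    manifest = {"name": name, "description": "Shared review setup", "version": "0.1.0"}
    (manifest_dir / "plugin.json").write_text(json.dumps(manifest, indent=2))

    # One skill, laid out the same way as under .claude/skills/ (assumed).
    skill_dir = plugin / "skills" / skill
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(
        f"---\nname: {skill}\ndescription: Review a change against team conventions\n---\n\n"
        "Steps for the review go here.\n"
    )
    return plugin


plugin = scaffold_plugin(Path(tempfile.mkdtemp()), "my-plugin", "review")
for path in sorted(p.relative_to(plugin) for p in plugin.rglob("*") if p.is_file()):
    print(path)
print(f"/{plugin.name}:review")  # the namespaced command from the text above
```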

Use standalone .claude/ config while you are iterating. Convert to a plugin when reuse becomes the goal. That is the whole decision.

What loads when you type claude

This is the part that made everything click for me. Understanding the startup sequence turns the flat feature list into a layered architecture with clear cost tradeoffs.

The cost gradient is the architecture. Expensive always-on context at the top, kept small by design. Lightweight indexes in the middle that scale to dozens without cost. Full content that materializes only at the moment of need. And hooks at the bottom that cost nothing at all. This is why the system does not collapse under its own feature weight.

The cheat sheet

If you take one thing from this article, take this. Each friction point maps to exactly one feature. The friction is the signal. The feature is the response.

Claude Code is the most powerful agentic coding tool I have used. It is also one of the most layered. The sprawl is real - twelve extension types is a lot to hold in your head. But once you see them as five layers triggered by five different walls, the "what do I use when" question stops being confusing and starts being obvious.

I am still learning this tool. Every week I discover a pattern or a composition I had not thought of. But this five-layer mental model has been stable enough that everything new I learn snaps into one of these layers cleanly. That is how I know the model is directionally right.

Faisal Feroz 2h

Most teams adopt these features in the wrong order because they treat them as a checklist instead of a response to friction. In my own work I have seen skills earn their place only after the prompt sprawl becomes painful and subagents only after a single context gets overloaded with conflicting roles. A useful test before adopting any extension is to ask what production failure mode it actually prevents. If you cannot name one, you are adding tech debt dressed up as productivity.

Mark Ebert 22h

You can also modify the weights of the parameters to optimize how it responds.

Tony L. 22h

6 months ago I would have agreed. Then codex took that mantle. Across the board, codex is just better, more reliable, and with 5.5 max, better value in every way

Rohit Sharma 1d

For Ref, Previous article on .md sprawl: https://www.garudax.id/posts/rohit0221_agenticai-lessonsfromthetrenches-weekendairesearch-activity-7446100909681651712-zgsE?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAOdCLsBGlPwQUtH6Q9_U4AGb_cPlYQ9lXc

Jitin Kapila 1d

Would love to read more. I would also suggest giving the PI coding agent / Oh My Pi a shot later.
