Every developer on our team was teaching Claude the same things. Independently.

When AI coding assistants became mainstream, we did what most engineering teams do. We told developers: "Claude is available, go use it." And they did. Enthusiastically.

But six months in, I noticed a pattern. Every time a new developer joined a project, they spent the first few days writing the same prompts as the person before them. "Here's how our Java testing conventions work." "Here's how we use the 4D framework." "Here's what our stack looks like." The AI was getting smarter per session. But none of that knowledge was leaving the room.

We had an organizational knowledge problem disguised as an AI adoption story.

The real cost wasn't time. It was inconsistency.

The hidden tax wasn't just the minutes spent re-explaining things to Claude. It was that ten developers were producing ten slightly different versions of Claude's behavior. One had a well-tuned prompt for Java test writing. Another had figured out how to make Claude coach them on AI fluency. A third had neither.

The quality gap between "Claude with good context" and "Claude without it" is significant, and we were letting that gap be determined by individual effort rather than shared infrastructure.

The question I started asking: what if Claude's knowledge of our codebase, our conventions, and our workflows was something we could package, version, review, and distribute, like any other piece of internal tooling?

Skills as a unit of AI knowledge

Claude Code has a mechanism called skills: markdown files that install directly into a project's CLAUDE.md, giving Claude persistent, structured context. A skill is not a prompt you type. It's an instruction layer that travels with the repository.
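To make this concrete, here's a minimal sketch of what a skill file might look like. The front-matter fields and the rules themselves are illustrative assumptions, not our actual java-test-writing skill:

```markdown
---
name: java-test-writing
description: Apply the team's Java testing conventions when writing tests
version: 1.0.0
---

When writing Java tests in this repository:
- Name test methods `should<ExpectedBehavior>_when<Condition>`.
- Prefer BDDMockito's `given(...).willReturn(...)` over `when(...).thenReturn(...)`.
- One behavior per test method; assert with AssertJ.
```

Note the imperative voice: these are instructions Claude follows, not documentation a human reads.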

We decided to treat skills the way we treat shared libraries. You don't expect every developer to rewrite their own HTTP client. You shouldn't expect every developer to re-teach Claude your testing standards either.

So on March 2nd, 2026, we built an internal catalog of reusable Claude skills for our R&D team. The pitch was simple:

"Install battle-tested AI instructions that make Claude smarter about your stack, your workflows, and your codebase, without each developer reinventing the wheel."

What it looks like in practice

The catalog launched with two skills. They're deliberately different.

→ java-test-writing, contributed by Emmanuel D., one of our senior engineers. 631 lines of structured, opinionated instructions: naming conventions, Mockito preferences, how to structure test classes. When a developer installs this skill, Claude immediately applies our actual testing standards, not generic Java advice.

→ 4d-coach, contributed by Imad ATTAR, one of our most recent hires. A coaching skill based on Anthropic's AI Fluency framework. When active, Claude surfaces a 4D coaching moment when it's useful. Its guiding principle: "Don't block. If the request is already excellent, say so and let it through." Don't get in the way of someone who doesn't need coaching.

Then, two days after launch, a third arrived.

→ update-gha-sha, contributed by Adrien Kantcheff through the standard PR process. It pins GitHub Action references to full commit SHAs instead of mutable tags, following GitHub's own security recommendations for supply-chain hardening. The motivation is concrete: a compromised action repository could silently replace a tagged release with malicious code. SHA pinning guarantees you run exactly the code you reviewed. The automated Claude review fired 37 seconds after the PR was opened. The skill passed 5 out of 6 criteria, and a human reviewed and accepted it the day after. The PR footer: "Generated with Claude Code."
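For context, here's what SHA pinning looks like in a workflow file. The SHA below is a made-up placeholder, not a real actions/checkout commit:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Before: a mutable tag. The code behind "v4" can change after review.
      # - uses: actions/checkout@v4

      # After: pinned to a full commit SHA (placeholder value shown), with
      # the tag kept as a trailing comment for human readability.
      - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # v4
```

The trailing tag comment is the convention GitHub's own hardening guide suggests, so humans can still see which release a SHA corresponds to.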

Three skills across three domains: AI fluency coaching, Java testing conventions, CI/CD security. One week in.

Treating AI instructions like software

The part I'm most proud of isn't the skills themselves. It's the infrastructure we built around them.

A CLI installer. One command fetches a skill from GitHub, injects it into your project's CLAUDE.md between versioned delimiters, and registers it. Seven commands in total: list, info, install, profile, remove, update, upgrade. Cross-platform: Bash and PowerShell implementations both ship with the repo.
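The article doesn't show the delimiter format, but the idea is that each installed skill occupies a clearly marked, versioned region of CLAUDE.md that the installer can find again on update or removal. A hypothetical sketch:

```markdown
<!-- BEGIN SKILL: java-test-writing v1.0.0 (managed by installer, do not edit) -->
...skill instructions injected here...
<!-- END SKILL: java-test-writing -->
```

Because the version lives in the delimiter, `update` can diff what's installed against the catalog and `remove` can delete the region cleanly without touching hand-written context around it.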

A machine-readable catalog. Skills have names, descriptions, versions, tags, authors. It powers the CLI and will eventually power a web UI.
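The article names the fields (names, descriptions, versions, tags, authors); the values and exact JSON shape below are illustrative assumptions:

```json
{
  "name": "java-test-writing",
  "version": "1.0.0",
  "description": "Team conventions for writing Java unit tests",
  "tags": ["java", "testing"],
  "author": "Emmanuel D."
}
```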

An automated review pipeline. Every new skill submitted as a PR gets reviewed by Claude before a human looks at it. The workflow checks for credentials, suspicious URLs, shell injection patterns, prompt injection attempts, and posts structured feedback directly on the PR. Claude reviewing Claude skills. The meta layer is intentional.
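The real review is performed by Claude itself, so it can judge intent and not just patterns. As a rough illustration of the mechanical half of such a check, a simplified scanner might look like this; the regexes and their names are assumptions, not our actual rules:

```python
import re

# Simplified illustration only: the production pipeline has Claude review
# each skill. These patterns are illustrative, not the team's real rules.
CHECKS = {
    "credential": re.compile(
        r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{16,}"
    ),
    "shell_injection": re.compile(r"curl[^\n]*\|\s*(?:ba)?sh"),
    "prompt_injection": re.compile(r"(?i)ignore (?:all )?previous instructions"),
}

def scan_skill(text: str) -> list[str]:
    """Return the names of every check that flags the skill text."""
    return [name for name, pattern in CHECKS.items() if pattern.search(text)]
```

The value of automating this layer is speed: structured feedback lands on the PR within seconds, before a human ever looks.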

An external skills system. A full lifecycle for importing public GitHub skills: issue → validation → auto-scaffolded PR → review → merge → weekly upstream update check. The infrastructure is fully built. We haven't imported any external skills yet, but the day we find one worth bringing in, it takes minutes.

Slack notifications on every skill merge, so the team knows what's been added.

Two audiences, one system

Consumers (developers who browse and install skills) get a guide on how to discover, install, and stack skills. Yes, stack: skills are designed to be composable. A developer might run 4d-coach, java-test-writing, and a future context skill simultaneously.

Producers (developers who want to contribute) get specific guidance on writing quality skills. A few principles that shaped the quality bar:

  • A skill should have a clear, bounded purpose. If you find yourself writing instructions that cover multiple unrelated concerns, split them.
  • You are giving Claude direct instructions, not writing documentation. Use imperative voice.
  • Every skill you install increases the size of Claude's system context. Respect that shared space.
  • If 3+ developers would benefit from it across different projects, it's probably a good skill.
  • Your skill must not break when combined with others.

That last point is the hardest discipline to maintain. Skills that assume they're the only thing in context tend to over-instruct and break when stacked.

One day. 27 commits. Two engineers.

Everything above (the CLI, the catalog, the review workflow, the external skills system, the documentation, the skills, the Slack integration) was built and committed in a single day.

The repo was bootstrapped from a single detailed prompt: 204 lines specifying every file, its purpose, and its quality bar. Claude Code executed that prompt and produced the full foundation. The rest of the day was iteration, review, and extending the design.

We used the tool to build the tool for using the tool. At some point the recursion stops being surprising and starts being the point.

What we learned

The format is everything. A skill that dumps context without structure doesn't work. Structure tells Claude not just what to do, but when to apply it.

Skills travel with the team, not the developer. When a new engineer joins a project that already has the catalog set up, they get the team's collective AI knowledge from day one. That's the compounding value.

Security scanning matters even for text files. A SKILL.md is injected directly into Claude's context. A malicious or poorly-written skill is a prompt injection vector. We almost skipped the scan. We're glad we didn't.

This is not a finished project. That's the point.

The infrastructure took a day to build. What comes next takes months.

PR #1 arrived two days after launch, from someone who wasn't part of building the system. One data point, but an encouraging one. The real test is whether that continues without being asked.

We're also watching governance at scale. Right now, distribution means running install.sh per project, manually. That works for a small team. It gets harder at 50+ developers across a dozen active repositories.

A door worth opening

We're currently on the Team plan and in conversations with Anthropic about Enterprise. I won't make claims about what we'll do. But there's one capability worth naming.

Claude Code supports a managed-settings.json file that enforces organization-wide policies that individual developers cannot override. Skills are about what Claude knows. Managed settings are about how Claude is allowed to behave. They're complementary, not competing.
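As a sketch of what that layer can express, a managed settings file might deny specific tool invocations outright. The permission-rule syntax follows Claude Code's settings schema as we understand it, and the specific rules here are hypothetical:

```json
{
  "permissions": {
    "deny": [
      "Bash(curl:*)",
      "Read(./secrets/**)"
    ]
  }
}
```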

If we go that route, nothing we've built gets thrown away. The skills format carries over. The managed settings layer sits on top of what already exists.

Whether we go that route is an open question. But it's a door worth knowing about.

If you're running an engineering team where multiple people use Claude, and you're not sharing AI context the way you share code, the question worth sitting with is:

How much collective knowledge is your team re-teaching from scratch every week?

We started answering that question a few days ago. Ask me again in six months.


More articles by Pablo Alonso de Linaje
