Building an AI-native design system that is token-efficient with Claude Code + Figma Console MCP

Design systems have always been about constraint. You define tokens, components, and rules — and everything built within the system inherits coherence from that shared vocabulary.

What's changed is who is doing the building.

More and more, the entity consuming a design system isn't a human developer reading documentation. It's an AI agent interpreting structured data, making token lookups, and generating UI without ever asking a clarifying question. That shift demands a different kind of design system — one built not for human readability, but for machine interpretability.

So I decided to build one from scratch. Claude Code as the builder. Figma Console MCP as the bridge. The 6-layer constraint framework I published a few weeks ago as the blueprint.

This is what happened.


A quick anchor

If you're not familiar with the 6-layer framework, this article gives you the full picture. The short version: six layers, each with a specific job, organized so agents load only what they need and never guess. Foundation, context, governance, agent instructions, source of truth, enforcement.


Before anything else: why Figma Console MCP and not Figma MCP

This is worth saying directly before getting into the build.

Figma's official MCP is optimized for task-driven workflows — get design context, generate code, hand off. It's great for that specific use case.

Figma Console MCP by TJ Pitre is something different. 94+ tools, dedicated variable management, batch operations, and a WebSocket Desktop Bridge that gives the agent real-time canvas awareness. It knows what's selected, what changed, what page you're on — live.

For a system built around governance and versioning, that distinction matters enormously. The diff routine that runs at the start of every session — reading the Figma file, comparing against the registry, surfacing what changed — only works because Figma Console MCP gives the agent actual live canvas access, not a snapshot.

Figma MCP would have made that step manual. That's why Figma Console MCP won.


The file structure is the framework

The first decision was how to map the six layers to a real folder structure.

No abstraction. No indirection. The repo structure is the architecture.

├── CLAUDE.md              # Foundation layer — loads every session
├── skill.md               # Operational logic — loaded on demand
├── agents/                # Agent instruction layer
├── components/            # Component specs (JSON + MD per component)
│   ├── alert/
│   ├── avatar/
│   ├── badge/
│   ├── button-action/
│   ├── checkbox/
│   ├── icon-button/
│   ├── input-text/
│   ├── tag/
│   └── toggle/
├── context/               # Context layer — fetched on demand
├── governance/            # Governance layer — constraint files
├── enforcement/           # Lint and drift scripts
├── versioning/            # Version manifest and hashes
└── preview/               # Development artifact — not governance        

Every folder maps to a layer. The agent knows exactly where to look for what. No ambiguity about where the source of truth lives, where the constraints are defined, or which file loads first.


The Figma file follows the same logic

Five pages, each with a specific role, named with numeric prefixes so the order is explicit.

The naming conventions frame on 00_readme is the first thing the agent reads when it accesses the Figma file via Figma Console MCP:

Components: kebab-case, semantic — e.g. card-product, button-action
Variants: prop=value format — e.g. size=sm|md|lg
Tokens: slash-separated by role — e.g. color/surface/primary
Pages: nn_name — e.g. 00_readme, 02_components
Files/Folders: kebab-case — e.g. button-action.json, button-action/        
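
The conventions above are regular enough to check mechanically. Here is a minimal sketch in JavaScript; the `NAMING_RULES` table and the `checkName` helper are hypothetical names, and the regular expressions are one possible reading of the rules rather than the repo's actual lint:

```javascript
// Hypothetical validators for the naming conventions listed above.
// Each regex is an assumption about how strictly the rule should be read.
const NAMING_RULES = {
  // kebab-case, e.g. button-action
  component: /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/,
  // prop=value, e.g. size=md
  variant: /^[a-z][a-zA-Z0-9]*=[a-z0-9]+(-[a-z0-9]+)*$/,
  // slash-separated with at least two segments, e.g. color/surface/primary
  token: /^[a-z][a-z0-9-]*(\/[a-z0-9][a-z0-9-]*)+$/,
  // nn_name, e.g. 00_readme
  page: /^\d{2}_[a-z][a-z0-9-]*$/,
};

// Returns true when `name` satisfies the rule registered for `kind`.
function checkName(kind, name) {
  const rule = NAMING_RULES[kind];
  if (!rule) throw new Error(`Unknown naming kind: ${kind}`);
  return rule.test(name);
}
```

A check like this could run in the enforcement layer before anything is written to Figma or to the registry, so a misnamed component is rejected instead of silently creating a second vocabulary.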

The MCP ready badge in the bottom right is not decorative. It means the agent can read and write to the Figma file directly from Claude Code, in real time.

Figma is downstream from the token files. But Figma Console MCP makes the connection live — a token update in tokens.json can propagate to Figma variables. A component update in Figma can be diffed against the registry.

The sync direction is enforced in CLAUDE.md: Figma wins on conflicts. The canonical source is the code files.


The governance layer is the most important part

This is the layer most teams skip.

It also does the most work.

The governance layer doesn't tell the agent "please use the correct border radius." It gives the agent a file where the only border radius values that exist are the correct ones. The agent cannot hallucinate a value that isn't in the file.

Flat tokens, not nested

Every token is a flat key:

"color/accent/primary": { "light": "#0066FF", "dark": "#3385FF" },
"color/accent/hover":   { "light": "#0052CC", "dark": "#1A75FF" },
"color/accent/disabled":{ "light": "#B3CFFF", "dark": "#1A3A6B" }        

The path is the name. No nesting to traverse, no intermediate objects to misinterpret. An agent looking for color/accent/primary finds exactly one entry, with exactly two mode values.
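
To make the "exactly one entry, exactly two mode values" point concrete, here is a small sketch of a strict lookup over a flat map. The token values are copied from the excerpt above; `resolveColor` is a hypothetical helper, not something taken from the actual repo:

```javascript
// Token values copied from the excerpt above.
const tokens = {
  "color/accent/primary": { light: "#0066FF", dark: "#3385FF" },
  "color/accent/hover": { light: "#0052CC", dark: "#1A75FF" },
  "color/accent/disabled": { light: "#B3CFFF", dark: "#1A3A6B" },
};

// Hypothetical helper: one flat key, one mode, no traversal and no fallback.
function resolveColor(tokenMap, path, mode) {
  // Own-property check so inherited keys such as "toString" never count as tokens.
  if (!Object.prototype.hasOwnProperty.call(tokenMap, path)) {
    throw new Error(`Unknown token: ${path}`);
  }
  const entry = tokenMap[path];
  if (!Object.prototype.hasOwnProperty.call(entry, mode)) {
    throw new Error(`Token ${path} has no "${mode}" mode`);
  }
  return entry[mode];
}
```

The design choice worth noting is the absence of a default: an unknown path or mode raises an error instead of returning a nearby value, which is the same "never guess" behavior the governance layer asks of the agent.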

Every category also carries annotation fields:

"_type": "color",
"_apply": "fill",
"_scale": "surface · text · border · accent · feedback · critical · neutral"        

The t-shirt size problem

sm, md, lg everywhere looks consistent. But it creates a dangerous assumption: that spacing/md and radius/md are related values.

They are not. spacing/md is 16px. radius/md is 6px.

To fix this, spacing tokens are also component-specific — not just a global scale:

spacing/md        → 16

spacing/button/padding-x-md  → 16
spacing/button/padding-y-md  → 10
spacing/input/padding-x-md   → 12        

The base scale gives the agent the hierarchy. The component-specific tokens give it the exact value for the exact context. And skill.md makes the rule explicit:

▎ spacing/md and radius/md are both "md" but have different values.
  Always resolve against governance/tokens.json.
  Never infer a value from the size label alone.        
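
The rule can also be backed by a check rather than left to the prompt alone. The sketch below uses the values quoted above; `findUnresolvedRefs` is a hypothetical lint helper, and the idea that a spec is linted this way is an assumption, not a description of the actual enforcement scripts:

```javascript
// Values taken from the article.
const tokens = {
  "spacing/md": 16,
  "radius/md": 6,
  "spacing/button/padding-x-md": 16,
  "spacing/button/padding-y-md": 10,
  "spacing/input/padding-x-md": 12,
};

// Hypothetical lint helper: returns every reference that is not a full path
// present in the token map, such as a bare "md" that would force the agent
// to guess which scale is meant.
function findUnresolvedRefs(tokenMap, refs) {
  return refs.filter(
    (ref) => !Object.prototype.hasOwnProperty.call(tokenMap, ref)
  );
}
```

Called with `["spacing/button/padding-x-md", "md", "radius/md"]`, it returns only `"md"`: the two full paths resolve, and the bare size label is flagged instead of being interpreted.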

Typography: styles are not variables

This was the sharpest decision in the whole token layer.

In Figma, variables (color, spacing, radius) and text styles are fundamentally different concepts. Variables use setBoundVariable. Text styles use setTextStyleIdAsync. If an agent treats them the same, everything breaks silently — the style doesn't apply, no error is thrown, the agent has no idea.

The fix was to embed the correct API pattern directly in the token metadata:

"typography/label/md": {
  "size": 14,
  "lineHeight": 20,
  "weight": 500,
  "role": "default interactive text, button labels, form field labels, tabs",
  "_not_a_variable": "Do NOT use setBoundVariable. Text styles are a separate Figma concept.",
  "_figmaAPI": "const styles = await figma.getLocalTextStylesAsync(); const s = styles.find(s => s.name === 'typography/label/md'); await node.setTextStyleIdAsync(s.id);"
}        

The agent reads tokens.json and gets the rule and the correct implementation in the same lookup. No separate documentation needed.
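
Expanded from the one-line `_figmaAPI` string, the same pattern might look like the sketch below. It uses only the two Figma Plugin API calls already named in the token metadata. `applyTextStyle` is a hypothetical helper name, and the `figmaApi` parameter is an assumption made so the function can be exercised outside Figma; inside a plugin you would pass the global `figma` object.

```javascript
// Sketch for a Figma plugin context. `figmaApi` stands in for the global
// `figma` object; `node` is expected to be a text node.
async function applyTextStyle(figmaApi, node, styleName) {
  const styles = await figmaApi.getLocalTextStylesAsync();
  const style = styles.find((s) => s.name === styleName);
  if (!style) {
    // Fail loudly instead of silently leaving the node unstyled.
    throw new Error(`Text style not found: ${styleName}`);
  }
  await node.setTextStyleIdAsync(style.id);
  return style.id;
}
```

The explicit "not found" error is the one addition over the embedded string: it turns the silent failure described above into something the agent can see and report.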

The type scale uses a two-weight system: Medium (500) as the base, Bold (700) with a -strong suffix for emphasis. The decision tree is simple — default: typography/label/md, emphasis: typography/label/md-strong. That's the whole model.


The component registry

Every component is a contract. Structured data that defines exactly what can be built, with what properties, using which tokens.

"button-action": {
  "props": {
    "variant": { "values": ["filled", "outline", "ghost", "critical"], "default": "filled" },
    "size":    { "values": ["sm", "md", "lg"], "default": "md" },
    "state":   { "values": ["default", "hover", "disabled"], "default": "default" }
  },
  "tokens": {
    "filled": {
      "default": { "bg": "color/accent/primary", "text": "color/accent/foreground", "stroke": null },
      "hover":   { "bg": "color/accent/hover",   "text": "color/accent/foreground", "stroke": null },
      "disabled":{ "bg": "color/accent/disabled","text": "color/accent/foreground", "stroke": null }
    }
  },
  "spacing": {
    "md": { "padding-x": "spacing/button/padding-x-md", "padding-y": "spacing/button/padding-y-md" }
  },
  "typography": { "md": "typography/label/md" },
  "radius": "radius/md"
}        

An agent building a filled button in hover state does one lookup: tokens["filled"]["hover"]. Three keys come back: two token paths and an explicit null stroke. Nothing else. No color values to remember, no implicit rules about what hover means.
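
As a rough illustration of that lookup, the sketch below validates the requested variant and state against the declared `props` before reading the token paths. The registry literal is a trimmed copy of the entry above (only the `filled` variant is shown there, so only `filled` has tokens here); `lookupTokens` is a hypothetical helper:

```javascript
// Trimmed copy of the registry entry above.
const registry = {
  "button-action": {
    props: {
      variant: { values: ["filled", "outline", "ghost", "critical"], default: "filled" },
      state: { values: ["default", "hover", "disabled"], default: "default" },
    },
    tokens: {
      filled: {
        default: { bg: "color/accent/primary", text: "color/accent/foreground", stroke: null },
        hover: { bg: "color/accent/hover", text: "color/accent/foreground", stroke: null },
        disabled: { bg: "color/accent/disabled", text: "color/accent/foreground", stroke: null },
      },
    },
  },
};

// Hypothetical helper: returns the token paths for one variant/state pair.
function lookupTokens(reg, component, variant, state) {
  const spec = reg[component];
  if (!spec) throw new Error(`Unknown component: ${component}`);
  // Fall back to the declared defaults, never to a guessed value.
  const v = variant ?? spec.props.variant.default;
  const s = state ?? spec.props.state.default;
  if (!spec.props.variant.values.includes(v)) throw new Error(`Invalid variant: ${v}`);
  if (!spec.props.state.values.includes(s)) throw new Error(`Invalid state: ${s}`);
  const entry = spec.tokens[v]?.[s];
  if (!entry) throw new Error(`No tokens defined for ${component} ${v}/${s}`);
  return entry;
}
```

A request for a variant that is declared but has no token mapping fails with an explicit message, which points at a gap in the spec rather than letting the agent improvise one.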


What came out of it

It was fast

Genuinely fast. Components, token wiring, Figma variable creation, the HTML preview — things that would take days manually were done in a single session. That part lived up to the promise.

The preview

A single-file HTML preview with no framework, no build step. Every color value comes directly from the token map as CSS custom properties. Nine components, roughly 120 distinct visual states, all on one page.

The purpose of this isn't aesthetics — it's falsifiability. If a token value is wrong, or a component spec has an inconsistency, it shows up here. The spec and the rendering have to agree.
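
One way the token map could feed such a preview is sketched below. The `toCssCustomProperties` helper and the `--color-accent-primary` naming scheme are assumptions for illustration; the actual preview file may derive its property names differently:

```javascript
// Hypothetical generator: turns flat, mode-keyed tokens into a :root block.
// Slashes in the token path become hyphens in the custom property name.
function toCssCustomProperties(tokenMap, mode) {
  const lines = Object.entries(tokenMap)
    // Skip entries that do not define the requested mode (e.g. scalar tokens).
    .filter(([, value]) => value && typeof value === "object" && mode in value)
    .map(([path, value]) => `  --${path.replace(/\//g, "-")}: ${value[mode]};`);
  return `:root {\n${lines.join("\n")}\n}`;
}
```

Because the CSS is generated from the same map the agent reads, a wrong token value shows up in the rendering with no second copy of the value to keep in sync.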

The Figma components page

The patterns page

The components came out great. Consistent tokens, correct states, proper variant coverage.

But when I tested a built screen, icon inconsistencies showed up. Wrong sizes, mismatched strokes. That one is entirely on me — I never defined icon rules in the governance layer. The agent didn't invent anything wrong, it just didn't have anything to go on. The system worked exactly as designed: if a rule doesn't exist in the files, the agent can't apply it.

The patterns also have spacing issues. Structurally correct, but the spacing between elements feels off in places. Before fixing those manually, I want to define proper layout spacing rules in composition-rules.md first. Manually correcting agent output without updating the governance layer is how drift starts.


Versioning and the diff routine

Every session starts with a diff. CLAUDE.md instructs the agent to read the Figma file via Figma Console MCP, compare it against the component registry, and report what changed before touching anything.

In practice, the terminal output shows the version manifest bumping from 0.1.8 to 0.1.9, the file hashes updating, and the agent logging exactly what was built and with which components.

{
  "version": "0.1.9",
  "figma_file": "FIGMA FILE KEY",
  "sync_direction": "figma-to-code",
  "hashes": {}
}        

The Figma file key is pinned in the manifest, and the agent checks Figma first, every session. Drift can still happen between sessions, but it is surfaced at the start of the next one instead of accumulating silently.
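
The comparison at the heart of that routine can be sketched in a few lines. This covers only the diff and the version bump. Reading component names out of Figma through Figma Console MCP is left out, and both `diffComponents` and `bumpPatch` are hypothetical helpers rather than the repo's actual scripts:

```javascript
// Hypothetical comparison step: both inputs are plain arrays of component names.
function diffComponents(figmaNames, registryNames) {
  const inFigma = new Set(figmaNames);
  const inRegistry = new Set(registryNames);
  return {
    onlyInFigma: figmaNames.filter((n) => !inRegistry.has(n)).sort(),
    onlyInRegistry: registryNames.filter((n) => !inFigma.has(n)).sort(),
  };
}

// Patch-level bump for a "major.minor.patch" string, e.g. the 0.1.8 to 0.1.9 step.
function bumpPatch(version) {
  const parts = version.split(".").map(Number);
  if (parts.length !== 3 || parts.some((p) => !Number.isInteger(p) || p < 0)) {
    throw new Error(`Not a major.minor.patch version: ${version}`);
  }
  parts[2] += 1;
  return parts.join(".");
}
```

An empty result on both sides means the file and the registry agree on which components exist; anything else is what the agent would report before touching the canvas.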


The honest take

The 6-layer framework held. Folder structure maps cleanly. The governance layer does exactly what it's supposed to — the agent cannot invent values that don't exist in the files.

What was harder than expected: the typography/variable distinction. Every agent I tested wanted to use setBoundVariable on text styles at least once. Better prompting didn't fix it. Embedding the correct API call in the token metadata did.

What the icon inconsistencies taught me: gaps in the governance layer are documentation failures, not agent failures. The agent built exactly what the system described. If you want consistent icons, you have to define what consistent icons look like. That's still the designer's job.


What's next

The current setup is mostly one-directional. Code is the source of truth, Figma reflects it.

The next step is a true dual workflow — changes made in Figma propagate back to the code files, and changes in code propagate to Figma. The governance files stay as the source of truth for rules and constraints. Figma becomes the canvas for visual audit: where you look, refine, and verify, not where you define.

Once that bidirectional sync is solid, the plan is to bridge a legacy design system into this architecture. Take an existing system with all its accumulated inconsistencies, run it through the six layers, and see what comes out the other side.

That's the real stress test. More on that when I get there.

Very deep article, thanks for sharing. I’m exploring the simplest and most efficient way to turn a classic design system connected to Storybook into a fully AI-native one. What do you think must be done, without overcompensating, and what is better avoided?

I love this! Thanks for sharing, incredibly helpful.

Love treating governance and diffs as part of the workflow, not an afterthought. Bidirectional sync with Figma will be a game changer.

Bridging legacy systems? Sounds like digital archeology, Alejandro! ⛏️ Fascinating.

More articles by Alejandro Bauer
