AD '26
All Chapters

CHAPTER 10

MCP Integrations

Connecting Figma, Miro, MagicPattern, and other tools via MCP

Reading time

38 min

MCP: The Universal Design Tool Connector

MCP is a standardized protocol that lets AI agents talk to external tools. Think of it as an authenticated API that an LLM can call during a conversation. Instead of copying design specs between tools by hand, the agent reads from one MCP server and writes to another. All in a single session.

MCP servers expose three things: tools (functions the agent can call), resources (data the agent can read), and prompts (guided workflows). For design work, tools are what matter most. A tool like get_jsx on the Paper MCP server is the difference between manually translating a design and having the agent do it in seconds.

Two transport types exist. HTTP (remote) connects to a URL like https://mcp.figma.com/mcp. stdio (local) spawns a process on your machine. Figma uses HTTP. Paper uses HTTP too, but the server runs locally inside the Paper Desktop app at http://127.0.0.1:29979/mcp. The transport type matters for configuration, which I cover next.

As of May 2026, MCP is supported by every major agent platform: Claude Code, Codex CLI, Cursor, VS Code Copilot, OpenCode, Windsurf, Gemini CLI, Warp, Kiro, Factory, Firebender, Replit, Amazon Q, and Android Studio. The ecosystem is mature enough that MCP support is table stakes for any agent tool.

Configuring MCP Servers in Each Agent Platform

Every agent platform has its own MCP configuration syntax. Some use CLI commands. Others use JSON config files. A few support both. Here is the complete reference for the three agents this book focuses on, plus VS Code and Cursor for teams using those editors.

VS Code split editor view showing three MCP configuration files side by side: Claude Code settings with Paper and Figma HTTP servers, OpenCode JSON config with remote type, and VS Code mcp.json with server definitions
MCP configuration across Claude Code, OpenCode, and VS Code showing different syntaxes

Claude Code

Claude Code supports two approaches: the mcp add CLI command and the plugin system. Plugins are preferred because they include skill files that teach the agent how to use the MCP tools effectively.

# Plugin approach (preferred for Figma)
$ claude plugin install figma@claude-plugins-official

# Plugin approach for Paper
$ claude plugin install paper-desktop@paper

# Manual remote server (Figma)
$ claude mcp add --scope user --transport http figma https://mcp.figma.com/mcp

# Manual local server (Paper via stdio bridge)
$ claude mcp add --transport stdio paper -- npx mcp-remote http://127.0.0.1:29979/mcp

# Verify: type /mcp in Claude Code
# You should see figma and paper listed
# Authenticate when prompted

The --scope user flag makes the MCP server available across all projects. Without it, the server is scoped to the current project only. For design tools you use daily, user-scoped is the right default.

Codex CLI

Codex uses a simpler configuration model. The mcp add command takes a name and URL. For tools that need plugins (like Figma), use the app settings UI.

# CLI: add Figma remote server
$ codex mcp add figma --url https://mcp.figma.com/mcp

# App: Settings > Plugins > + Figma > Install > Authenticate

# App: Settings > MCP Servers > Streamable HTTP
# Name: paper, URL: http://127.0.0.1:29979/mcp

OpenCode

OpenCode configures MCP servers in opencode.json. The configuration is declarative JSON, which makes it versionable and shareable across a team.

// opencode.json
{
  "mcp": {
    "paper": {
      "type": "remote",
      "url": "http://127.0.0.1:29979/mcp",
      "enabled": true
    },
    "figma": {
      "type": "remote",
      "url": "https://mcp.figma.com/mcp",
      "enabled": true
    }
  }
}

VS Code and Cursor

VS Code and Cursor both use JSON config for MCP servers, placed in .vscode/mcp.json at the project root. Cursor additionally supports a plugin deep-link format.

// .vscode/mcp.json
{
  "servers": {
    "figma": {
      "url": "https://mcp.figma.com/mcp",
      "type": "http"
    },
    "paper": {
      "type": "http",
      "url": "http://127.0.0.1:29979/mcp"
    }
  }
}

For Cursor, the plugin approach is cleaner:

# Cursor plugins
/add-plugin figma
/add-plugin paper-desktop

# Cursor deep link for Figma (one-click setup)
cursor://anysphere.cursor-deeplink/mcp/install?name=Figma&config=...

The following table summarizes the configuration approach for each agent platform.

Platform Config Method Remote Servers Local Servers Plugin Support
Claude Code CLI + opencode.json mcp add --transport http mcp add --transport stdio Yes (claude plugin)
Codex CLI CLI + App Settings mcp add --url App Settings only Yes (Skills)
OpenCode opencode.json "type": "remote" "type": "local" Yes (Skills)
VS Code .vscode/mcp.json "type": "http" "type": "stdio" Yes (Extensions)
Cursor .vscode/mcp.json + plugins "type": "http" "type": "stdio" Yes (/add-plugin)

My take: Claude Code's plugin system is the most mature MCP configuration experience as of May 2026. The plugin includes both the server connection and the skill file that teaches the agent how to use the tools. With Codex and OpenCode, you get the connection but need to write your own skill context. This gap will close, but right now Claude Code has the edge for MCP-heavy workflows.

Figma MCP Server: Reading Designs Programmatically

Figma offers two MCP server types: Remote at https://mcp.figma.com/mcp (available to all plans) and Desktop (runs inside the Figma desktop app, available on Dev or Full seats for paid plans). The remote server is the one to use. It supports write-to-canvas, code-to-canvas, and diagram generation. The desktop server is limited to read operations for most users.

Figma design editor showing a product design file with the canvas, layers panel, and an agent terminal reading design node data through the Figma MCP server including component names and properties
Figma design file with agent reading design data through the Figma MCP server connection

Authentication is OAuth-based. When you first connect, a browser window opens for the Figma sign-in flow. After that, the token is cached. No API keys to manage.

Figma's MCP server exposes 18 tools. Here is the complete reference.

Tool Remote Desktop Description
get_design_context Yes Yes React+Tailwind code from selection (or custom framework)
get_screenshot Yes Yes Screenshot of current selection
get_variable_defs Yes Yes Variables and styles used in selection
get_metadata Yes Yes Sparse XML of selection (IDs, names, types, positions, sizes)
get_figjam Yes Yes FigJam diagrams as XML with screenshots
get_code_connect_map Yes Yes Figma node to code component mappings
use_figma Yes No General-purpose write to canvas (create, edit, delete)
create_new_file Yes No Create new Figma Design or FigJam file
generate_diagram Yes No Generate FigJam diagram from Mermaid or natural language
generate_figma_design Yes No Send live UI to Figma as design layers (code to canvas)
search_design_system Yes No Search libraries for components, variables, styles
whoami Yes No Authenticated user identity

The primary workflow is link-based. You select a layer in Figma, copy the URL from the browser address bar, and paste it into your agent prompt. The agent extracts the node ID and calls get_design_context. The tool returns React+Tailwind code by default, or you can specify a custom framework output.

Write-to-canvas capabilities are more ambitious. The agent can create, edit, and delete pages, frames, components, variants, variables, styles, text, and images. For FigJam, it can create stickies, sections, connectors, shapes, tables, and code blocks. The agent is supposed to check the design system before creating elements from scratch. In practice, this works best when paired with a skill file that enforces design system constraints.

Supported diagram types for generate_diagram: flowchart, Gantt chart, state diagram, sequence diagram, architecture diagram, and ERD. Useful for having an agent generate architecture documentation directly in FigJam.

Figma also provides built-in skills for the plugin system: Code Connect, Build/Update Screens from Design System, and Create New File. These skills teach the agent specific Figma workflows beyond raw tool access.

Client support varies by feature. The following table shows which agent platforms support which Figma MCP capabilities as of May 2026.

Client Desktop Remote Write to Canvas Code to Canvas Plugin/Skills
Claude Code Yes Yes Yes Yes Yes (Figma plugin)
Codex Yes Yes Yes Yes Yes (Codex Skills)
Cursor Yes Yes Yes Yes Yes (Figma plugin)
VS Code Copilot Yes Yes Yes Yes Yes (Figma plugin)
Gemini CLI Yes Yes No No Yes (Extension)
Warp Yes Yes Yes Yes No

Miro MCP Server: AI-Powered Whiteboarding

Miro has been developing MCP integrations that give agents access to boards. The server enables reading board content, creating shapes and stickies, and making connections between elements. Typical use cases include automated diagram generation, turning meeting notes into structured boards, and architecture documentation.

Miro whiteboard showing an AI-agent-generated design workshop board with color-coded sticky notes arranged in columns, flow connectors between ideas, and section headers for different design phases
Miro board with agent-generated whiteboard content created through the Miro MCP server

The Miro MCP server follows standard MCP server configuration patterns. Configuration for each agent platform uses the same approaches covered in section 10.2. The key difference is that Miro boards are freeform canvases, not structured design files, so agent outputs tend to be more exploratory and less pixel-precise than Figma or Paper outputs.

As of May 2026, Miro's MCP documentation is limited compared to Figma and Paper. The tool list and configuration details may evolve. Check the official Miro developer docs for the current state.

MagicPattern MCP: Generative Design Patterns

MagicPattern provides generative design patterns --- gradients, mesh gradients, blobs, patterns, and other decorative elements. The MCP server lets agents generate SVG and CSS patterns programmatically rather than browsing a UI library by hand.

Terminal output showing Paper MCP server token synchronization reading 47 design tokens from the Paper canvas and writing them to a local tokens.json file with confirmation of each token category
Paper MCP token sync pulling design tokens from canvas to a local JSON file

This is a niche tool in the MCP ecosystem, but it solves a real problem. Background patterns, decorative dividers, and visual texture are the kinds of tasks that slow down a design workflow. Having the agent generate these on demand, with parameters you control via prompt, is faster than any manual approach.

Configuration follows the standard MCP server pattern. The server exposes tools for generating specific pattern types with parameters like color, scale, density, and seed value. Each call returns SVG or CSS that the agent can write into a design file or codebase.

Paper MCP Server: 22 Tools for Connected Canvas

Paper's MCP server runs inside the Paper Desktop app. When a file is open, the server listens at http://127.0.0.1:29979/mcp. It connects to Cursor, Claude Code, Claude Desktop, Codex, Copilot, OpenCode, and Antigravity. The setup for each platform is covered in section 10.2.

Flow diagram showing an AI agent chaining two MCP servers: reading design specifications from Figma MCP, processing and normalizing design tokens, then writing the transformed output to Paper MCP canvas in a single agent session
Multi-MCP chain: agent reads from Figma, processes tokens, and writes to Paper in one session

Paper exposes 22 tools --- the largest surface of any design tool MCP server. Here is the complete reference.

Tool Category Description
get_basic_info Read File name, page name, node count, artboard list with dimensions
get_selection Read Selected nodes (IDs, names, types, size, artboard)
get_node_info Read Node details by ID (size, visibility, lock, parent, children, text)
get_children Read Direct children of a node
get_tree_summary Read Compact text summary of subtree (optional depth limit)
get_screenshot Read Screenshot by ID (base64, optional 1x or 2x scale)
get_jsx Export JSX for node+descendants (Tailwind or inline-styles format)
get_computed_styles Read Computed CSS styles for nodes (batch)
get_fill_image Read Image data from image fill node (base64 JPEG)
get_font_family_info Read Font availability check (local or Google Fonts)
get_guide Workflow Guided workflows (e.g., figma-import)
export Export Export nodes as PNG, JPG, SVG, MP4 with scale overrides
create_artboard Write Create new artboard with name and styles
write_html Write Parse HTML and add/replace nodes
set_text_content Write Set text of Text nodes (batch)
rename_nodes Write Rename layers (batch)
duplicate_nodes Write Deep-clone nodes with descendant ID map
move_nodes Write Reposition or reparent nodes
update_styles Write Update CSS styles on nodes
delete_nodes Write Delete nodes and descendants
finish_working_on_nodes Signal Clear working indicator from artboards

The get_jsx tool is Paper's killer feature for the design-to-code pipeline. It returns production-ready JSX with Tailwind classes directly from the canvas. No translation layer. No approximation. Paper is HTML/CSS natively, so the JSX output is a faithful representation of what is on screen. Chapter 12 covers this pipeline in depth.

The write_html tool enables the reverse direction: code to canvas. The agent generates HTML, calls write_html, and the content appears on the Paper canvas. This bi-directional flow is unique to Paper. Figma can write to canvas via use_figma, but the input format is a structured API call, not raw HTML.

My take: Paper's 22-tool MCP surface is the most complete in the ecosystem. The get_jsx tool alone saves hours of translating design to code. But the Desktop app requirement is a real limitation for Linux users and CI environments. If Paper offered a headless server mode, it would be the clear default for agentic design workflows.

Chaining MCP Servers: Multi-Tool Workflows

Multiple MCP servers can run simultaneously in a single agent session. The agent orchestrates between them using sequential tool calls. This is where MCP gets powerful --- a single agent prompt can trigger a workflow that spans Figma, Paper, Notion, and your codebase.

Three chaining patterns show up repeatedly in production use.

Pattern 1: Figma to Paper token sync. Open the design system file in Figma. Open the Paper file where you want the design system. Prompt the agent: "Create a design system in Paper that matches the Figma file." The agent calls get_variable_defs on the Figma MCP server to read colors, spacing, and typography. Then it calls create_artboard and write_html on the Paper MCP server to create the design system elements. Finally it calls update_styles to apply the tokens from Figma.

# Figma → Paper token sync workflow
# Both MCP servers active in same agent session

# Step 1: Agent reads Figma design system
# Figma MCP: get_variable_defs() → colors, spacing, typography
# Figma MCP: get_screenshot() → visual reference

# Step 2: Agent creates Paper design system
# Paper MCP: create_artboard({ name: "Design System" })
# Paper MCP: write_html({ html: "<div>...design system elements...</div>" })
# Paper MCP: update_styles({ styles: { /* tokens from Figma */ } })

Pattern 2: Notion to Paper content sync. Open the Paper file with placeholder content. Prompt: "Sync the content from the Notion Testimonials database with this frame." The agent reads from the Notion MCP server, then writes the real content into Paper via set_text_content.

# Notion → Paper content sync workflow

# Step 1: Agent reads Notion database
# Notion MCP: query_database({ database_id: "testimonials" })
# Returns: [{ name: "Sarah", quote: "...", company: "..." }, ...]

# Step 2: Agent populates Paper design
# Paper MCP: set_text_content({ updates: [
#   { nodeId: "name-1", text: "Sarah" },
#   { nodeId: "quote-1", text: "..." },
#   { nodeId: "company-1", text: "..." },
#   ...
# ]})

Pattern 3: Design to code and back. The agent reads a design from Figma or Paper via MCP, generates production code, then optionally writes the coded version back to the canvas for visual verification. This pattern is the foundation of the pipeline covered in Chapter 12.

# Design → Code → Canvas verification loop

# Step 1: Read design
# Figma MCP: get_design_context({ url: "https://figma.com/..." })
# Returns: React + Tailwind JSX

# Step 2: Agent generates production code
# (Agent writes files, runs dev server, verifies visually)

# Step 3: Write back to Paper for verification
# Paper MCP: write_html({ html: "<div class='hero'>...</div>" })
# Human reviews on Paper canvas
# Agent iterates based on feedback

Each MCP server uses the currently opened file as context. Figma MCP operates on whatever file you have open in the browser. Paper MCP operates on whatever file you have open in the Paper Desktop app. The agent does not switch files --- you do.

Warning: Long-running agent sessions can cause MCP connection drops. If the agent reports it cannot access a tool that should be available, restart the agent session. LLMs can also hallucinate tool parameters occasionally --- if you see repeated tool call errors, restart the session to reset the agent's tool-use context.

Web Access: Firecrawl and Exa MCP

Agents without live web access are frozen at their training cutoff. They cannot check current documentation, see what a competitor's site looks like today, or verify that a tool's API has not changed since last month. For design work, this limitation is more than inconvenient. It produces stale output. The agent references deprecated APIs, describes outdated UI patterns, and generates code that does not match the current state of the tools it uses.

Two MCP servers address this: Firecrawl for structured web extraction at scale, and Exa for quick ad-hoc web searches within a session.

Firecrawl: Structured Web Extraction

Firecrawl (128k+ GitHub stars) provides search, scrape, crawl, interact, and monitor capabilities via CLI and MCP server. In June 2026, it added installable workflows for repeatable web tasks, including deep research, SEO audits, and website design cloning (source: @firecrawl, retrieved 2026-06-04).

Install the CLI with all capabilities:

npx -y firecrawl-cli@latest init --all --browser

Or add the MCP server to your agent configuration:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"]
    }
  }
}

The five capabilities map to different design workflows:

CapabilityWhat it doesDesign use case
SearchWeb search with structured resultsFind design references, competitors, and inspiration
ScrapeExtract content from a single URLCapture a specific page's layout, copy, and styling
CrawlExtract content from an entire siteMap a competitor's full site structure and page patterns
InteractScrape then interact with the page using AI or codeNavigate multi-step flows (onboarding, checkout) to capture all states
MonitorSchedule recurring scrapes with change detectionTrack when a competitor updates their design or a tool changes its docs

The Interact capability is the most interesting for design workflows. It lets the agent scrape a page, then interact with it: click buttons, navigate flows, fill forms. This means the agent can capture multi-state UIs (empty states, loading states, error states, success states) by actually navigating through the product instead of just scraping a single page.

Exa MCP: Quick Web Search Inside Agent Sessions

Exa provides live web search directly inside agent sessions. Setup takes about 60 seconds: type /mcp in Claude Code, approve the Exa server, and the agent gains real-time web access (source: @RoundtableSpace, retrieved 2026-06-04).

Exa is for quick, ad-hoc lookups. It does not crawl or monitor. It searches and returns results. The value for design work is immediacy: the agent can verify current state without leaving the session.

{
  "mcpServers": {
    "exa": {
      "command": "npx",
      "args": ["-y", "exa-mcp-server"],
      "env": {
        "EXA_API_KEY": "your-api-key"
      }
    }
  }
}

When to Use Which

TaskUseWhy
"What does the Stripe landing page look like right now?"ExaQuick single-page reference check
"Crawl this competitor's site and map their page types"FirecrawlStructured extraction at scale
"Check if the Figma REST API v2 endpoint has changed"ExaQuick documentation verification
"Capture all states of this competitor's checkout flow"Firecrawl (Interact)Multi-step navigation with interaction
"Alert me when this site changes its hero section"Firecrawl (Monitor)Scheduled change detection
"Find three examples of SaaS pricing pages with toggle toggles"ExaSearch-based discovery

Practical Workflows

Workflow 1: Competitive design audit.

You want to understand how five competitors handle their onboarding flows. The agent needs to see every screen, not just the landing page.

"Use Firecrawl to crawl these five URLs. For each site, extract:
1. The onboarding flow: every screen from sign-up to first-use
2. UI patterns used: modals, tooltips, progress bars, empty states
3. Copy tone: formal, casual, technical
4. Estimated number of steps in the flow

Summarize the patterns in a comparison table, then suggest which
patterns would work for our product."

Firecrawl's crawl capability handles the scale. The agent gets structured data from every page on each site, identifies the onboarding screens, and produces a comparative analysis. Without Firecrawl, you would visit each site manually, screenshot every screen, and type notes into a document.

Workflow 2: Design cloning from a reference site.

You want to replicate the visual style of a specific site. Firecrawl's design cloning workflow extracts the styling directly.

"Scrape https://linear.app and extract:
1. Color palette (backgrounds, text, accents, borders)
2. Typography (font families, sizes, weights, line heights)
3. Spacing patterns (padding, gaps, margins)
4. Border radius values
5. Shadow values
6. Animation and transition patterns

Output as CSS custom properties I can use as a design token base."

This is faster than manually inspecting elements in DevTools. The agent gets the full styling picture in one call and produces tokens you can drop into your design system.

Workflow 3: Multi-state UI capture.

You need to see all states of a competitor's pricing page: monthly vs annual toggle, different plan selections, upgrade modals, and the checkout flow that follows.

"Use Firecrawl Interact to navigate the pricing page at [URL].
1. Capture the default monthly view
2. Click the annual toggle and capture the updated prices
3. Click the Enterprise plan's 'Contact Sales' button and capture the modal
4. Click the Pro plan's 'Get Started' button and capture the checkout flow
5. For each state, extract the layout structure, copy, and visual hierarchy"

The Interact capability navigates through the product like a real user. The agent captures states that a simple scrape would miss --- the toggle state, the modal state, the multi-step checkout state. This is the closest an agent gets to manual QA without a browser automation tool.

Workflow 4: Documentation verification before code generation.

Before the agent writes code that calls an external API, you want to verify the current endpoint structure.

"Search the web for the current Remotion v4.0 renderMedia() API.
I need to confirm:
1. The correct import path
2. Whether the 'codec' option is still required
3. The current way to specify output location
4. Any breaking changes from v3

Then use the confirmed API to write a render script for our
Hyperframes pipeline."

Exa handles this in seconds. The agent checks the live documentation, confirms the current API shape, and writes code that matches the current version instead of guessing from training data. This prevents the common failure mode where the agent generates code for a deprecated API that no longer works.

Workflow 5: Competitor monitoring.

You want to know when a competitor updates their landing page design.

# Set up a recurring monitor
firecrawl monitor create \
  --url "https://competitor.com" \
  --schedule "daily" \
  --prompt "Has the hero section, pricing section, or navigation changed since the last check? If yes, describe the changes. If no, respond 'no changes'." \
  --notify "slack:#design-intel"

The monitor runs daily. When the competitor updates their design, you get a Slack notification with a description of what changed. This is passive competitive intelligence --- you do not need to check manually, and you never miss a change.

Workflow 6: Design reference collection.

You are starting a new project and want to collect visual references for the design direction.

"Use Exa to find:
1. Five examples of developer tool landing pages with dark themes
2. Three examples of SaaS pricing pages with toggle between monthly/annual
3. Four examples of onboarding flows that use progressive disclosure

For each result, scrape the page with Firecrawl and extract:
- Screenshot URL
- Primary color palette
- Layout pattern (hero + features grid, hero + alternating sections, etc.)
- Notable design elements (animations, illustrations, 3D elements)

Organize into a reference board I can share with the team."

Exa finds the sites. Firecrawl extracts the details. The agent produces a structured reference collection that replaces hours of manual browsing and screenshotting.

Security Considerations

Giving an agent live web access expands its capabilities and its attack surface. A few precautions:

  • Scope Firecrawl to specific domains. Do not point a crawl at the open web. Target known competitors, reference sites, and documentation URLs. Unscoped crawling can consume credits fast and produce noise.
  • Treat Exa results as unverified input. The agent trusts web search results. A convincingly-written blog post with wrong information becomes part of the agent's output. Cross-reference critical facts.
  • Monitor queries in CI. If you use web access in automated workflows, log the queries. You want to know what the agent is searching for and what it finds.
  • Rate-limit monitor jobs. Daily monitoring of 20 competitors is reasonable. Hourly monitoring is not. Set intervals that match the pace of actual change in your industry.

My take: Live web access should be a default, not an add-on. Every design agent should be able to check current state before generating output. The combination of Exa for quick lookups and Firecrawl for structured extraction covers the full range of web access needs in a design workflow. Install both. Use Exa daily for quick checks ("does this API still work?", "what does the current site look like?"). Use Firecrawl when you need structured data at scale (competitive audits, design cloning, multi-state capture). The five minutes of setup pays for itself the first time the agent catches a breaking API change before you ship broken code.

Generative UI: When Agents Render Live Interfaces

Everything covered in this chapter so far follows one model: the agent calls a tool, the tool returns data, and the data appears as text in the conversation. That model works for most agentic design workflows. Read a Figma file. Export JSX. Write HTML to a canvas. The agent produces code. The human reviews the code.

Generative UI inverts this. The agent does not just return data or code. It renders live, interactive components directly into the user's interface. The user clicks buttons, fills forms, drags elements, and sees real UI --- not markdown descriptions of UI. The agent decides what to show. The frontend renders it. The user interacts. The loop closes in real time.

This is a fundamentally different interaction model. Understanding it requires understanding the protocol stack that makes it possible.

The Protocol Stack: MCP, A2A, AG-UI, and A2UI

Four protocols govern how agents communicate with tools, with each other, and with users. They stack.

Three-layer protocol stack diagram showing MCP and A2A at the foundation layer, AG-UI as the transport layer handling event-based streaming via SSE, and A2UI, MCP-UI, and Open-JSON-UI at the application layer
The generative UI protocol stack: three layers from foundation to application

MCP (Model Context Protocol) is the foundation. It connects agents to tools and data. Every MCP server covered in this chapter --- Figma, Paper, Miro, MagicPattern --- uses MCP at the bottom of the stack. MCP is request-response. The agent calls a tool, the tool returns a result. No streaming. No user interface. Just data.

A2A (Agent-to-Agent Protocol) sits alongside MCP at the foundation layer. It handles communication between agents. Where MCP connects one agent to many tools, A2A connects many agents to each other. An A2A agent exposes an Agent Card (a JSON document at /.well-known/agent.json) describing its identity, capabilities, and endpoint. Other agents discover it, negotiate how to communicate, and collaborate on tasks. A2A uses JSON-RPC 2.0 over HTTP with SSE for streaming. It is agent-agnostic --- the agents do not need to share a framework, a model, or a vendor.

The key A2A concepts:

  • Task is the fundamental unit of work. A task has a lifecycle: submitted, working, completed, failed, canceled.
  • Message is one turn in the conversation, with a role (user or agent) and one or more Parts.
  • Part is the smallest content unit: text, a file, or structured JSON data.
  • Artifact is the output the agent produces: a document, an image, a data structure.

For design workflows, A2A enables multi-agent collaboration. A research agent can hand off findings to a design agent, which hands off a prototype to a review agent. Each agent specializes. They coordinate through A2A without sharing context windows or model sessions.

AG-UI (Agent-User Interaction Protocol) is the transport layer. It is the bridge between the agent and the user's frontend application. AG-UI is event-based and streaming. Instead of request-response, the agent emits a continuous stream of events that the frontend consumes in real time.

AG-UI uses Server-Sent Events (SSE) as its primary transport. SSE is a standard HTTP-based protocol where the server pushes events to the client over a long-lived connection. Unlike WebSockets, SSE is unidirectional (server to client) and works through firewalls and proxies without special configuration. The frontend sends user actions as HTTP POST requests. The agent responds with a stream of SSE events.

The event types AG-UI defines:

CategoryEventsWhat it does
LifecycleRunStarted, RunFinished, StepStarted, StepFinishedTrack the agent run from start to finish
TextTextMessageStart, TextMessageContent, TextMessageEndStream text responses token by token
Tool callsToolCallStart, ToolCallArgs, ToolCallEnd, ToolCallResultStream tool invocations and results
StateStateSnapshot, StateDeltaSync application state using JSON Patch
ReasoningReasoningStart, ReasoningContent, ReasoningEndExpose the agent's thinking process

The critical difference from MCP: AG-UI streams. MCP is call-and-return. AG-UI is a continuous event feed. When an agent decides to render a weather card, it emits ToolCallStart (I'm calling get_weather), then ToolCallArgs (the location is Tokyo), then ToolCallEnd (done calling), then ToolCallResult (here's the data). Each event arrives separately. The frontend can render intermediate states: loading spinner while the tool runs, then the card with data when the result arrives.

A2UI (Agent-to-User Interface) is the application layer. It defines what the UI looks like in a format the agent can generate. Where AG-UI handles transport, A2UI handles content.

A2UI is a declarative, JSONL-based specification. The agent does not write React or HTML. It emits a structured description of what UI elements to show. The frontend maps that description to real components from a registered catalog. Three competing specs sit at this layer:

SpecOriginApproach
A2UIGoogleDeclarative JSONL, streaming, platform-agnostic
MCP-UIMicrosoft + ShopifyIframe-based, extends MCP for user-facing experiences
Open-JSON-UIOpenAISchema-based standardization of OpenAI's internal format

These specs compete for the same layer, but AG-UI can carry any of them. CopilotKit, the primary AG-UI client implementation, supports all three.

My take: The four-protocol stack is real, but most design teams will only interact with MCP and AG-UI directly. A2A matters for multi-agent systems (covered in Chapter 11). A2UI matters if you are building a product with generative UI. For design tool integration, MCP is enough. For agent-to-user rendering, you need AG-UI. The others become relevant as your system scales.

Three Patterns for Generative UI

With the protocol stack understood, the practical question is: how does an agent actually render UI in your application? There are three patterns, and they sit on a spectrum from maximum developer control to maximum agent freedom.

Three-column comparison diagram showing Controlled pattern (you write React components, agent picks which to render), Declarative pattern (you register schemas, agent emits JSONL data), and Open-ended pattern (MCP server ships HTML/CSS/JS rendered in sandboxed iframe)
Three patterns for generative UI: controlled (highest control), declarative (best default), open-ended (most flexible, least safe)
Controlled: You Build Components, the Agent Picks

In the controlled pattern, you write React components ahead of time. Each component is registered as a tool the agent can call. The agent decides which component to render and what data to pass. It cannot create components you haven't defined. It cannot change how they look. It picks from your menu.

The mechanism is tool-call rendering. When the agent calls a tool, the frontend intercepts that call and renders a matching React component inline in the conversation.

Horizontal flow diagram showing the five steps of the controlled pattern: agent receives message, agent decides to call tool with parameters, frontend intercepts tool call via useRenderTool, React component renders inline, user sees live interactive UI. Top annotation: agent decides WHAT. Bottom annotation: you decide HOW.
The controlled pattern: the agent decides what tool to call, you decide what component renders

Using CopilotKit, the code looks like this:

// Register a component that renders when the agent calls "get_weather"
useRenderTool({
  name: "get_weather",
  parameters: z.object({ location: z.string() }),
  render: ({ status, parameters, result }) => {
    if (status === "inProgress") return <WeatherSkeleton />;
    if (status === "complete") return <WeatherCard data={result} />;
    if (status === "error") return <ErrorState error={result.error} />;
  },
});

Using the Vercel AI SDK, the pattern is similar but uses typed tool parts:

// Server: define the tool
const weatherTool = createTool({
  description: 'Display the weather for a location',
  inputSchema: z.object({ location: z.string() }),
  execute: async ({ location }) => {
    return { temperature: 22, condition: 'Sunny', location };
  },
});

// Client: render the tool output
{message.parts.map((part) => {
  if (part.type === 'tool-displayWeather') {
    switch (part.state) {
      case 'output-available':
        return <WeatherCard {...part.output} />;
    }
  }
})}

The agent decides what (which tool to call, with what parameters). You decide how (what component renders, what it looks like, what happens when the user clicks).

Token cost. Each registered component costs roughly 400 tokens in the agent's context window. The agent needs to know the tool name, its parameters, and its description to decide when to call it. At 25 components, you burn 10,000 tokens per turn just on tool definitions. This pattern works well for narrow, high-value flows where the UI vocabulary is bounded --- a flight booking flow, a financial dashboard, a settings panel.

Declarative: The Agent Emits a Schema, the App Composes

In the declarative pattern, you define a catalog of component schemas. Each schema describes a type of UI element: a card, a table, a chart, a form, a list. The agent emits a JSONL specification that references schemas from your catalog and fills in data. The frontend composes the actual UI from your registered components.

The difference from controlled: instead of one tool per component, there is one tool that accepts a schema describing what to render. The agent has more creative freedom --- it can combine elements, nest layouts, and compose interfaces you didn't explicitly anticipate. But it stays within the boundaries of your catalog.

A2UI formalizes this with three concepts:

  • Catalog --- component definitions (Zod schemas) paired with React renderers. You register these once: "a card has a title, a subtitle, an image URL, and an action."
  • Surface --- a rendered UI instance the agent creates. The agent says "I want a surface with a card and a table."
  • Operations --- the agent returns a2ui.render(operations=[...]) from a tool call. Operations set components on surfaces and bind data to them.
// Register a catalog with two schemas
const catalog = [
  { schema: z.object({ title: z.string(), body: z.string() }), renderer: Card },
  { schema: z.object({ headers: z.array(z.string()), rows: z.array(z.array(z.string())) }), renderer: Table },
];

// Wrap your app
<CopilotKit a2ui={{ catalog }}>
  {children}
</CopilotKit>

The agent then emits something like:

{"op": "create_surface", "surface_id": "s1"}
{"op": "set_component", "surface_id": "s1", "component": "Card", "data": {"title": "Tokyo Weather", "body": "22°C, Sunny"}}
{"op": "set_component", "surface_id": "s1", "component": "Table", "data": {"headers": ["Day", "Temp"], "rows": [["Mon", "22°C"], ["Tue", "19°C"]]}}

The frontend renders a Card and a Table using the registered renderers. The agent decided the layout and the data. You controlled the visual identity and behavior through your catalog.

Token cost. Flat. One tool definition regardless of how many component types exist in the catalog. The schema is small and reusable. This is why the declarative pattern scales better than controlled when you have dozens of component types.

A2UI supports two flavors. Fixed schema: you author the catalog, the agent supplies data. Dynamic schema: a secondary LLM generates component definitions at runtime. Fixed schema is the safer default for production. Dynamic schema is useful for prototyping when the UI vocabulary is still evolving.

Open-Ended: The Agent Writes Raw HTML

In the open-ended pattern, there is no catalog. No schema. The agent writes HTML, CSS, and JavaScript. The frontend renders it in a sandboxed iframe.

MCP-UI formalizes this. An MCP server registers a tool and links it to an HTML/JS resource. When the agent calls the tool, CopilotKit fetches the resource and renders it as a sandboxed iframe inside the conversation. The iframe can talk back to MCP tools over postMessage.

// MCP server registers a tool with linked UI
server.registerTool("search-flights", {
  inputSchema: { origin: "string", destination: "string", date: "string" },
  _meta: { "ui/resourceUri": "ui://flights/flights-app.html" },
}, handler);

// The linked HTML/JS resource
server.registerResource("flights-app", "ui://flights/flights-app.html", {
  mimeType: "text/html+mcp",
}, () => ({ contents: [{ text: htmlContent }] }));

When the agent calls search-flights, CopilotKit mounts the HTML app inside the chat. The user can click through a multi-step booking wizard. The app talks back to MCP tools via postMessage.

The open-ended pattern has the most expressive range. It can render anything a web page can render: charts, maps, drag-and-drop boards, games. But it has three serious drawbacks:

  1. Brand inconsistency. Without a design system constraining output, every render looks different. The agent's HTML does not inherit your application's styles or tokens.
  2. Security surface. Sandboxed iframes limit the damage, but the agent's HTML runs arbitrary JavaScript. This is not acceptable for user-facing production features.
  3. No persistence. The iframe's state dies when the conversation moves on. There is no way to save the user's interaction state across turns.

Use open-ended for internal tools, throwaway visualizations, and one-off data exploration. Do not ship it to users.

Choosing the Right Pattern

Criterion Controlled Declarative Open-Ended
Component count ≤10 well-defined components 10-100+ component types Unlimited, ad-hoc
Brand consistency Guaranteed (you wrote every component) High (catalog constrains output) None
Token cost per turn ~400 tokens per tool (scales linearly) Flat (one tool, many schemas) Low (one iframe)
Engineering effort High (component per capability) Medium (catalog registration) Low (MCP server ships HTML)
Agent creativity Low (picks from menu) Medium (composes from catalog) High (writes anything)
Security Same origin, full control Catalog-controlled, safe Sandboxed iframe, risky
Best for High-value flows, pixel-perfect UI Products with many content types Internal tools, throwaway viz

The decision tree is simple.

If you have pixel-perfect mockups and a defined component library, use controlled. You know exactly what every state looks like. The agent connects the dots.

If you have dozens of card types, table variants, chart layouts, and form configurations, use declarative. The catalog scales without exploding token costs. This is the right default for most production applications.

If you need a one-off chart for a meeting or an internal data viewer, open-ended is fine. Just don't ship it.

Case Study: A Design System Dashboard with Controlled Pattern

Imagine you maintain a design system with 80 components. You want an internal dashboard where team members can ask an agent questions about the system in natural language and see live component previews.

You register six tools, each with a corresponding React component:

// Tools registered with the agent
const designSystemTools = {
  "search_components":     { renders: ComponentListCard },
  "show_component_detail": { renders: ComponentDetailPanel },
  "compare_components":    { renders: ComponentComparisonTable },
  "show_token_usage":      { renders: TokenUsageChart },
  "show_accessibility":    { renders: A11yReportCard },
  "show_responsive":       { renders: ResponsivePreview },
};

A designer types: "Show me all button variants and their accessibility status."

The agent calls two tools: search_components with {query: "button variants"} and show_accessibility with {component: "Button"}. Two React components render inline: a list of all button variants with links to Storybook, and an accessibility report card showing contrast ratios and keyboard behavior for each variant.

The designer clicks a button variant in the list. The agent calls show_component_detail with the selected variant. A detail panel expands inline with the component's props, variants, design tokens, and a live preview.

Everything the user sees is a React component you built. The agent never generates unfamiliar UI. But the interaction feels conversational: the user asks in natural language, the agent orchestrates the right components with the right data.

Case Study: A Product Catalog with Declarative Pattern

Now imagine an e-commerce assistant. The product catalog has 50,000 items across 200 categories. The UI vocabulary is large but structured: product cards, price tables, comparison grids, review summaries, filter panels, and cart widgets.

Building 200+ controlled components is impractical. Instead, you register a catalog of 12 component schemas:

const catalog = [
  { schema: "product_card",   renderer: ProductCard },
  { schema: "price_table",    renderer: PriceTable },
  { schema: "comparison",     renderer: ComparisonGrid },
  { schema: "review_summary", renderer: ReviewSummary },
  { schema: "filter_panel",   renderer: FilterPanel },
  { schema: "cart_widget",    renderer: CartWidget },
  { schema: "image_gallery",  renderer: ImageGallery },
  { schema: "size_selector",  renderer: SizeSelector },
  { schema: "spec_table",     renderer: SpecTable },
  { schema: "availability",   renderer: AvailabilityBadge },
  { schema: "recommendation", renderer: RecommendationRow },
  { schema: "order_summary",  renderer: OrderSummary },
];

A customer asks: "I need running shoes under $150 with good arch support."

The agent emits a JSONL specification:

{"op": "create_surface", "surface_id": "search-results"}
{"op": "set_component", "surface_id": "search-results", "component": "filter_panel",
 "data": {"filters": [{"type": "category", "value": "Running Shoes"}, {"type": "price_max", "value": "$150"}, {"type": "feature", "value": "Arch Support"}]}}
{"op": "set_component", "surface_id": "search-results", "component": "recommendation",
 "data": {"items": [{"name": "Nike Pegasus 41", "price": "$140", "rating": 4.6}, {"name": "Brooks Ghost 16", "price": "$140", "rating": 4.7}]}}
{"op": "set_component", "surface_id": "search-results", "component": "comparison",
 "data": {"headers": ["Feature", "Nike Pegasus", "Brooks Ghost"], "rows": [["Cushioning", "React X", "DNA LOFT"], ["Drop", "10mm", "12mm"], ["Weight", "9.4oz", "9.8oz"]]}}

The frontend renders a filter panel, a recommendation row, and a comparison grid. The agent composed these from the catalog. The visual consistency comes from your registered renderers. The agent had the creative freedom to decide that a comparison grid was useful for this query and to compose it with the filter panel.

The token cost is the same whether the catalog has 12 schemas or 200. One tool. One definition. Many possible compositions.

Case Study: An Internal Data Explorer with Open-Ended Pattern

An analytics team needs an internal tool where agents can generate ad-hoc visualizations from database queries. The team does not want to pre-build chart types. They want the agent to invent the right visualization for each query.

They set up an MCP server that exposes a run_query tool linked to an HTML visualization resource. When the agent runs a query, the MCP server returns both the data and a rendering script. CopilotKit mounts the result in a sandboxed iframe.

An analyst asks: "Show me the conversion funnel by traffic source for the last 30 days."

The agent calls run_query with the SQL. The MCP server returns the data and an HTML/CSS/JS visualization that renders a Sankey diagram. The analyst sees a live, interactive chart inside the conversation.

This works because the visualization is throwaway. Nobody will reuse this Sankey diagram. It answers one question and disappears. The open-ended pattern is correct here because the UI vocabulary is unbounded and the output does not need to match a brand system.

But if the same analyst asks "build me a dashboard I can share with the CEO," the open-ended pattern fails. The dashboard needs consistent styling, responsive behavior, and persistence. That requires the controlled or declarative pattern.

How This Relates to Agentic Design

Generative UI is not just "agents making UI." It is a structured interaction model with specific protocols, patterns, and trade-offs.

For the agentic design workflows covered in this book, generative UI matters in two places:

  1. Design tool interfaces. Paper's canvas is already an HTML surface. An agent with AG-UI could render live component previews inside Paper, not just write HTML to a static canvas. The designer sees the agent's output as real UI, not code that needs to be previewed separately.
  2. Design system dashboards. Teams maintaining design systems can build conversational interfaces where anyone asks questions about tokens, components, or patterns and gets live rendered answers. The controlled pattern guarantees every response looks like it belongs to the design system.

The protocol stack also clarifies where MCP ends and generative UI begins. MCP connects agents to tools. AG-UI connects agents to users. If you want the agent to render UI in your application, you need AG-UI on top of MCP. If you only want the agent to call tools and return data, MCP alone is sufficient.

My take: The declarative pattern is the one most design teams should bet on. It gives agents enough creative freedom to compose useful interfaces while keeping the visual output inside your design system boundaries. The controlled pattern is overkill for most products unless you have a small, well-defined set of flows. The open-ended pattern is a trap for anything customer-facing. Start with A2UI's fixed schema flavor. Expand to dynamic schema only when the UI vocabulary outgrows what you can pre-define.

Resources: See AG-UI Protocol for the event specification and SDKs. See CopilotKit for the primary AG-UI client implementation. See Vercel AI SDK for the alternative tool-call-to-component approach.

Troubleshooting MCP Connections

MCP connections fail in predictable ways. Here is a troubleshooting reference for the most common issues.

Symptom Cause Fix
Agent says MCP tool is not available Agent session started before MCP server was configured Restart the agent session
Agent session cannot connect to Paper MCP Paper Desktop app is not running or no file is open Open Paper Desktop and ensure a file is loaded
Paper MCP connected but agent reports no access MCP host tool needs restart Restart the MCP host (Paper Desktop app)
Figma MCP returns errors on large designs Large, deeply nested designs exceed response limits Select specific frames instead of entire pages
Figma SVG fills returned as images Known Figma MCP behavior Use get_design_context for code output instead
Paper MCP fails on Windows WSL WSL cannot access 127.0.0.1:29979 Enable mirrored mode networking in WSL config
MCP limits not reset after Paper plan upgrade Desktop app caches plan limits Update Paper Desktop app and restart
Repeated tool call errors from agent LLM hallucinating tool parameters Restart the agent session

I spent an hour debugging an MCP connection before I realized the Paper Desktop app was not running. The error message was generic --- "connection refused" --- which pointed me toward network issues instead of the obvious cause. Start with the basics: is the tool running, is a file open, is the server listening?

Figma MCP has additional quirks. It ignores spacer elements. It does not convert inset borders to outlines. Code Connect components are not reliably converted. These are known limitations as of May 2026 and will likely improve over time. For now, work around them by selecting specific frames and validating the output.

My take: MCP connection issues are the number one source of frustration when starting with agentic design. The tools work well when they work, but the failure modes are silent and confusing. I recommend keeping a terminal tab open with a test MCP call ready to paste. When the agent reports a connection issue, verify independently before spending time on agent-side debugging.

Next: With MCP servers configured and connected, the next question is scale. How do multiple agents collaborate on a single design? Chapter 11 covers multi-agent design teams, the orchestrator pattern, and when parallelizing is worth the complexity overhead.

Next Chapter

Multi-Agent Design Teams

Concurrent agents, task delegation, and orchestration patterns

Continue Reading

©2026 Mehran Mozaffari. All rights reserved.