Chapter 2: Architecture

The Big Picture

MCP’s architecture looks deceptively simple. Three roles. Two connection types. A handful of capabilities. You could sketch it on a napkin.

But like all good protocol designs, the simplicity is earned. Underneath the clean surface are carefully considered trust boundaries, a flexible capability system, and an architecture that scales from “a single tool on my laptop” to “a fleet of remote servers behind an API gateway.”

Let’s unpack it.

The Three-Layer Cake

MCP’s architecture has three layers, and their relationship is strict:

┌─────────────────────────────────────────────┐
│                    HOST                      │
│  (Claude Desktop, VS Code, Your App, etc.)  │
│                                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  │
│  │ MCP      │  │ MCP      │  │ MCP      │  │
│  │ Client 1 │  │ Client 2 │  │ Client 3 │  │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  │
└───────┼──────────────┼──────────────┼────────┘
        │              │              │
        │ MCP          │ MCP          │ MCP
        │ Protocol     │ Protocol     │ Protocol
        │              │              │
   ┌────┴─────┐  ┌────┴─────┐  ┌────┴─────┐
   │ MCP      │  │ MCP      │  │ MCP      │
   │ Server A │  │ Server B │  │ Server C │
   │(Files)   │  │(GitHub)  │  │(Database)│
   └──────────┘  └──────────┘  └──────────┘

The Host

The host is the outermost layer. It’s the application the human uses—the thing with a UI (or a CLI, for the terminally inclined). The host has three jobs:

  1. Manage the LLM — Send prompts, receive completions, handle the conversation
  2. Create and manage MCP clients — Spawn them, configure them, tear them down
  3. Enforce security policies — Decide what the model is allowed to do, get human approval when needed

The host is the only layer that directly interacts with the human. This makes it the trust anchor of the entire system. If the host is compromised, all bets are off. If the host is well-built, it can enforce security policies regardless of what clients or servers try to do.

Examples of hosts:

  • Claude Desktop — Anthropic’s desktop application
  • Claude Code — Anthropic’s CLI coding agent
  • VS Code + GitHub Copilot — Microsoft’s IDE with AI extensions
  • Cursor — AI-first code editor
  • Windsurf — Another AI-powered IDE
  • Your custom application — Anything you build that embeds an LLM

The Client

Each host contains one or more MCP clients. Each client maintains a stateful, 1:1 session with a single MCP server. This is a critical design decision—clients and servers are always paired.

The client’s responsibilities:

  1. Connection lifecycle — Connect to the server, perform initialization, handle disconnection
  2. Protocol compliance — Send well-formed JSON-RPC messages, handle responses
  3. Capability tracking — Remember what the server supports and what it doesn’t
  4. Message routing — Forward tool calls, resource requests, and other messages

Clients are not shared. If your host connects to three servers, it creates three clients. Each client knows about exactly one server. This isolation is intentional—it means a misbehaving server can’t interfere with other connections.

The Server

MCP servers are where the capabilities live. Each server is a focused program that exposes one or more of MCP’s three primitives:

  • Tools — Executable functions the model can call
  • Resources — Data the model can read
  • Prompts — Reusable interaction templates

Servers are intentionally lightweight. The reference filesystem server is a few hundred lines of code. The reference GitHub server wraps the GitHub API in a thin MCP layer. Servers don’t need to know about LLMs, conversation management, or user interfaces—they just expose capabilities and respond to requests.

The Connection Lifecycle

When a host wants to use an MCP server, the following dance occurs:

Step 1: Transport Setup

The host spawns or connects to the server using one of MCP’s transport mechanisms:

  • stdio — For local servers, the host spawns the server as a child process and communicates via stdin/stdout
  • Streamable HTTP — For remote servers, the client sends HTTP requests to the server’s endpoint
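For the stdio transport, the spawn-and-pipe mechanics can be sketched with nothing but the Python standard library: one JSON-RPC message per line on stdin, one per line back on stdout. This is an illustrative sketch, not a real MCP SDK client; to keep it runnable without a server installed, it spawns a stand-in child process that simply echoes each line back.

```python
import json
import subprocess
import sys

def spawn(command: list[str]) -> subprocess.Popen:
    """Spawn a local server as a child process with piped stdio."""
    return subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )

def send(proc: subprocess.Popen, message: dict) -> None:
    # stdio framing: one JSON-RPC message per line, no embedded newlines.
    proc.stdin.write(json.dumps(message) + "\n")
    proc.stdin.flush()

def receive(proc: subprocess.Popen) -> dict:
    return json.loads(proc.stdout.readline())

# Stand-in "server" (NOT a real MCP server): echoes every line back,
# so the transport mechanics can be exercised on any machine.
echo = spawn([sys.executable, "-c",
              "import sys\nfor line in sys.stdin: print(line, end='', flush=True)"])
send(echo, {"jsonrpc": "2.0", "id": 1, "method": "ping"})
reply = receive(echo)
echo.stdin.close()
echo.wait()
```

With a real server binary in place of the echo process, the same `send`/`receive` pair would carry the initialization handshake described next.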

Step 2: Initialization Handshake

Once the transport is up, the client and server perform a handshake. The client sends an initialize request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-11-25",
    "capabilities": {
      "roots": {
        "listChanged": true
      },
      "sampling": {}
    },
    "clientInfo": {
      "name": "MyApp",
      "version": "1.0.0"
    }
  }
}

The client declares:

  • What protocol version it speaks
  • What capabilities it supports (can it provide roots? Can it handle sampling requests?)
  • Who it is (client info)

The server responds with its own capabilities:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-11-25",
    "capabilities": {
      "tools": {
        "listChanged": true
      },
      "resources": {
        "subscribe": true,
        "listChanged": true
      },
      "prompts": {
        "listChanged": true
      }
    },
    "serverInfo": {
      "name": "filesystem-server",
      "version": "2.1.0"
    }
  }
}

The server declares:

  • What protocol version it speaks (must match or be compatible)
  • What capabilities it offers (tools? resources? prompts? subscriptions?)
  • Who it is (server info)

Step 3: Initialized Notification

After receiving the server’s response, the client sends an initialized notification to signal that the handshake is complete:

{
  "jsonrpc": "2.0",
  "method": "notifications/initialized"
}

This three-step handshake (initialize request → initialize response → initialized notification) establishes the session. After this, the client and server can exchange messages according to their negotiated capabilities.
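The three steps can be sketched as plain message construction plus one validation pass over the server's response. This is an illustrative sketch: `complete_handshake` simply rejects version mismatches, whereas real clients may negotiate an older mutually supported version.

```python
def initialize_request(request_id: int, version: str) -> dict:
    """Step 1: the client's initialize request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": version,
            "capabilities": {"roots": {"listChanged": True}, "sampling": {}},
            "clientInfo": {"name": "MyApp", "version": "1.0.0"},
        },
    }

def complete_handshake(response: dict, requested_version: str) -> dict:
    """Step 2: validate the server's response; return its declared capabilities."""
    result = response["result"]
    if result["protocolVersion"] != requested_version:
        raise RuntimeError("protocol version mismatch")
    return result["capabilities"]

def initialized_notification() -> dict:
    """Step 3: a notification -- no "id" field, so no response is expected."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}
```

Feeding the filesystem server's example response through `complete_handshake` yields the capability map the client must consult for the rest of the session.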

Step 4: Normal Operation

Now the fun begins. The client can:

  • List tools, resources, and prompts
  • Call tools
  • Read resources
  • Subscribe to resource changes
  • Receive notifications from the server

Step 5: Shutdown

Either side can terminate the connection. For stdio transports, the host can simply kill the server process. For HTTP transports, the client stops sending requests and the session expires.

Capability Negotiation: The Gentleman’s Agreement

One of MCP’s most elegant design decisions is capability negotiation. Instead of requiring all servers to implement everything, each server declares exactly what it supports.

A server that only provides tools:

{
  "capabilities": {
    "tools": {}
  }
}

A server that provides resources with subscription support:

{
  "capabilities": {
    "resources": {
      "subscribe": true,
      "listChanged": true
    }
  }
}

A server that provides everything:

{
  "capabilities": {
    "tools": { "listChanged": true },
    "resources": { "subscribe": true, "listChanged": true },
    "prompts": { "listChanged": true },
    "logging": {}
  }
}

Clients MUST NOT send requests for capabilities the server hasn’t declared. If a server doesn’t declare resources, the client must not send resources/list. This keeps servers simple—they only need to handle what they’ve opted into.

The listChanged flag deserves special attention. When set to true, it means the server will proactively notify clients when the list of available tools/resources/prompts changes at runtime. This enables dynamic scenarios where tools appear and disappear based on server state.
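A client can enforce this rule with a small gate consulted before every outgoing request. The sketch below is simplified (it checks only the capability prefix plus the `subscribe` sub-capability); a real client would cover every optional sub-capability.

```python
def allowed(method: str, server_capabilities: dict) -> bool:
    """Return True if the negotiated capabilities permit sending `method`."""
    prefix = method.split("/")[0]          # "resources/list" -> "resources"
    if prefix not in server_capabilities:
        return False                       # capability never declared
    if method == "resources/subscribe":
        # Subscriptions need the explicit sub-capability, not just "resources".
        return server_capabilities["resources"].get("subscribe", False)
    return True
```

A tools-only server (`{"tools": {}}`) thus passes `tools/call` but blocks `resources/list` before the request ever reaches the wire.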

Trust Boundaries

MCP defines clear trust boundaries between its three layers, and understanding them is critical for building secure applications.

Human
  ↕ (trusts)
Host
  ↕ (controls)
Client
  ↕ (does NOT inherently trust)
Server

The Host Trusts the Human

The host is the human’s agent. It acts on their behalf. When the host shows a confirmation dialog (“Allow this tool to delete files?”), it’s the human making the security decision.

The Host Controls the Client

The host creates the client, configures it, and can terminate it at will. The client operates within bounds set by the host. The host can intercept any message, block any tool call, and enforce any policy.

The Client Does NOT Inherently Trust the Server

This is the critical boundary. Servers are external. They could be malicious, buggy, or compromised. The client (and by extension, the host) must treat server-provided data with appropriate suspicion.

Specifically:

  • Tool descriptions could be misleading — A server could describe a tool as “read-only” when it actually deletes data
  • Tool annotations are hints, not guarantees — The readOnlyHint field is advisory, not enforced
  • Resource data could be manipulated — A server could return subtly altered file contents
  • Prompt templates could contain injection attacks — A server could craft prompts designed to manipulate the LLM

Good hosts implement defense in depth:

  1. Show the user what tools are available and let them approve usage
  2. Display tool inputs before execution (so the user can catch data exfiltration)
  3. Validate tool outputs before feeding them to the LLM
  4. Implement timeouts and rate limits
  5. Log everything for audit
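The approval and audit-logging layers can be sketched as a host-side wrapper around every tool call. `approve` and `execute` here are hypothetical host-supplied callables, not part of any MCP SDK: `approve(name, args)` shows the call to the user and returns their decision, and `execute(name, args)` forwards the call to the right client.

```python
import json
import logging

log = logging.getLogger("mcp.audit")

def gated_tool_call(name: str, arguments: dict, approve, execute) -> dict:
    """Host-side gate: render the call for the user, log it, then execute.

    Displaying the full arguments before execution is what lets the user
    catch a tool being fed data it should not see.
    """
    rendered = json.dumps({"tool": name, "arguments": arguments}, indent=2)
    if not approve(name, arguments):
        log.info("denied: %s", rendered)
        return {"error": "tool call denied by user"}
    log.info("approved: %s", rendered)
    return execute(name, arguments)
```

Timeouts, rate limits, and output validation would wrap around `execute` in the same style, each as another independent layer.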

Servers Don’t Trust Clients Either

The trust boundary works both ways. A server exposing sensitive data should:

  • Authenticate incoming connections
  • Authorize requests based on client identity
  • Rate-limit to prevent abuse
  • Validate all inputs regardless of what the client claims about itself

Message Flow in Practice

Let’s trace a complete interaction. A user asks Claude Desktop: “What files are in the /tmp directory?”

1. User types question in Claude Desktop (Host)

2. Host sends the question to Claude (the LLM)
   along with the list of available tools from all connected MCP servers

3. Claude decides to use the "list_directory" tool
   and returns a tool_use response

4. Host receives the tool_use, finds the right MCP Client
   (the one connected to the filesystem server)

5. Client sends a tools/call request to the Server:
   {
     "jsonrpc": "2.0",
     "id": 42,
     "method": "tools/call",
     "params": {
       "name": "list_directory",
       "arguments": { "path": "/tmp" }
     }
   }

6. Server executes the tool (lists /tmp directory)
   and returns the result:
   {
     "jsonrpc": "2.0",
     "id": 42,
     "result": {
       "content": [{
         "type": "text",
         "text": "file1.txt\nfile2.log\nsubdir/"
       }]
     }
   }

7. Client forwards the result to the Host

8. Host feeds the tool result back to Claude

9. Claude generates a natural language response:
   "The /tmp directory contains file1.txt, file2.log,
    and a subdirectory called subdir."

10. Host displays the response to the user

Notice how the server never talks to the LLM directly. It doesn’t even know an LLM is involved. It receives a request, does its thing, and returns a result. The host orchestrates everything.

This separation is powerful. The same server works regardless of which LLM the host uses. The same server works in Claude Desktop, VS Code, Cursor, or your custom application. The server doesn’t need to change—only the host needs to know about the LLM.
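Behind steps 5 through 7 of that trace, the client keeps a table of in-flight requests keyed by JSON-RPC `id`, so each response can be matched to the request it answers. A minimal sketch of that bookkeeping (the class name and methods are invented for illustration, not taken from any SDK):

```python
import itertools

class PendingRequests:
    """Match JSON-RPC responses to in-flight requests by their `id`."""

    def __init__(self):
        self._next_id = itertools.count(1)   # monotonically increasing ids
        self._pending: dict[int, dict] = {}

    def start(self, method: str, params: dict) -> dict:
        """Build a request, remember it, and hand it back for sending."""
        request_id = next(self._next_id)
        request = {"jsonrpc": "2.0", "id": request_id,
                   "method": method, "params": params}
        self._pending[request_id] = request
        return request

    def resolve(self, response: dict) -> dict:
        """Pair an incoming response with the request that triggered it."""
        request = self._pending.pop(response["id"])
        return {"request": request, "result": response.get("result")}
```

Because ids are per-connection and each client owns exactly one connection, this table never has to disambiguate between servers.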

Local vs. Remote: Two Deployment Models

MCP supports two fundamentally different deployment models, and understanding the distinction is important.

Local Servers (stdio transport)

┌──────────────────────────────────┐
│          User's Machine          │
│                                  │
│  ┌──────────┐    ┌────────────┐ │
│  │   Host   │───→│ MCP Server │ │
│  │          │←───│ (child     │ │
│  │          │    │  process)  │ │
│  └──────────┘    └────────────┘ │
└──────────────────────────────────┘

Local servers run as child processes on the same machine as the host. The host spawns them, communicates via stdin/stdout, and kills them when done.

Advantages:

  • Zero network overhead — Communication is through OS pipes
  • No authentication needed — The server runs under the user’s own OS permissions
  • Simple deployment — Just install the server binary/package
  • Full local access — The server can read files, run commands, access local databases

Disadvantages:

  • Tied to one machine — Can’t be shared across devices
  • Resource consumption — Each server is a separate process
  • Platform-dependent — May need different builds for different OSes

Remote Servers (Streamable HTTP transport)

┌──────────────┐         ┌──────────────┐
│ User's       │  HTTPS  │ Remote       │
│ Machine      │────────→│ Server       │
│              │←────────│              │
│  ┌────────┐  │         │ ┌──────────┐ │
│  │  Host  │  │         │ │MCP Server│ │
│  └────────┘  │         │ └──────────┘ │
└──────────────┘         └──────────────┘

Remote servers run on a different machine (or in the cloud) and communicate over HTTP/HTTPS.

Advantages:

  • Shared across devices — Any client anywhere can connect
  • Centrally managed — Updates, monitoring, scaling in one place
  • Access to remote resources — Can talk to cloud APIs, databases, services
  • Multi-tenant — Serve multiple users from one deployment

Disadvantages:

  • Requires authentication — Must verify client identity
  • Network latency — Every request traverses the network
  • More complex deployment — Need hosting, TLS, monitoring
  • Security surface — Exposed to the internet

The MCP specification supports both models through its transport layer, and many real-world deployments use a mix of both.

How MCP Fits With Everything Else

MCP doesn’t exist in a vacuum. It fits into a broader ecosystem:

┌─────────────────────────────────────────────┐
│              Your AI Application             │
│                                              │
│  ┌─────────────┐  ┌──────────────────────┐  │
│  │ LLM API     │  │ MCP Client(s)        │  │
│  │ (Claude,    │  │                      │  │
│  │  GPT, etc.) │  │ Connects to MCP      │  │
│  │             │  │ servers for tools,   │  │
│  │ Provides:   │  │ resources, prompts   │  │
│  │ - Chat      │  │                      │  │
│  │ - Tool use  │  │                      │  │
│  │ - Vision    │  │                      │  │
│  └─────────────┘  └──────────────────────┘  │
│                                              │
│  ┌─────────────┐  ┌──────────────────────┐  │
│  │ Agent       │  │ Your Business Logic  │  │
│  │ Framework   │  │                      │  │
│  │ (optional)  │  │ Custom code that     │  │
│  │             │  │ orchestrates the LLM │  │
│  │ LangChain,  │  │ and MCP connections  │  │
│  │ CrewAI,     │  │                      │  │
│  │ etc.        │  │                      │  │
│  └─────────────┘  └──────────────────────┘  │
└─────────────────────────────────────────────┘

MCP is the standard interface to external capabilities. The LLM API is the standard interface to the model. Agent frameworks (if you use them) provide orchestration patterns. Your business logic ties it all together.

MCP doesn’t replace any of these—it complements them by standardizing the one part they all had to reinvent: talking to external tools and data sources.

Design Principles

MCP’s architecture reflects several deliberate design principles:

1. Servers Should Be Easy to Build

The simplest useful MCP server is about 30 lines of code. The protocol doesn’t require servers to implement complex state management, session handling, or authentication (though they can). This low barrier to entry is why the ecosystem has grown so quickly.

2. Hosts Control Everything

The host is always in charge. It decides which servers to connect to, what tools the model can see, whether to execute a tool call, and what to do with results. Servers propose; hosts dispose.

3. Isolation by Default

Each client-server pair is isolated. A misbehaving server can’t crash other connections. A slow server can’t block other tool calls. This isolation makes the system more resilient and easier to reason about.

4. Progressive Complexity

A server can start by exposing a single tool with no authentication. As needs grow, it can add resources, prompts, subscriptions, authentication, and more. You only pay for the complexity you need.

5. Protocol, Not Framework

MCP specifies the what (messages, semantics, behavior) but not the how (implementation, libraries, patterns). This means it can be implemented in any language, embedded in any framework, and adapted to any use case.

Summary

MCP’s architecture is a three-layer system (host → client → server) with clear trust boundaries, flexible capability negotiation, and support for both local and remote deployment. The host controls everything, clients are isolated connectors, and servers are focused capability providers.

The architecture is simple enough to understand in an afternoon and flexible enough to power everything from a weekend hack to an enterprise deployment. That’s not an accident—it’s the result of careful protocol design.

Next, let’s look at the wire format that makes it all work.