Chapter 14: Advanced Features
Beyond the Basics
Chapters 4–6 covered MCP’s three primitives: tools, resources, and prompts. Those are the headline features. But MCP has several advanced capabilities that become essential as you build more sophisticated integrations.
This chapter covers sampling, elicitation, roots, completion, logging, progress reporting, and tasks—the features that turn a simple tool server into a sophisticated AI integration.
Sampling: When Servers Need an LLM
Here’s a scenario: you’re building an MCP server that summarizes web pages. The server can fetch the web page, but it needs an LLM to actually summarize it. Does the server need its own LLM API key?
No. MCP has a feature called sampling that lets servers request LLM completions from the client. The server says “I have this text, please summarize it,” and the client routes the request to whatever LLM it’s using.
How Sampling Works
Server                 Client                 LLM
  │                      │                     │
  │── sampling/          │                     │
  │   createMessage ────→│                     │
  │                      │── API call ────────→│
  │                      │←── Completion ──────│
  │←── Result ───────────│                     │
The server sends a sampling/createMessage request to the client:
{
  "jsonrpc": "2.0",
  "id": "s1",
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize this article in 2-3 sentences:\n\n[article text here]"
        }
      }
    ],
    "maxTokens": 200,
    "modelPreferences": {
      "hints": [
        { "name": "claude-sonnet-4-5-20250929" }
      ],
      "speedPriority": 0.5,
      "costPriority": 0.8,
      "intelligencePriority": 0.3
    },
    "systemPrompt": "You are a concise summarizer."
  }
}
The client:
- Receives the request
- May show it to the user for approval (sampling gives servers indirect LLM access—this should be gated)
- Sends it to the LLM
- Returns the result
{
  "jsonrpc": "2.0",
  "id": "s1",
  "result": {
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "The article discusses the recent breakthrough in quantum computing..."
    },
    "model": "claude-sonnet-4-5-20250929",
    "stopReason": "endTurn"
  }
}
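On the server side, the Python SDK exposes sampling through the request context. Here's a minimal sketch of the summarizer described above, assuming the SDK's ctx.session.create_message helper; fetch_page is a hypothetical function that returns the page text:

from mcp.server.fastmcp import Context, FastMCP
from mcp.types import SamplingMessage, TextContent

mcp = FastMCP("summarizer")

@mcp.tool()
async def summarize_url(url: str, ctx: Context) -> str:
    """Fetch a web page and ask the client's LLM to summarize it."""
    article = await fetch_page(url)  # hypothetical helper returning page text

    # Ask the client to run a completion on our behalf; no API key needed.
    result = await ctx.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(
                    type="text",
                    text=f"Summarize this article in 2-3 sentences:\n\n{article}",
                ),
            )
        ],
        max_tokens=200,
        system_prompt="You are a concise summarizer.",
    )
    if isinstance(result.content, TextContent):
        return result.content.text
    return "Unexpected content type from sampling"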
Model Preferences
The server can hint at what kind of model it wants:
{
  "modelPreferences": {
    "hints": [
      { "name": "claude-sonnet-4-5-20250929" },
      { "name": "claude-haiku-4-5-20251001" }
    ],
    "speedPriority": 0.8,
    "costPriority": 0.9,
    "intelligencePriority": 0.3
  }
}
The hints are suggestions, not demands. The client chooses the actual model. The priority fields (0.0 to 1.0) express trade-offs: this request prioritizes speed and cost over intelligence, so a smaller, faster model would be appropriate.
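In Python, these preferences map to typed objects (a sketch assuming the SDK's mcp.types.ModelPreferences and ModelHint, which can be passed to create_message via its model_preferences parameter):

from mcp.types import ModelHint, ModelPreferences

# Favor a fast, cheap model; the client still makes the final choice.
prefs = ModelPreferences(
    hints=[
        ModelHint(name="claude-sonnet-4-5-20250929"),
        ModelHint(name="claude-haiku-4-5-20251001"),
    ],
    speedPriority=0.8,
    costPriority=0.9,
    intelligencePriority=0.3,
)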
When to Use Sampling
- Content transformation — Summarize, translate, reformat
- Intelligent processing — Extract entities, classify data, generate descriptions
- Agentic delegation — Let the server compose multi-step operations where some steps need LLM reasoning
Security Considerations
Sampling gives servers indirect access to the LLM. A malicious server could:
- Generate harmful content through the LLM
- Use the LLM to process stolen data
- Rack up LLM API costs
Hosts SHOULD:
- Require user approval for sampling requests
- Show the server’s prompt to the user
- Set limits on token usage
- Log all sampling requests
Elicitation: Asking the User
Sometimes a server needs information from the user—not the LLM, the actual human. Maybe it needs to confirm a destructive action, choose between options, or provide credentials.
Elicitation lets servers send questions to the user through the client. The client presents the question in its UI, collects the answer, and returns it to the server.
How Elicitation Works
Server sends an elicitation request:
{
  "jsonrpc": "2.0",
  "id": "e1",
  "method": "elicitation/create",
  "params": {
    "message": "Which database environment should I connect to?",
    "requestedSchema": {
      "type": "object",
      "properties": {
        "environment": {
          "type": "string",
          "enum": ["development", "staging", "production"],
          "description": "Target environment"
        }
      },
      "required": ["environment"]
    }
  }
}
The client shows this to the user (perhaps as a dropdown or radio buttons), and returns their choice:
{
  "jsonrpc": "2.0",
  "id": "e1",
  "result": {
    "action": "accept",
    "content": {
      "environment": "staging"
    }
  }
}
If the user declines:
{
  "jsonrpc": "2.0",
  "id": "e1",
  "result": {
    "action": "decline"
  }
}
Use Cases
- Configuration choices — “Which project should I work with?”
- Confirmation — “This will delete 500 records. Continue?”
- Input collection — “What’s your API key for this service?”
- Disambiguation — “I found 3 matching records. Which one?”
Schema Support
The requestedSchema uses JSON Schema to define what input the server expects. This lets clients render appropriate UI:
- String → text field
- String with enum → dropdown
- Boolean → checkbox
- Number → number input
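In the Python SDK, FastMCP wraps this exchange in a ctx.elicit helper that takes a Pydantic model as the schema. A minimal sketch of the environment question above, assuming that helper (mcp and Context as in the other examples in this chapter):

from typing import Literal

from pydantic import BaseModel, Field

class EnvironmentChoice(BaseModel):
    environment: Literal["development", "staging", "production"] = Field(
        description="Target environment"
    )

@mcp.tool()
async def run_migration(ctx: Context) -> str:
    """Run a migration, asking the user which environment to target."""
    result = await ctx.elicit(
        message="Which database environment should I connect to?",
        schema=EnvironmentChoice,
    )
    if result.action == "accept":
        return f"Migrating {result.data.environment}..."
    return "Migration cancelled by user"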
Roots: Workspace Context
Roots tell servers about the client’s workspace—what directories or repositories are relevant to the current session. This helps servers scope their operations to the right context.
How Roots Work
When a server wants to know about the workspace, it sends a roots/list request:
{
  "jsonrpc": "2.0",
  "id": "r1",
  "method": "roots/list"
}
The client responds:
{
  "jsonrpc": "2.0",
  "id": "r1",
  "result": {
    "roots": [
      {
        "uri": "file:///home/user/projects/my-app",
        "name": "My App"
      },
      {
        "uri": "file:///home/user/projects/shared-lib",
        "name": "Shared Library"
      }
    ]
  }
}
If the workspace changes (user opens a different folder, adds a workspace), the client sends a notification:
{
  "jsonrpc": "2.0",
  "method": "notifications/roots/list_changed"
}
Why Roots Matter
Without roots, a filesystem server has to guess where to look. With roots, it knows:
- What directories are relevant
- What the user is working on
- Where to scope searches and operations
A Git server uses roots to know which repositories to show. A lint server uses roots to know which files to check. A build server uses roots to know what to compile.
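On the server side, fetching roots is a single session call. A minimal sketch, assuming the SDK's ctx.session.list_roots helper; scan_directory is a hypothetical function:

from urllib.parse import urlparse

@mcp.tool()
async def find_todos(ctx: Context) -> str:
    """Scan the user's workspace for TODO comments."""
    result = await ctx.session.list_roots()

    findings = []
    for root in result.roots:
        path = urlparse(str(root.uri)).path  # roots are URIs, typically file://
        findings.extend(scan_directory(path, marker="TODO"))  # hypothetical helper
    return f"Found {len(findings)} TODOs across {len(result.roots)} roots"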
Completion: Autocomplete for Arguments
MCP supports autocompletion for prompt arguments and resource URI template parameters. When a user is filling in a prompt’s arguments, the client can request completions:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "completion/complete",
  "params": {
    "ref": {
      "type": "ref/prompt",
      "name": "query_table"
    },
    "argument": {
      "name": "table_name",
      "value": "us"
    }
  }
}
Response:
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "completion": {
      "values": ["users", "user_sessions", "user_preferences"],
      "hasMore": false,
      "total": 3
    }
  }
}
This works for resource template parameters too:
{
  "params": {
    "ref": {
      "type": "ref/resource",
      "uri": "postgres://localhost/mydb/tables/{table}/schema"
    },
    "argument": {
      "name": "table",
      "value": "ord"
    }
  }
}
Implementation
Servers implement completion by providing context-aware suggestions. In the Python SDK, FastMCP exposes a completion() decorator whose handler receives the reference being completed, the argument typed so far, and optional context. A sketch; get_connection is an assumed helper returning a SQLite connection:

from mcp.types import (
    Completion,
    CompletionArgument,
    CompletionContext,
    PromptReference,
    ResourceTemplateReference,
)

@mcp.completion()
async def complete_table_name(
    ref: PromptReference | ResourceTemplateReference,
    argument: CompletionArgument,
    context: CompletionContext | None,
) -> Completion | None:
    if isinstance(ref, PromptReference) and ref.name == "query_table" and argument.name == "table_name":
        # Query actual database tables that match the typed prefix
        conn = get_connection()  # assumed helper
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' AND name LIKE ?",
            (f"{argument.value}%",),
        ).fetchall()
        return Completion(values=[row[0] for row in tables], hasMore=False)
    return None
Logging: Server Diagnostics
Servers can send structured log messages to the client. This isn’t just “print debugging”—it’s a proper logging channel that clients can filter, display, and record.
Sending Log Messages
{
  "jsonrpc": "2.0",
  "method": "notifications/message",
  "params": {
    "level": "info",
    "logger": "weather-api",
    "data": "Fetching weather for London (cache miss)"
  }
}
Log Levels
MCP uses syslog severity levels:
| Level | Severity | Use For |
|---|---|---|
| debug | Lowest | Detailed diagnostic information |
| info | Low | Normal operation events |
| notice | Medium | Normal but noteworthy events |
| warning | Medium-High | Something unexpected, but handled |
| error | High | Something failed |
| critical | Very High | System component failure |
| alert | Very High | Immediate action needed |
| emergency | Highest | System is unusable |
Setting Log Level
Clients can control verbosity:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "logging/setLevel",
  "params": {
    "level": "warning"
  }
}
After this, the server should only send warning and above. This prevents verbose debug logging from overwhelming the client.
Implementation
from mcp.server.fastmcp import FastMCP, Context

mcp = FastMCP("my-server")

@mcp.tool()
async def complex_operation(data: str, ctx: Context) -> str:
    """Perform a complex operation."""
    await ctx.debug("Starting operation")
    await ctx.info(f"Processing {len(data)} bytes")
    try:
        result = await process(data)  # process() is an assumed helper
        await ctx.info(f"Operation complete: {result.summary}")
        return str(result)
    except TimeoutError:
        await ctx.warning("Operation timed out, retrying...")
        result = await process(data, timeout=60)
        return str(result)
    except Exception as e:
        await ctx.error(f"Operation failed: {e}")
        raise
Progress Reporting: Keeping Everyone Informed
For long-running operations, servers can report progress. This was covered briefly in Chapter 3, but it’s worth a closer look at implementation.
The Flow
1. Client includes a progress token in the request:
{
  "method": "tools/call",
  "params": {
    "name": "bulk_import",
    "arguments": { "file": "data.csv" },
    "_meta": { "progressToken": "import-1" }
  }
}
2. Server sends progress notifications:
{
  "method": "notifications/progress",
  "params": {
    "progressToken": "import-1",
    "progress": 250,
    "total": 1000,
    "message": "Importing row 250 of 1000..."
  }
}
3. When done, the server returns the normal tool result.
Implementation
@mcp.tool()
async def import_data(file_path: str, ctx: Context) -> str:
    """Import a large data file."""
    rows = load_csv(file_path)  # load_csv / insert_row are assumed helpers
    total = len(rows)

    for i, row in enumerate(rows):
        await insert_row(row)
        # Report progress every 100 rows
        if i % 100 == 0:
            await ctx.report_progress(i, total, f"Importing row {i} of {total}")

    await ctx.report_progress(total, total, "Import complete")
    return f"Successfully imported {total} rows"
Indeterminate Progress
When you don’t know the total, omit it:
{
  "method": "notifications/progress",
  "params": {
    "progressToken": "crawl-1",
    "progress": 42,
    "message": "Crawled 42 pages so far..."
  }
}
The client can show a spinner or counter instead of a progress bar.
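With the Context helper used in the earlier examples, indeterminate progress just means omitting the total; a sketch, where crawl is a hypothetical async page generator:

@mcp.tool()
async def crawl_site(start_url: str, ctx: Context) -> str:
    """Crawl a site without knowing how many pages it has."""
    pages_crawled = 0
    async for _page in crawl(start_url):  # hypothetical async crawler
        pages_crawled += 1
        # No total: the client falls back to a spinner or counter.
        await ctx.report_progress(pages_crawled, message=f"Crawled {pages_crawled} pages so far...")
    return f"Crawled {pages_crawled} pages"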
Combining Advanced Features
The real power comes from combining these features. Here's a sketch of a server that uses roots, elicitation, sampling, and progress reporting together. The session and context calls follow the Python SDK sketches above; find_in_project, get_surrounding_code, and apply_replacement are assumed project helpers:

from pydantic import BaseModel, Field
from mcp.types import SamplingMessage, TextContent

class ConfirmRefactor(BaseModel):
    proceed: bool = Field(description="Apply the replacements?")

@mcp.tool()
async def smart_refactor(pattern: str, replacement: str, ctx: Context) -> str:
    """Find and replace across the project, with AI-powered review of each change."""
    # Use roots to know where to search
    roots = await ctx.session.list_roots()
    matches = find_in_project(pattern, roots.roots)
    if not matches:
        return f"No matches found for '{pattern}'"

    # Elicit confirmation from the user before touching any files
    answer = await ctx.elicit(
        message=f"Found {len(matches)} matches for '{pattern}'. Review and apply?",
        schema=ConfirmRefactor,
    )
    if answer.action != "accept" or not answer.data.proceed:
        return "Refactor cancelled by user"

    total = len(matches)
    applied = 0
    for i, match in enumerate(matches):
        await ctx.report_progress(i, total, f"Reviewing {match.file}:{match.line}")
        # Use sampling to have the client's LLM review each change in context
        code = get_surrounding_code(match.file, match.line, radius=5)
        review = await ctx.session.create_message(
            messages=[
                SamplingMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text=f"Replacing '{pattern}' with '{replacement}' here:\n\n{code}\n\nSafe? Answer YES or NO.",
                    ),
                )
            ],
            max_tokens=5,
        )
        if isinstance(review.content, TextContent) and "YES" in review.content.text:
            apply_replacement(match, replacement)
            applied += 1

    return f"Applied {applied} of {total} replacements"
Tasks: Long-Running Operations (Experimental)
The 2025-11-25 spec revision introduced tasks, an experimental feature for operations that take too long for a simple request-response cycle.
The Problem Tasks Solve
Some tool calls finish in milliseconds (calculator, string manipulation). Others take minutes or hours (data pipeline, ML training, complex analysis). Without tasks, the client has to hold an open connection the entire time, and if it drops, the work is lost.
How Tasks Work
Tasks are durable state machines. Instead of waiting for a tool to finish, the server immediately returns a task ID. The client can then poll for status or wait for completion.
Step 1: Client makes a task-augmented request
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "train_model",
    "arguments": { "dataset": "training_data.csv" },
    "task": {
      "ttl": 3600
    }
  }
}
The task field with a ttl (time-to-live in seconds) tells the server to run this as a task.
Step 2: Server immediately returns a task reference
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "taskId": "task-abc123",
    "status": "working",
    "pollInterval": 5000,
    "message": "Model training started"
  }
}
The client doesn’t wait for the tool to finish. It gets a task ID and a suggested polling interval.
Step 3: Client polls for status
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tasks/get",
  "params": { "taskId": "task-abc123" }
}
Step 4: When the task completes, fetch the result
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tasks/result",
  "params": { "taskId": "task-abc123" }
}
Task Status Lifecycle
working → completed
working → failed
working → cancelled
working → input_required → working → ...
Tasks can also enter an input_required state, indicating the server needs more information (via elicitation) before proceeding.
Task Operations
| Method | Purpose |
|---|---|
| tasks/get | Poll task status |
| tasks/result | Get the final result (blocks until terminal state) |
| tasks/list | List all tasks (paginated) |
| tasks/cancel | Cancel a running task |
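Because tasks are experimental, SDK helpers are still settling, but the polling loop is simple enough to sketch against the wire protocol directly. Here send_request is a hypothetical helper that issues a JSON-RPC request over an existing MCP connection and returns its result object; the server's suggested pollInterval can seed poll_interval:

import asyncio

TERMINAL_STATES = {"completed", "failed", "cancelled"}

async def wait_for_task(send_request, task_id: str, poll_interval: float = 5.0) -> dict:
    """Poll tasks/get until the task reaches a terminal state, then fetch its result."""
    while True:
        status = await send_request("tasks/get", {"taskId": task_id})
        if status["status"] in TERMINAL_STATES:
            break
        # input_required means the server is waiting on elicitation; keep polling.
        await asyncio.sleep(poll_interval)
    return await send_request("tasks/result", {"taskId": task_id})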
Tool-Level Task Support
Tools declare their task support in their definition:
{
  "name": "train_model",
  "execution": {
    "taskSupport": "required"
  }
}
Options: "required" (must use tasks), "optional" (tasks available but not required), "forbidden" (tasks not supported).
Tasks are still experimental and may change, but they address a real gap in the protocol for long-running operations.
Elicitation URL Mode
The 2025-11-25 spec also added URL mode to elicitation. The original form mode (described above) is great for simple questions, but it can’t handle sensitive data like passwords or OAuth flows.
URL mode lets the server direct the user to a URL for out-of-band interaction:
{
  "jsonrpc": "2.0",
  "id": "e2",
  "method": "elicitation/create",
  "params": {
    "message": "Please authenticate with your identity provider",
    "requestedSchema": {
      "type": "url",
      "url": "https://auth.example.com/login?session=abc123"
    }
  }
}
The client opens the URL in the user’s browser. The server is notified when the user completes the flow via a notifications/elicitation/complete notification. The sensitive data (credentials, tokens) never passes through the MCP client—it goes directly from the user’s browser to the authentication server.
This is critical for security: form mode MUST NOT be used for passwords, tokens, or other sensitive data.
Summary
MCP’s advanced features transform servers from simple tool wrappers into sophisticated AI integrations:
- Sampling — Servers can request LLM completions without their own API keys
- Elicitation — Servers can ask the human for input and confirmation (form mode for simple data, URL mode for sensitive flows)
- Roots — Servers can discover the client’s workspace context
- Completion — Servers can provide autocomplete for arguments
- Logging — Servers can send structured diagnostic messages
- Progress — Servers can report progress on long operations
- Tasks — (Experimental) Durable state machines for long-running operations that survive connection drops
These features are all optional—a simple tool server doesn’t need any of them. But as your servers grow in sophistication, they become increasingly valuable.
Next: making sure it all works.