Table of contents
- MCP Servers and Clients: The Protocol That Connects AI to the Real World
- The Problem MCP Solves
- Concepts
- Architecture Overview
- Server & Client Components
- Server Side
- Client Side
- Transport Layer
- JSON-RPC: The Message Format
- Real-World Example: Support Agent
- Project Structure
- Building the Server
- Building the Client
- MCP Inspector
- Lifecycle of a Request
- How to Run the Project
- Summary
MCP Servers and Clients: The Protocol That Connects AI to the Real World
AI models are brilliant at reasoning, but they live in a box. They can't browse your filesystem, query your database, or call your internal APIs out of the box. Every team that wants to connect an LLM to real data ends up writing their own glue code: custom plugins, one-off integrations, and fragile wrappers that break the moment an API changes.
Model Context Protocol (MCP) is the answer to that chaos. It is an open protocol, originally designed by Anthropic and now community-driven, that defines a standard, transport-agnostic way for AI clients (like a coding assistant or an agent framework) to talk to servers that expose data and actions. Think of it as USB-C for AI context: one standard plug that works everywhere.
This article walks through how MCP is structured on both the server and the client side, explains every primitive with a concrete example, and ends with a fully runnable Python implementation of a support-ticket agent.
The Problem MCP Solves
Imagine you are building an AI-powered customer support agent. The agent needs to:
- Read an open ticket from your ticketing system
- Look up the customer's account details from a database
- Decide whether to escalate or close the ticket
- Write back a resolution note
Without a shared protocol, every one of those data sources requires a bespoke integration. With MCP, each data source publishes a small server that speaks a well-known language. The agent (the MCP client) connects to those servers and immediately knows how to discover and call everything they offer, with no custom glue needed.
Concepts
Before diving into code, here is a plain-English glossary of every term used in this article.
| Concept | Lives On | Purpose |
|---|---|---|
| Resource | Server | Expose read-only data (files, DB records, API snapshots) |
| Tool | Server | Expose callable actions with side effects |
| Prompt | Server | Expose reusable prompt templates |
| Root | Client | Tell the server which workspace paths the client owns |
| Sampling | Client | Let the server request an LLM completion through the client |
| Elicitation | Client | Let the server ask the user a question through the client UI |
MCP Host: the application that embeds an MCP client. Examples: Claude Desktop, VS Code extensions, custom agent frameworks.
Transport: the wire format used to carry messages between client and server. MCP supports stdio (process pipes) and HTTP + SSE (Server-Sent Events) out of the box.
JSON-RPC 2.0: the message envelope format MCP uses. Every request and response is a tiny JSON object with a method name, parameters, and an ID for pairing replies to requests.
FastMCP: a Python library that dramatically reduces the boilerplate needed to build an MCP server. It auto-generates the protocol scaffolding so you can focus on your business logic.
MCP Inspector: a developer tool (@modelcontextprotocol/inspector) that lets you connect to any MCP server and explore its resources, tools, and prompts interactively without writing a client.
Architecture Overview
graph TD
    Host["🖥️ MCP Host\n(e.g. Claude Desktop, Agent Framework)"]
    Client["📡 MCP Client\n(embedded in Host)"]
    ServerA["🗄️ MCP Server A\nTicketing System"]
    ServerB["🗄️ MCP Server B\nCustomer Database"]
    Host --> Client
    Client -- "stdio / HTTP+SSE (JSON-RPC)" --> ServerA
    Client -- "stdio / HTTP+SSE (JSON-RPC)" --> ServerB
    ServerA --> R1["📄 Resources\n(ticket data)"]
    ServerA --> T1["🔧 Tools\n(close_ticket, escalate)"]
    ServerA --> P1["💬 Prompts\n(resolution_template)"]
    ServerB --> R2["📄 Resources\n(account records)"]
    ServerB --> T2["🔧 Tools\n(update_account)"]
The host embeds one MCP client instance per server it wants to talk to. Each client opens a transport connection, performs a capability handshake, and then exchanges JSON-RPC messages for the lifetime of the session.
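That handshake is itself just JSON-RPC. As a rough sketch, here is the kind of initialize request a client might send when a session opens. Field names follow the MCP specification, but treat the exact capability payload and protocol version as illustrative assumptions rather than output copied from a real client:

```python
import json

# Illustrative initialize request a client might send when the session opens.
# Field names follow the MCP spec; the capability payload is a sketch, and the
# protocolVersion shown here is an assumption.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 0,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {
            "roots": {"listChanged": True},  # client can answer roots/list
            "sampling": {},                  # client can answer sampling requests
        },
        "clientInfo": {"name": "support-agent", "version": "0.1.0"},
    },
}

wire = json.dumps(initialize_request)
print(wire)
```

The capabilities block is what lets each side know which optional features (roots, sampling, elicitation) the other supports before any real traffic flows.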
Server & Client Components
The diagram below groups every primitive by the side that owns it. Each row shows the RPC method the caller uses on the left, and a full description of the primitive โ what it is, how it works, and a concrete example โ on the right.
flowchart TB
    subgraph SERVER["🗄️ MCP Server: primitives the client calls on it"]
        direction LR
        C1["resources/list\nresources/read"] --> RES["📄 Resources\nRead-only snapshots of data exposed by the server via a URI.\nThe client fetches them to give the model extra context.\ne.g. ticket://42 returns the full text of support ticket #42"]
        C2["tools/list\ntools/call"] --> TOOLS["🔧 Tools\nCallable functions that can have side effects.\nThe model asks the client to invoke them; the server runs the logic.\ne.g. close_ticket(id, resolution) writes to the database"]
        C3["prompts/list\nprompts/get"] --> PROMPTS["💬 Prompts\nReusable message templates stored centrally on the server.\nThe client fetches and injects them into the model's conversation.\ne.g. resolution_prompt returns a pre-written instruction for the LLM"]
    end
    subgraph CLIENT["📡 MCP Client: primitives the server calls back on it"]
        direction LR
        S1["roots/list"] --> ROOTS["🌱 Roots\nFilesystem paths or URIs the client considers its workspace.\nThe server queries them to know which directories it may access.\ne.g. file:///home/user/project tells the server its allowed scope"]
        S2["sampling/createMessage"] --> SAMPLING["🧠 Sampling\nA reverse call: the server asks the client to run an LLM completion.\nUseful when the server needs the model to summarise or classify data\nbefore returning a result, without managing model access itself"]
        S3["elicitation/create"] --> ELICITATION["❓ Elicitation\nA reverse call: the server asks the user a question through the client UI.\nUsed when the server needs a human decision the model cannot make.\ne.g. 'Which calendar should I add this event to?'"]
    end
The two panels reflect the two directions of the protocol. The top panel covers the common case: the client calling into the server to fetch data and invoke actions. The bottom panel covers the reverse direction (the server calling back into the client), which is less obvious but equally important.
Server Side
An MCP server exposes three kinds of primitives: Resources, Tools, and Prompts.
Resources
A resource is a read-only piece of data identified by a URI. Think of it like a REST GET endpoint: no side effects, just data. Resources are ideal for feeding context into the model: open files, database rows, API snapshots.
# Pseudocode: server declares a resource
@server.resource("ticket://{ticket_id}")
def get_ticket(ticket_id: str) -> str:
    ticket = db.query("SELECT * FROM tickets WHERE id = ?", ticket_id)
    return format_as_markdown(ticket)
The client can list all available resource templates with resources/list and then read a specific one with resources/read.
sequenceDiagram
participant C as MCP Client
participant S as MCP Server
C->>S: resources/list
S-->>C: [{uri: "ticket://{ticket_id}", name: "Support Ticket"}]
C->>S: resources/read {uri: "ticket://42"}
S-->>C: {contents: [{text: "# Ticket 42\nUser cannot log in..."}]}
Tools
A tool is a callable action: it can have side effects (write to a database, send an email, call an external API). The model asks the client to invoke a tool; the client calls the server; the server runs the logic and returns a result.
# Pseudocode: server declares a tool
@server.tool()
def close_ticket(ticket_id: str, resolution: str) -> dict:
    db.execute(
        "UPDATE tickets SET status='closed', resolution=? WHERE id=?",
        resolution, ticket_id
    )
    return {"success": True, "closed_ticket_id": ticket_id}
Tools are described with a JSON Schema so the model always knows exactly what parameters to pass.
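To make that concrete, here is a hand-written sketch of roughly the schema a server could advertise for close_ticket, plus a minimal client-side check of model-proposed arguments. The shape follows JSON Schema conventions, but the exact schema FastMCP generates may differ in detail, and check_arguments is a hypothetical helper, not part of any SDK:

```python
# Hypothetical sketch of the JSON Schema a server might advertise for the
# close_ticket tool; the schema FastMCP actually generates may differ in detail.
close_ticket_schema = {
    "type": "object",
    "properties": {
        "ticket_id": {"type": "string", "description": "ID of the ticket to close"},
        "resolution": {"type": "string", "description": "How the issue was resolved"},
    },
    "required": ["ticket_id", "resolution"],
}

def check_arguments(args: dict, schema: dict) -> list[str]:
    """Minimal required/type check, enough to show how a client could vet
    model-proposed arguments before sending tools/call. (Hypothetical helper.)"""
    errors = [f"missing: {key}" for key in schema["required"] if key not in args]
    for key, spec in schema["properties"].items():
        if key in args and spec["type"] == "string" and not isinstance(args[key], str):
            errors.append(f"wrong type: {key}")
    return errors

print(check_arguments({"ticket_id": "42"}, close_ticket_schema))
# → ['missing: resolution']
```

In practice a real client would hand the advertised schema to the model as part of the tool definition and rely on a full JSON Schema validator rather than this toy check.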
Prompts
A prompt is a reusable message template stored on the server. Instead of hard-coding instructions in every client, you centralise them on the server so they can be versioned and updated in one place.
# Pseudocode: server declares a prompt
@server.prompt()
def resolution_template(ticket_id: str, tone: str = "professional") -> str:
    return f"Write a {tone} resolution note for ticket {ticket_id}."
The client fetches prompts with prompts/list and prompts/get, then injects the returned messages into the model's conversation.
Client Side
The client side of MCP is less talked about but equally important. Clients expose three capabilities back to the server: Roots, Sampling, and Elicitation.
Roots
Roots tell the server which filesystem paths or URIs the client considers its workspace. This lets a file-system server know which directories it is allowed to read from, without the client having to repeat that information on every request.
# Pseudocode: client declares its roots during initialisation
client.roots = [
    {"uri": "file:///home/user/project", "name": "Current Project"}
]
When the server calls roots/list, the client returns these entries. The server can then scope its resource URIs accordingly.
Sampling
Sampling allows the server to request an LLM completion through the client. This sounds backwards at first (why would a server need the model?), but it enables powerful patterns like a server that uses the model to auto-summarise retrieved documents before returning them.
# Pseudocode: server requests sampling from the client
response = await session.create_message(
    messages=[{"role": "user", "content": "Summarise this ticket in one sentence."}],
    max_tokens=100
)
summary = response.content.text
The client controls which model is used and can apply its own safety filters before forwarding the completion result back to the server.
Elicitation
Elicitation lets the server ask the user a direct question through the client's UI when it needs clarification that the model cannot provide on its own. For example, a server that manages calendar events might need to ask "Which calendar should I add this to?" before proceeding.
# Pseudocode: server requests user input via elicitation
answer = await session.elicit(
    message="Which priority should this ticket be assigned?",
    schema={"type": "string", "enum": ["low", "medium", "high"]}
)
The client surfaces the question in its UI, collects the user's answer, and sends it back to the server as a structured value.
Transport Layer
MCP is transport-agnostic. The two standard options are:
stdio: the client spawns the server as a child process and communicates via standard input/output. This is the simplest option and works great for local tools.
HTTP + SSE: the server runs as an HTTP service. The client sends requests over HTTP POST and receives streaming responses via Server-Sent Events. This is the right choice for remote or shared servers.
graph LR
subgraph Local["Local (stdio)"]
C1[Client] -- "stdin/stdout\nJSON-RPC" --> S1[Server Process]
end
subgraph Remote["Remote (HTTP + SSE)"]
C2[Client] -- "HTTP POST\n/messages" --> S2[HTTP Server]
S2 -- "SSE stream\n/events" --> C2
end
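To see what the SSE leg looks like on the wire, here is a minimal sketch of parsing an event stream into per-event data payloads. It is deliberately simplified (it ignores the event:, id:, and retry: fields, which a real client must handle) and is an illustration of the format, not the mcp SDK's transport code:

```python
def parse_sse(stream: str) -> list[str]:
    """Split a raw Server-Sent Events stream into each event's data payload.
    Events end with a blank line; payload lines start with 'data:'.
    Simplified sketch: ignores 'event:', 'id:', and 'retry:' fields."""
    events = []
    for block in stream.split("\n\n"):
        data_lines = [line[len("data:"):].lstrip()
                      for line in block.splitlines()
                      if line.startswith("data:")]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

# Two JSON-RPC messages as they might arrive on an SSE stream.
raw = (
    'data: {"jsonrpc": "2.0", "id": 1, "result": {}}\n'
    '\n'
    'data: {"jsonrpc": "2.0", "method": "notifications/resources/updated"}\n'
    '\n'
)
for payload in parse_sse(raw):
    print(payload)
```

Each payload is then handed to the JSON-RPC layer described in the next section; the transport never needs to understand the messages it carries.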
JSON-RPC: The Message Format
Every MCP message is a JSON-RPC 2.0 envelope. JSON-RPC is not specific to MCP: it is a tiny, well-understood standard that defines three message shapes:
Request: the caller sends a method name, parameters, and a unique ID.
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "close_ticket",
    "arguments": { "ticket_id": "42", "resolution": "Password reset completed." }
  }
}
Response: the receiver sends back the result (or an error) with the same ID.
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": { "success": true, "closed_ticket_id": "42" }
}
Notification: a one-way message with no ID and no expected reply (used for events like notifications/resources/updated).
That's the entire format. MCP builds its whole vocabulary of methods (resources/list, tools/call, prompts/get, etc.) on top of these three shapes.
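The ID is what makes request/response pairing work over a single connection. Here is a minimal sketch of how a client might track pending requests and match replies to them; send and receive are hypothetical helpers for illustration, not part of the mcp SDK:

```python
import json
from itertools import count

pending = {}     # id -> method name we are waiting on
_ids = count(1)  # monotonically increasing request ids

def send(method: str, params: dict) -> str:
    """Build a JSON-RPC request and remember its id until the reply arrives."""
    req_id = next(_ids)
    pending[req_id] = method
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "method": method, "params": params})

def receive(raw: str):
    """Route an incoming message: notifications have no id, replies do."""
    msg = json.loads(raw)
    if "id" not in msg:
        return ("notification", msg["method"])
    method = pending.pop(msg["id"])  # pair the reply to its request
    return (method, msg.get("result"))

send("tools/call", {"name": "close_ticket",
                    "arguments": {"ticket_id": "42", "resolution": "done"}})
print(receive('{"jsonrpc": "2.0", "id": 1, "result": {"success": true}}'))
# → ('tools/call', {'success': True})
```

Because replies carry the original ID, multiple requests can be in flight at once over one stream without any ambiguity about which answer belongs to which call.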
Real-World Example: Support Agent
The scenario: a support agent that reads open tickets, fetches the customer's account tier, and closes tickets with an AI-generated resolution note.
sequenceDiagram
participant Agent as Agent (LLM)
participant Client as MCP Client
participant Server as MCP Server (Ticketing)
Agent->>Client: Read ticket 42
Client->>Server: resources/read {uri: "ticket://42"}
Server-->>Client: Ticket content (markdown)
Client-->>Agent: Ticket context injected
Agent->>Client: Call close_ticket(42, "Password reset completed.")
Client->>Server: tools/call {name: "close_ticket", arguments: {...}}
Server-->>Client: {success: true}
Client-->>Agent: Tool result returned
Project Structure
mcp-support-agent/
├── server/
│   ├── server.py      # MCP server: exposes ticket resources and tools
│   └── fake_db.py     # In-memory fake ticket database
├── client/
│   └── client.py      # MCP client: connects to server, runs agent loop
├── requirements.txt   # Python dependencies
└── README.md
Building the Server
Install Dependencies
pip install fastmcp mcp
fastmcp is used on the server side: it wraps the lower-level mcp SDK with a decorator-driven API that eliminates most of the protocol boilerplate. The client uses the mcp SDK directly, which gives you full control over the session lifecycle.
Server Code
server/fake_db.py is a simple in-memory store so the example stays self-contained:
# server/fake_db.py
TICKETS: dict[str, dict] = {
    "42": {
        "id": "42",
        "subject": "Cannot log in after password change",
        "customer": "alice@example.com",
        "account_tier": "pro",
        "status": "open",
        "resolution": None,
    },
    "43": {
        "id": "43",
        "subject": "Invoice PDF not loading",
        "customer": "bob@example.com",
        "account_tier": "free",
        "status": "open",
        "resolution": None,
    },
}

def get_ticket(ticket_id: str) -> dict | None:
    return TICKETS.get(ticket_id)

def list_open_tickets() -> list[dict]:
    return [t for t in TICKETS.values() if t["status"] == "open"]

def close_ticket(ticket_id: str, resolution: str) -> bool:
    ticket = TICKETS.get(ticket_id)
    if not ticket:
        return False
    ticket["status"] = "closed"
    ticket["resolution"] = resolution
    return True
server/server.py is the MCP server built with FastMCP:
# server/server.py
import json
import sys
import os
from pydantic import BaseModel
from mcp.server.fastmcp import FastMCP, Context
from mcp.types import SamplingMessage, TextContent
# Ensure fake_db is importable when the server is launched as a subprocess.
sys.path.insert(0, os.path.dirname(__file__))
import fake_db
# FastMCP creates the server and handles all protocol scaffolding.
mcp = FastMCP("support-ticketing-server")
# ── Elicitation schema ────────────────────────────────────────────────────────
#
# ❓ ELICITATION: lets the server ask the user a structured question through
# the client's UI. The server defines a Pydantic model that describes the
# expected answer shape. FastMCP validates the user's response against this
# schema before returning it to the server.
#
# This model is used inside escalate_ticket to ask the user to confirm the
# escalation priority before the ticket is flagged.
class PriorityInput(BaseModel):
    priority: str  # expected: "low", "medium", or "high"

# ── Resources ─────────────────────────────────────────────────────────────────
@mcp.resource("ticket://list", name="Open tickets list", description="Lists all open support tickets")
def resource_list_open_tickets() -> str:
    # 🌱 ROOTS: roots are filesystem paths or URIs the client declares as its
    # workspace during the initialize() handshake. When needed, the server can
    # call ctx.request_context.session.list_roots() from inside a Context-aware
    # function to discover them. Resource functions without a Context parameter
    # can't make that call directly: roots are available in tools and prompts
    # that accept ctx: Context.
    tickets = fake_db.list_open_tickets()
    lines = [f"#{t['id']} [{t['status']}] {t['subject']} ({t['customer']})" for t in tickets]
    return "\n".join(lines) if lines else "No open tickets."
@mcp.resource("ticket://{ticket_id}", name="Ticket details", description="Full details for a single ticket")
def resource_get_ticket(ticket_id: str) -> str:
    ticket = fake_db.get_ticket(ticket_id)
    if not ticket:
        return f"Ticket #{ticket_id} not found."
    return json.dumps(ticket, indent=2)
# ── Tools ─────────────────────────────────────────────────────────────────────
@mcp.tool(description="Close a support ticket with a resolution message")
def close_ticket_tool(ticket_id: str, resolution: str) -> str:
    """
    Closes a ticket with a resolution note.

    Args:
        ticket_id: The ID of the ticket to close.
        resolution: A plain-text description of how the issue was resolved.
    """
    success = fake_db.close_ticket(ticket_id, resolution)
    if success:
        return json.dumps({"success": True, "ticket_id": ticket_id})
    return json.dumps({"success": False, "error": f"Ticket #{ticket_id} not found."})
@mcp.tool(description="Escalate a support ticket, asking the client for priority via elicitation and generating a summary via sampling")
async def escalate_ticket(ticket_id: str, reason: str, ctx: Context) -> str:
    """
    Flags a ticket for human escalation.

    Args:
        ticket_id: The ticket to escalate.
        reason: Why this ticket needs human attention.
    """
    ticket = fake_db.get_ticket(ticket_id)
    if not ticket:
        return json.dumps({"success": False, "error": f"Ticket #{ticket_id} not found."})

    # ❓ ELICITATION: the server pauses execution and sends a structured
    # question to the user through the client's UI. The client collects the
    # answer, validates it against PriorityInput, and returns it here.
    # result.action is "accept", "decline", or "cancel".
    # result.data holds the validated PriorityInput instance when accepted.
    elicit_result = await ctx.elicit(
        message=f"Ticket #{ticket_id} needs escalation. What priority should it be set to? (low / medium / high)",
        schema=PriorityInput,
    )
    priority = elicit_result.data.priority if elicit_result.action == "accept" and elicit_result.data else "medium"

    # 🧠 SAMPLING: the server asks the client to run an LLM completion on
    # its behalf. The client controls which model is used and applies its own
    # safety filters. The server receives back the generated text.
    # Here we use sampling to draft a human-readable escalation summary that
    # could be posted to an internal queue or sent as a notification.
    sampling_result = await ctx.request_context.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(
                    type="text",
                    text=(
                        f"Summarise the following escalation in one sentence for an internal ops note.\n"
                        f"Ticket #{ticket_id}: {ticket['subject']}\n"
                        f"Reason: {reason}"
                    ),
                ),
            )
        ],
        max_tokens=128,
    )
    note_content = sampling_result.content
    escalation_note = note_content.text if hasattr(note_content, "text") else str(note_content)

    return json.dumps({
        "success": True,
        "ticket_id": ticket_id,
        "priority": priority,
        "escalation_note": escalation_note,
    })
# ── Prompts ───────────────────────────────────────────────────────────────────
@mcp.prompt(description="Generate a resolution message for a support ticket")
def resolution_prompt(ticket_id: str, account_tier: str) -> str:
    """
    Returns a prompt template for generating a resolution note.

    Args:
        ticket_id: The ticket being resolved.
        account_tier: 'free' or 'pro'; adjusts the tone accordingly.
    """
    ticket = fake_db.get_ticket(ticket_id)
    if not ticket:
        return f"No ticket found with ID {ticket_id}."
    return (
        f"You are a support agent. Write a professional resolution email for the following ticket.\n\n"
        f"Ticket #{ticket['id']}: {ticket['subject']}\n"
        f"Customer: {ticket['customer']} (account tier: {account_tier})\n\n"
        f"Be concise, empathetic, and provide clear next steps."
    )

if __name__ == "__main__":
    mcp.run()
Building the Client
Client Code
# client/client.py
import asyncio
import json
from typing import Any
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from mcp.shared.context import RequestContext
from mcp.types import (
CreateMessageRequestParams,
CreateMessageResult,
TextContent,
ElicitRequestParams,
ElicitResult,
)
# 🧠 SAMPLING HANDLER: sampling is a reverse call in which the server asks the
# client to run an LLM completion on its behalf. The client registers this
# handler so that when server.py calls ctx.request_context.session.create_message(),
# the request is routed here. The client controls which model runs; the server
# never touches API keys. In this example we simulate the LLM response with a
# fixed string; in production you would call OpenAI, Anthropic, or any other
# provider here.
async def sampling_handler(
    context: RequestContext[ClientSession, Any],
    request: CreateMessageRequestParams,
) -> CreateMessageResult:
    prompt_text = " ".join(
        m.content.text for m in request.messages if m.content.type == "text"
    )
    print(f"\n🧠 [Sampling] Server requested LLM completion for: {prompt_text!r}")
    # Simulated LLM response: replace with a real model call in production.
    simulated_response = "Ticket escalated due to login failure; requires urgent review."
    print(f"   → Simulated response: {simulated_response!r}")
    return CreateMessageResult(
        role="assistant",
        content=TextContent(type="text", text=simulated_response),
        model="simulated-model",
        stopReason="endTurn",
    )
# ❓ ELICITATION HANDLER: elicitation is a reverse call in which the server
# pauses execution and asks the user a question through the client's UI. The
# client registers this handler so that when server.py calls ctx.elicit(), the
# request is routed here. The handler collects the user's answer and sends it
# back to the server as structured data matching the schema the server declared.
# In this example we simulate the user picking "high" priority.
async def elicitation_handler(
    context: RequestContext[ClientSession, Any],
    request: ElicitRequestParams,
) -> ElicitResult:
    print(f"\n❓ [Elicitation] Server asks: {request.message!r}")
    # In a real client this would open a dialog or prompt the user in the UI.
    # Here we simulate the user choosing "high" priority.
    user_input = "high"
    print(f"   → Simulated user input: {user_input!r}")
    return ElicitResult(
        action="accept",
        content={"priority": user_input},
    )
SERVER_SCRIPT = "../server/server.py"
async def run_support_agent():
    """
    Agent loop that demonstrates all three client-side MCP primitives:

    🌱 Roots: negotiated at initialisation so the server knows our workspace.
    🧠 Sampling: handled by sampling_handler when the server requests LLM completions.
    ❓ Elicitation: handled by elicitation_handler when the server needs user input.
    """
    server_params = StdioServerParameters(
        command="python",
        args=[SERVER_SCRIPT],
    )
    async with stdio_client(server_params) as (read_stream, write_stream):
        async with ClientSession(
            read_stream,
            write_stream,
            # 🧠 SAMPLING: register the handler the server will call when it
            # needs an LLM completion. Without this, ctx.session.create_message()
            # on the server would fail with a capability-not-supported error.
            sampling_callback=sampling_handler,
            # ❓ ELICITATION: register the handler the server will call when it
            # needs to ask the user a structured question. Without this,
            # ctx.elicit() on the server would fail.
            elicitation_callback=elicitation_handler,
        ) as session:
            # ── Handshake ────────────────────────────────────────────────────
            # 🌱 ROOTS: roots are negotiated as part of client capabilities
            # during the initialize() handshake. The server can then call
            # ctx.session.list_roots() at any time to discover which workspace
            # paths this client considers its own.
            await session.initialize()
            print("✅ Connected to MCP server\n")
            # ── Discover capabilities ────────────────────────────────────────
            tools_result = await session.list_tools()
            print("🔧 Available tools:")
            for tool in tools_result.tools:
                print(f"  • {tool.name}: {tool.description}")

            resources_result = await session.list_resources()
            print("\n📄 Available resources:")
            for res in resources_result.resources:
                print(f"  • {res.uri}: {res.name}")

            prompts_result = await session.list_prompts()
            print("\n💬 Available prompts:")
            for prompt in prompts_result.prompts:
                print(f"  • {prompt.name}: {prompt.description}")
            print()
            # ── Step 1: Read the open tickets list resource ──────────────────
            # This triggers resource_list_open_tickets on the server.
            list_resource = await session.read_resource("ticket://list")
            print("📋 Open tickets:\n")
            for content in list_resource.contents:
                print(content.text)

            # ── Step 2: Read a specific ticket resource ──────────────────────
            ticket_id = "42"
            ticket_resource = await session.read_resource(f"ticket://{ticket_id}")
            print(f"\n📄 Full details for ticket #{ticket_id}:\n")
            for content in ticket_resource.contents:
                print(content.text)
            # ── Step 3: Fetch the resolution prompt template ─────────────────
            prompt_result = await session.get_prompt(
                "resolution_prompt",
                arguments={"ticket_id": ticket_id, "account_tier": "pro"},
            )
            print("\n💬 Prompt template from server:")
            for msg in prompt_result.messages:
                print(f"  [{msg.role}] {msg.content.text}")
            # ── Step 4: Call the close_ticket tool ───────────────────────────
            # In a real agent the resolution text would come from an LLM call
            # using the prompt template above. Here we use a fixed string.
            resolution = (
                "Hi, we have reset your authentication token and confirmed "
                "your login is working again. Please reach out if you run into "
                "any further issues. - Support Team"
            )
            print(f"\n🔧 Closing ticket #{ticket_id}...")
            tool_result = await session.call_tool(
                "close_ticket_tool",
                arguments={"ticket_id": ticket_id, "resolution": resolution},
            )
            result_data = json.loads(tool_result.content[0].text)
            if result_data.get("success"):
                print(f"✅ Ticket #{result_data['ticket_id']} closed successfully.")
            else:
                print(f"❌ Error: {result_data.get('error')}")
            # ── Step 5: Call the escalate_ticket tool ────────────────────────
            # Calling this triggers both elicitation and sampling on the server:
            #   ❓ Elicitation: server calls ctx.elicit() → routed to elicitation_handler
            #   🧠 Sampling: server calls ctx.session.create_message() → routed to sampling_handler
            print("\n🚨 Escalating ticket #43...")
            escalate_result = await session.call_tool(
                "escalate_ticket",
                arguments={
                    "ticket_id": "43",
                    "reason": "Customer unable to access invoice PDF after multiple attempts.",
                },
            )
            escalate_data = json.loads(escalate_result.content[0].text)
            if escalate_data.get("success"):
                print(f"✅ Ticket #43 escalated at {escalate_data['priority']} priority.")
                print(f"   Note: {escalate_data['escalation_note']}")
            else:
                print(f"❌ Error: {escalate_data.get('error')}")
if __name__ == "__main__":
    asyncio.run(run_support_agent())
Run the agent from the client/ directory:
python client.py
You should see the agent connect, discover capabilities, read tickets, close one, and escalate another, with the elicitation and sampling flows printing inline.
MCP Inspector
Before writing a client, you can explore any MCP server interactively using the MCP Inspector:
npx @modelcontextprotocol/inspector python server/server.py
This opens a browser UI where you can:
- Browse all resources and read their contents
- Browse all tools, fill in arguments, and invoke them
- Browse all prompts, supply arguments, and preview the rendered messages
The Inspector is invaluable during development: it lets you verify that your server exposes exactly the primitives you expect before you wire up a client or an LLM.
graph LR
    Inspector["🔍 MCP Inspector\n(browser UI)"] -- "stdio" --> Server["🗄️ MCP Server\n(your server.py)"]
    Inspector --> R["Browse Resources"]
    Inspector --> T["Invoke Tools"]
    Inspector --> P["Preview Prompts"]
Lifecycle of a Request
Putting everything together, here is the full lifecycle of a single agent action from the moment the LLM decides to close a ticket to the moment the result is returned:
sequenceDiagram
participant LLM as LLM (Agent Brain)
participant Host as Host Application
participant Client as MCP Client
participant Server as MCP Server
LLM->>Host: "Call close_ticket_tool with ticket_id=42"
Host->>Client: Translate decision to tool call
Client->>Server: JSON-RPC tools/call\n{name: "close_ticket_tool", arguments: {...}}
Server->>Server: Execute close_ticket() in fake_db
Server-->>Client: JSON-RPC response\n{success: true, closed_ticket_id: "42"}
Client-->>Host: Tool result
Host-->>LLM: "Tool returned: {success: true}"
LLM->>Host: "Ticket closed. Inform the user."
How to Run the Project
Follow these steps to run the example from scratch.
1. Clone or create the project structure
mcp-support-agent/
├── server/
│   ├── server.py
│   └── fake_db.py
├── client/
│   └── client.py
└── requirements.txt
2. Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
3. Install dependencies
pip install -r requirements.txt
Your requirements.txt should contain:
fastmcp
mcp
4. Run the client
The client spawns the server automatically as a subprocess via stdio, so you only need to run one command:
cd client
python client.py
You should see output like this:
โ
Connected to MCP server
๐ง Available tools:
โข close_ticket_tool: Close a support ticket with a resolution message
โข escalate_ticket: Escalate a support ticket...
๐ Available resources:
โข ticket://list: Open tickets list
๐ฌ Available prompts:
โข resolution_prompt: Generate a resolution message for a support ticket
๐ Open tickets:
#42 [open] Cannot log in after password change (alice@example.com)
#43 [open] Invoice PDF not loading (bob@example.com)
๐ Full details for ticket #42: ...
๐ฌ Prompt template from server: ...
๐ง Closing ticket #42...
โ
Ticket #42 closed successfully.
๐จ Escalating ticket #43...
โ [Elicitation] Server asks: 'Ticket #43 needs escalation...'
โ Simulated user input: 'high'
๐ง [Sampling] Server requested LLM completion for: '...'
โ Simulated response: 'Ticket escalated due to login failure; requires urgent review.'
โ
Ticket #43 escalated at high priority.
Note: Ticket escalated due to login failure; requires urgent review.
Summary
MCP is a clean, protocol-level answer to the messy problem of connecting AI models to real-world data and actions. The key ideas to take away:
- Servers expose three primitives: Resources (read-only data), Tools (callable actions), and Prompts (reusable message templates).
- Clients expose three capabilities back: Roots (workspace scope), Sampling (let the server request LLM completions), and Elicitation (let the server ask the user a question).
- All communication is JSON-RPC 2.0 over stdio or HTTP + SSE: simple, well-understood, and easy to debug.
- FastMCP removes most of the protocol boilerplate on the server side, letting you focus on business logic. The client uses the lower-level mcp SDK directly.
- MCP Inspector gives you a browser UI to explore and test any server before writing a single line of client code.
Once you have an MCP server running for a data source, every MCP-compatible host (Claude Desktop, custom agents, IDE extensions) can consume it immediately without any additional integration work. That is the real power of a shared protocol.
