Table of contents
- MCP Servers and Clients: The Protocol That Connects AI to the Real World
- The Problem MCP Solves
- Concepts
- Architecture Overview
- Server & Client Components
- Server Side
- Client Side
- Transport Layer
- JSON-RPC: The Message Format
- Real-World Example: Support Agent
- Project Structure
- Building the Server
- Building the Client
- MCP Inspector
- Lifecycle of a Request
- How to Run the Project
- Summary
MCP Servers and Clients: The Protocol That Connects AI to the Real World
AI models are brilliant at reasoning, but they live in a box. They can't browse your filesystem, query your database, or call your internal APIs out of the box. Every team that wants to connect an LLM to real data ends up writing their own glue code: custom plugins, one-off integrations, and fragile wrappers that break the moment an API changes.
Model Context Protocol (MCP) is the answer to that chaos. It is an open protocol, originally designed by Anthropic and now community-driven, that defines a standard, transport-agnostic way for AI clients (like a coding assistant or an agent framework) to talk to servers that expose data and actions. Think of it as USB-C for AI context: one standard plug that works everywhere.
This article walks through how MCP is structured on both the server and the client side, explains every primitive with a concrete example, and ends with a fully runnable Python implementation of a support-ticket agent.
The Problem MCP Solves
Imagine you are building an AI-powered customer support agent. The agent needs to:
- Read an open ticket from your ticketing system
- Look up the customer's account details from a database
- Decide whether to escalate or close the ticket
- Write back a resolution note
Without a shared protocol, every one of those data sources requires a bespoke integration. With MCP, each data source publishes a small server that speaks a well-known language. The agent (the MCP client) connects to those servers and immediately knows how to discover and call everything they offer, with no custom glue needed.
Concepts
Before diving into code, here is a plain-English glossary of every term used in this article.
| Concept | Lives On | Purpose |
|---|---|---|
| Resource | Server | Expose read-only data (files, DB records, API snapshots) |
| Tool | Server | Expose callable actions with side effects |
| Prompt | Server | Expose reusable prompt templates |
| Root | Client | Tell the server which workspace paths the client owns |
| Sampling | Client | Let the server request an LLM completion through the client |
| Elicitation | Client | Let the server ask the user a question through the client UI |
MCP Host: the application that embeds an MCP client. Examples: Claude Desktop, VS Code extensions, custom agent frameworks.
Transport: the wire format used to carry messages between client and server. MCP supports stdio (process pipes) and HTTP + SSE (Server-Sent Events) out of the box.
JSON-RPC 2.0: the message envelope format MCP uses. Every request and response is a tiny JSON object with a method name, parameters, and an ID for pairing replies to requests.
FastMCP: a Python library that dramatically reduces the boilerplate needed to build an MCP server. It auto-generates the protocol scaffolding so you can focus on your business logic.
MCP Inspector: a developer tool (@modelcontextprotocol/inspector) that lets you connect to any MCP server and explore its resources, tools, and prompts interactively without writing a client.
Architecture Overview
graph TD
    Host["🖥️ MCP Host\n(e.g. Claude Desktop, Agent Framework)"]
    Client["📡 MCP Client\n(embedded in Host)"]
    ServerA["🗄️ MCP Server A\nTicketing System"]
    ServerB["🗄️ MCP Server B\nCustomer Database"]
    Host --> Client
    Client -- "stdio / HTTP+SSE (JSON-RPC)" --> ServerA
    Client -- "stdio / HTTP+SSE (JSON-RPC)" --> ServerB
    ServerA --> R1["📄 Resources\n(ticket data)"]
    ServerA --> T1["🔧 Tools\n(close_ticket, escalate)"]
    ServerA --> P1["💬 Prompts\n(resolution_template)"]
    ServerB --> R2["📄 Resources\n(account records)"]
    ServerB --> T2["🔧 Tools\n(update_account)"]
The host embeds one MCP client instance per server it wants to talk to. Each client opens a transport connection, performs a capability handshake, and then exchanges JSON-RPC messages for the lifetime of the session.
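That handshake is itself just JSON-RPC. As a rough sketch, here is the kind of initialize request a client might send when a session opens. Field names follow the MCP specification, but treat the exact capability payload and protocol version as illustrative assumptions rather than output copied from a real client:

```python
import json

# Illustrative initialize request a client might send when the session opens.
# Field names follow the MCP spec; the capability payload is a sketch, and the
# protocolVersion shown here is an assumption.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 0,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {
            "roots": {"listChanged": True},  # client can answer roots/list
            "sampling": {},                  # client can answer sampling requests
        },
        "clientInfo": {"name": "support-agent", "version": "0.1.0"},
    },
}

wire = json.dumps(initialize_request)
print(wire)
```

The capabilities block is what lets each side know which optional features (roots, sampling, elicitation) the other supports before any real traffic flows.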
Server & Client Components
The diagram below groups every primitive by the side that owns it. Each row shows the RPC method the caller uses on the left, and a full description of the primitive โ what it is, how it works, and a concrete example โ on the right.
flowchart TB
    subgraph SERVER["🗄️ MCP Server: primitives the client calls on it"]
        direction LR
        C1["resources/list\nresources/read"] --> RES["📄 Resources\nRead-only snapshots of data exposed by the server via a URI.\nThe client fetches them to give the model extra context.\ne.g. ticket://42 returns the full text of support ticket #42"]
        C2["tools/list\ntools/call"] --> TOOLS["🔧 Tools\nCallable functions that can have side effects.\nThe model asks the client to invoke them; the server runs the logic.\ne.g. close_ticket(id, resolution) writes to the database"]
        C3["prompts/list\nprompts/get"] --> PROMPTS["💬 Prompts\nReusable message templates stored centrally on the server.\nThe client fetches and injects them into the model's conversation.\ne.g. resolution_prompt returns a pre-written instruction for the LLM"]
    end
    subgraph CLIENT["📡 MCP Client: primitives the server calls back on it"]
        direction LR
        S1["roots/list"] --> ROOTS["🌱 Roots\nFilesystem paths or URIs the client considers its workspace.\nThe server queries them to know which directories it may access.\ne.g. file:///home/user/project tells the server its allowed scope"]
        S2["sampling/createMessage"] --> SAMPLING["🧠 Sampling\nA reverse call: the server asks the client to run an LLM completion.\nUseful when the server needs the model to summarise or classify data\nbefore returning a result, without managing model access itself"]
        S3["elicitation/create"] --> ELICITATION["❓ Elicitation\nA reverse call: the server asks the user a question through the client UI.\nUsed when the server needs a human decision the model cannot make.\ne.g. 'Which calendar should I add this event to?'"]
    end
The two panels reflect the two directions of the protocol. The top panel covers the common case: the client calling into the server to fetch data and invoke actions. The bottom panel covers the reverse direction (the server calling back into the client), which is less obvious but equally important.
Server Side
An MCP server exposes three kinds of primitives: Resources, Tools, and Prompts.
Resources
A resource is a read-only piece of data identified by a URI. Think of it like a REST GET endpoint: no side effects, just data. Resources are ideal for feeding context into the model: open files, database rows, API snapshots.
# Pseudocode: server declares a resource
@server.resource("ticket://{ticket_id}")
def get_ticket(ticket_id: str) -> str:
    ticket = db.query("SELECT * FROM tickets WHERE id = ?", ticket_id)
    return format_as_markdown(ticket)
The client can list all available resource templates with resources/list and then read a specific one with resources/read.
sequenceDiagram
participant C as MCP Client
participant S as MCP Server
C->>S: resources/list
S-->>C: [{uri: "ticket://{ticket_id}", name: "Support Ticket"}]
C->>S: resources/read {uri: "ticket://42"}
S-->>C: {contents: [{text: "# Ticket 42\nUser cannot log in..."}]}
Tools
A tool is a callable action: it can have side effects (write to a database, send an email, call an external API). The model asks the client to invoke a tool; the client calls the server; the server runs the logic and returns a result.
# Pseudocode: server declares a tool
@server.tool()
def close_ticket(ticket_id: str, resolution: str) -> dict:
    db.execute(
        "UPDATE tickets SET status='closed', resolution=? WHERE id=?",
        resolution, ticket_id
    )
    return {"success": True, "closed_ticket_id": ticket_id}
Tools are described with a JSON Schema so the model always knows exactly what parameters to pass.
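To make that concrete, here is a hand-written sketch of roughly the schema a server could advertise for close_ticket, plus a minimal client-side check of model-proposed arguments. The shape follows JSON Schema conventions, but the exact schema FastMCP generates may differ in detail, and check_arguments is a hypothetical helper, not part of any SDK:

```python
# Hypothetical sketch of the JSON Schema a server might advertise for the
# close_ticket tool; the schema FastMCP actually generates may differ in detail.
close_ticket_schema = {
    "type": "object",
    "properties": {
        "ticket_id": {"type": "string", "description": "ID of the ticket to close"},
        "resolution": {"type": "string", "description": "How the issue was resolved"},
    },
    "required": ["ticket_id", "resolution"],
}

def check_arguments(args: dict, schema: dict) -> list[str]:
    """Minimal required/type check, enough to show how a client could vet
    model-proposed arguments before sending tools/call. (Hypothetical helper.)"""
    errors = [f"missing: {key}" for key in schema["required"] if key not in args]
    for key, spec in schema["properties"].items():
        if key in args and spec["type"] == "string" and not isinstance(args[key], str):
            errors.append(f"wrong type: {key}")
    return errors

print(check_arguments({"ticket_id": "42"}, close_ticket_schema))
# → ['missing: resolution']
```

In practice a real client would hand the advertised schema to the model as part of the tool definition and rely on a full JSON Schema validator rather than this toy check.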
Prompts
A prompt is a reusable message template stored on the server. Instead of hard-coding instructions in every client, you centralise them on the server so they can be versioned and updated in one place.
# Pseudocode: server declares a prompt
@server.prompt()
def resolution_template(ticket_id: str, tone: str = "professional") -> str:
    return f"Write a {tone} resolution note for ticket {ticket_id}."
The client fetches prompts with prompts/list and prompts/get, then injects the returned messages into the model's conversation.
Client Side
The client side of MCP is less talked about but equally important. Clients expose three capabilities back to the server: Roots, Sampling, and Elicitation.
Roots
Roots tell the server which filesystem paths or URIs the client considers its workspace. This lets a file-system server know which directories it is allowed to read from, without the client having to repeat that information on every request.
# Pseudocode: client declares its roots during initialisation
client.roots = [
    {"uri": "file:///home/user/project", "name": "Current Project"}
]
When the server calls roots/list, the client returns these entries. The server can then scope its resource URIs accordingly.
Sampling
Sampling allows the server to request an LLM completion through the client. This sounds backwards at first (why would a server need the model?), but it enables powerful patterns like a server that uses the model to auto-summarise retrieved documents before returning them.
# Pseudocode: server requests sampling from the client
response = await session.create_message(
    messages=[{"role": "user", "content": "Summarise this ticket in one sentence."}],
    max_tokens=100
)
summary = response.content.text
The client controls which model is used and can apply its own safety filters before forwarding the completion result back to the server.
Elicitation
Elicitation lets the server ask the user a direct question through the client's UI when it needs clarification that the model cannot provide on its own. For example, a server that manages calendar events might need to ask "Which calendar should I add this to?" before proceeding.
# Pseudocode: server requests user input via elicitation
answer = await session.elicit(
    message="Which priority should this ticket be assigned?",
    schema={"type": "string", "enum": ["low", "medium", "high"]}
)
The client surfaces the question in its UI, collects the user's answer, and sends it back to the server as a structured value.
Transport Layer
MCP is transport-agnostic. The two standard options are:
stdio: the client spawns the server as a child process and communicates via standard input/output. This is the simplest option and works great for local tools.
HTTP + SSE: the server runs as an HTTP service. The client sends requests over HTTP POST and receives streaming responses via Server-Sent Events. This is the right choice for remote or shared servers.
graph LR
subgraph Local["Local (stdio)"]
C1[Client] -- "stdin/stdout\nJSON-RPC" --> S1[Server Process]
end
subgraph Remote["Remote (HTTP + SSE)"]
C2[Client] -- "HTTP POST\n/messages" --> S2[HTTP Server]
S2 -- "SSE stream\n/events" --> C2
end
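To see what the SSE leg looks like on the wire, here is a minimal sketch of parsing an event stream into per-event data payloads. It is deliberately simplified (it ignores the event:, id:, and retry: fields, which a real client must handle) and is an illustration of the format, not the mcp SDK's transport code:

```python
def parse_sse(stream: str) -> list[str]:
    """Split a raw Server-Sent Events stream into each event's data payload.
    Events end with a blank line; payload lines start with 'data:'.
    Simplified sketch: ignores 'event:', 'id:', and 'retry:' fields."""
    events = []
    for block in stream.split("\n\n"):
        data_lines = [line[len("data:"):].lstrip()
                      for line in block.splitlines()
                      if line.startswith("data:")]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

# Two JSON-RPC messages as they might arrive on an SSE stream.
raw = (
    'data: {"jsonrpc": "2.0", "id": 1, "result": {}}\n'
    '\n'
    'data: {"jsonrpc": "2.0", "method": "notifications/resources/updated"}\n'
    '\n'
)
for payload in parse_sse(raw):
    print(payload)
```

Each payload is then handed to the JSON-RPC layer described in the next section; the transport never needs to understand the messages it carries.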
JSON-RPC: The Message Format
Every MCP message is a JSON-RPC 2.0 envelope. JSON-RPC is not specific to MCP: it is a tiny, well-understood standard that defines three message shapes:
Request: the caller sends a method name, parameters, and a unique ID.
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "close_ticket",
    "arguments": { "ticket_id": "42", "resolution": "Password reset completed." }
  }
}
Response: the receiver sends back the result (or an error) with the same ID.
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": { "success": true, "closed_ticket_id": "42" }
}
Notification: a one-way message with no ID and no expected reply (used for events like notifications/resources/updated).
That's the entire format. MCP builds its whole vocabulary of methods (resources/list, tools/call, prompts/get, etc.) on top of these three shapes.
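The ID is what makes request/response pairing work over a single connection. Here is a minimal sketch of how a client might track pending requests and match replies to them; send and receive are hypothetical helpers for illustration, not part of the mcp SDK:

```python
import json
from itertools import count

pending = {}     # id -> method name we are waiting on
_ids = count(1)  # monotonically increasing request ids

def send(method: str, params: dict) -> str:
    """Build a JSON-RPC request and remember its id until the reply arrives."""
    req_id = next(_ids)
    pending[req_id] = method
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "method": method, "params": params})

def receive(raw: str):
    """Route an incoming message: notifications have no id, replies do."""
    msg = json.loads(raw)
    if "id" not in msg:
        return ("notification", msg["method"])
    method = pending.pop(msg["id"])  # pair the reply to its request
    return (method, msg.get("result"))

send("tools/call", {"name": "close_ticket",
                    "arguments": {"ticket_id": "42", "resolution": "done"}})
print(receive('{"jsonrpc": "2.0", "id": 1, "result": {"success": true}}'))
# → ('tools/call', {'success': True})
```

Because replies carry the original ID, multiple requests can be in flight at once over one stream without any ambiguity about which answer belongs to which call.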
Real-World Example: Support Agent
The scenario: a support agent that reads open tickets, fetches the customer's account tier, and closes tickets with an AI-generated resolution note.
sequenceDiagram
participant Agent as Agent (LLM)
participant Client as MCP Client
participant Server as MCP Server (Ticketing)
Agent->>Client: Read ticket 42
Client->>Server: resources/read {uri: "ticket://42"}
Server-->>Client: Ticket content (markdown)
Client-->>Agent: Ticket context injected
Agent->>Client: Call close_ticket(42, "Password reset completed.")
Client->>Server: tools/call {name: "close_ticket", arguments: {...}}
Server-->>Client: {success: true}
Client-->>Agent: Tool result returned
Project Structure
mcp-support-agent/
├── server/
│   ├── server.py      # MCP server: exposes ticket resources and tools
│   └── fake_db.py     # In-memory fake ticket database
├── client/
│   └── client.py      # MCP client: connects to server, runs agent loop
├── requirements.txt   # Python dependencies
└── README.md
Building the Server
Install Dependencies
pip install fastmcp mcp
fastmcp is used on the server side: it wraps the lower-level mcp SDK with a decorator-driven API that eliminates most of the protocol boilerplate. The client uses the mcp SDK directly, which gives you full control over the session lifecycle.
Server Code
server/fake_db.py is a simple in-memory store so the example stays self-contained:
# server/fake_db.py
TICKETS: dict[str, dict] = {
    "42": {
        "id": "42",
        "subject": "Cannot log in after password change",
        "customer": "alice@example.com",
        "account_tier": "pro",
        "status": "open",
        "resolution": None,
    },
    "43": {
        "id": "43",
        "subject": "Invoice PDF not loading",
        "customer": "bob@example.com",
        "account_tier": "free",
        "status": "open",
        "resolution": None,
    },
}

def get_ticket(ticket_id: str) -> dict | None:
    return TICKETS.get(ticket_id)

def list_open_tickets() -> list[dict]:
    return [t for t in TICKETS.values() if t["status"] == "open"]

def close_ticket(ticket_id: str, resolution: str) -> bool:
    ticket = TICKETS.get(ticket_id)
    if not ticket:
        return False
    ticket["status"] = "closed"
    ticket["resolution"] = resolution
    return True
server/server.py is the MCP server built with FastMCP:
# server/server.py
import json
import sys
import os
from pydantic import BaseModel
from mcp.server.fastmcp import FastMCP, Context
from mcp.types import SamplingMessage, TextContent
# Ensure fake_db is importable when the server is launched as a subprocess.
sys.path.insert(0, os.path.dirname(__file__))
import fake_db
# FastMCP creates the server and handles all protocol scaffolding.
mcp = FastMCP("support-ticketing-server")
# ── Elicitation schema ────────────────────────────────────────────────────────
#
# ❓ ELICITATION: lets the server ask the user a structured question through
# the client's UI. The server defines a Pydantic model that describes the
# expected answer shape. FastMCP validates the user's response against this
# schema before returning it to the server.
#
# This model is used inside escalate_ticket to ask the user to confirm the
# escalation priority before the ticket is flagged.
class PriorityInput(BaseModel):
    priority: str  # expected: "low", "medium", or "high"

# ── Resources ─────────────────────────────────────────────────────────────────
@mcp.resource("ticket://list", name="Open tickets list", description="Lists all open support tickets")
def resource_list_open_tickets() -> str:
    # 🌱 ROOTS: roots are filesystem paths or URIs the client declares as its
    # workspace during the initialize() handshake. When needed, the server can
    # call ctx.request_context.session.list_roots() from inside a Context-aware
    # function to discover them. Resource functions without a Context parameter
    # can't make that call directly: roots are available in tools and prompts
    # that accept ctx: Context.
    tickets = fake_db.list_open_tickets()
    lines = [f"#{t['id']} [{t['status']}] {t['subject']} ({t['customer']})" for t in tickets]
    return "\n".join(lines) if lines else "No open tickets."
@mcp.resource("ticket://{ticket_id}", name="Ticket details", description="Full details for a single ticket")
def resource_get_ticket(ticket_id: str) -> str:
    ticket = fake_db.get_ticket(ticket_id)
    if not ticket:
        return f"Ticket #{ticket_id} not found."
    return json.dumps(ticket, indent=2)
# ── Tools ─────────────────────────────────────────────────────────────────────
@mcp.tool(description="Close a support ticket with a resolution message")
def close_ticket_tool(ticket_id: str, resolution: str) -> str:
    """
    Closes a ticket with a resolution note.

    Args:
        ticket_id: The ID of the ticket to close.
        resolution: A plain-text description of how the issue was resolved.
    """
    success = fake_db.close_ticket(ticket_id, resolution)
    if success:
        return json.dumps({"success": True, "ticket_id": ticket_id})
    return json.dumps({"success": False, "error": f"Ticket #{ticket_id} not found."})
@mcp.tool(description="Escalate a support ticket, asking the client for priority via elicitation and generating a summary via sampling")
async def escalate_ticket(ticket_id: str, reason: str, ctx: Context) -> str:
    """
    Flags a ticket for human escalation.

    Args:
        ticket_id: The ticket to escalate.
        reason: Why this ticket needs human attention.
    """
    ticket = fake_db.get_ticket(ticket_id)
    if not ticket:
        return json.dumps({"success": False, "error": f"Ticket #{ticket_id} not found."})

    # ❓ ELICITATION: the server pauses execution and sends a structured
    # question to the user through the client's UI. The client collects the
    # answer, validates it against PriorityInput, and returns it here.
    # result.action is "accept", "decline", or "cancel".
    # result.data holds the validated PriorityInput instance when accepted.
    elicit_result = await ctx.elicit(
        message=f"Ticket #{ticket_id} needs escalation. What priority should it be set to? (low / medium / high)",
        schema=PriorityInput,
    )
    priority = elicit_result.data.priority if elicit_result.action == "accept" and elicit_result.data else "medium"

    # 🧠 SAMPLING: the server asks the client to run an LLM completion on
    # its behalf. The client controls which model is used and applies its own
    # safety filters. The server receives back the generated text.
    # Here we use sampling to draft a human-readable escalation summary that
    # could be posted to an internal queue or sent as a notification.
    sampling_result = await ctx.request_context.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(
                    type="text",
                    text=(
                        f"Summarise the following escalation in one sentence for an internal ops note.\n"
                        f"Ticket #{ticket_id}: {ticket['subject']}\n"
                        f"Reason: {reason}"
                    ),
                ),
            )
        ],
        max_tokens=128,
    )
    note_content = sampling_result.content
    escalation_note = note_content.text if hasattr(note_content, "text") else str(note_content)

    return json.dumps({
        "success": True,
        "ticket_id": ticket_id,
        "priority": priority,
        "escalation_note": escalation_note,
    })
# ── Prompts ───────────────────────────────────────────────────────────────────
@mcp.prompt(description="Generate a resolution message for a support ticket")
def resolution_prompt(ticket_id: str, account_tier: str) -> str:
    """
    Returns a prompt template for generating a resolution note.

    Args:
        ticket_id: The ticket being resolved.
        account_tier: 'free' or 'pro'; adjusts the tone accordingly.
    """
    ticket = fake_db.get_ticket(ticket_id)
    if not ticket:
        return f"No ticket found with ID {ticket_id}."
    return (
        f"You are a support agent. Write a professional resolution email for the following ticket.\n\n"
        f"Ticket #{ticket['id']}: {ticket['subject']}\n"
        f"Customer: {ticket['customer']} (account tier: {account_tier})\n\n"
        f"Be concise, empathetic, and provide clear next steps."
    )

if __name__ == "__main__":
    mcp.run()
Building the Client
Client Code
# client/client.py
import asyncio
import json
from typing import Any
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from mcp.shared.context import RequestContext
from mcp.types import (
CreateMessageRequestParams,
CreateMessageResult,
TextContent,
ElicitRequestParams,
ElicitResult,
)
# 🧠 SAMPLING HANDLER: sampling is a reverse call in which the server asks the
# client to run an LLM completion on its behalf. The client registers this
# handler so that when server.py calls ctx.request_context.session.create_message(),
# the request is routed here. The client controls which model runs; the server
# never touches API keys. In this example we simulate the LLM response with a
# fixed string; in production you would call OpenAI, Anthropic, or any other
# provider here.
async def sampling_handler(
    context: RequestContext[ClientSession, Any],
    request: CreateMessageRequestParams,
) -> CreateMessageResult:
    prompt_text = " ".join(
        m.content.text for m in request.messages if m.content.type == "text"
    )
    print(f"\n🧠 [Sampling] Server requested LLM completion for: {prompt_text!r}")
    # Simulated LLM response: replace with a real model call in production.
    simulated_response = "Ticket escalated due to login failure; requires urgent review."
    print(f"   → Simulated response: {simulated_response!r}")
    return CreateMessageResult(
        role="assistant",
        content=TextContent(type="text", text=simulated_response),
        model="simulated-model",
        stopReason="endTurn",
    )
# ❓ ELICITATION HANDLER: elicitation is a reverse call in which the server
# pauses execution and asks the user a question through the client's UI. The
# client registers this handler so that when server.py calls ctx.elicit(), the
# request is routed here. The handler collects the user's answer and sends it
# back to the server as structured data matching the schema the server declared.
# In this example we simulate the user picking "high" priority.
async def elicitation_handler(
    context: RequestContext[ClientSession, Any],
    request: ElicitRequestParams,
) -> ElicitResult:
    print(f"\n❓ [Elicitation] Server asks: {request.message!r}")
    # In a real client this would open a dialog or prompt the user in the UI.
    # Here we simulate the user choosing "high" priority.
    user_input = "high"
    print(f"   → Simulated user input: {user_input!r}")
    return ElicitResult(
        action="accept",
        content={"priority": user_input},
    )
SERVER_SCRIPT = "../server/server.py"
async def run_support_agent():
    """
    Agent loop that demonstrates all three client-side MCP primitives:

    🌱 Roots: negotiated at initialisation so the server knows our workspace.
    🧠 Sampling: handled by sampling_handler when the server requests LLM completions.
    ❓ Elicitation: handled by elicitation_handler when the server needs user input.
    """
    server_params = StdioServerParameters(
        command="python",
        args=[SERVER_SCRIPT],
    )
    async with stdio_client(server_params) as (read_stream, write_stream):
        async with ClientSession(
            read_stream,
            write_stream,
            # 🧠 SAMPLING: register the handler the server will call when it
            # needs an LLM completion. Without this, ctx.session.create_message()
            # on the server would fail with a capability-not-supported error.
            sampling_callback=sampling_handler,
            # ❓ ELICITATION: register the handler the server will call when it
            # needs to ask the user a structured question. Without this,
            # ctx.elicit() on the server would fail.
            elicitation_callback=elicitation_handler,
        ) as session:
            # ── Handshake ────────────────────────────────────────────────────
            # 🌱 ROOTS: roots are negotiated as part of client capabilities
            # during the initialize() handshake. The server can then call
            # ctx.session.list_roots() at any time to discover which workspace
            # paths this client considers its own.
            await session.initialize()
            print("✅ Connected to MCP server\n")
            # ── Discover capabilities ────────────────────────────────────────
            tools_result = await session.list_tools()
            print("🔧 Available tools:")
            for tool in tools_result.tools:
                print(f"  • {tool.name}: {tool.description}")

            resources_result = await session.list_resources()
            print("\n📄 Available resources:")
            for res in resources_result.resources:
                print(f"  • {res.uri}: {res.name}")

            prompts_result = await session.list_prompts()
            print("\n💬 Available prompts:")
            for prompt in prompts_result.prompts:
                print(f"  • {prompt.name}: {prompt.description}")
            print()
            # ── Step 1: Read the open tickets list resource ──────────────────
            # This triggers resource_list_open_tickets on the server.
            list_resource = await session.read_resource("ticket://list")
            print("📋 Open tickets:\n")
            for content in list_resource.contents:
                print(content.text)

            # ── Step 2: Read a specific ticket resource ──────────────────────
            ticket_id = "42"
            ticket_resource = await session.read_resource(f"ticket://{ticket_id}")
            print(f"\n📄 Full details for ticket #{ticket_id}:\n")
            for content in ticket_resource.contents:
                print(content.text)
            # ── Step 3: Fetch the resolution prompt template ─────────────────
            prompt_result = await session.get_prompt(
                "resolution_prompt",
                arguments={"ticket_id": ticket_id, "account_tier": "pro"},
            )
            print("\n💬 Prompt template from server:")
            for msg in prompt_result.messages:
                print(f"  [{msg.role}] {msg.content.text}")
            # ── Step 4: Call the close_ticket tool ───────────────────────────
            # In a real agent the resolution text would come from an LLM call
            # using the prompt template above. Here we use a fixed string.
            resolution = (
                "Hi, we have reset your authentication token and confirmed "
                "your login is working again. Please reach out if you run into "
                "any further issues. - Support Team"
            )
            print(f"\n🔧 Closing ticket #{ticket_id}...")
            tool_result = await session.call_tool(
                "close_ticket_tool",
                arguments={"ticket_id": ticket_id, "resolution": resolution},
            )
            result_data = json.loads(tool_result.content[0].text)
            if result_data.get("success"):
                print(f"✅ Ticket #{result_data['ticket_id']} closed successfully.")
            else:
                print(f"❌ Error: {result_data.get('error')}")
            # ── Step 5: Call the escalate_ticket tool ────────────────────────
            # Calling this triggers both elicitation and sampling on the server:
            #   ❓ Elicitation: server calls ctx.elicit() → routed to elicitation_handler
            #   🧠 Sampling: server calls ctx.session.create_message() → routed to sampling_handler
            print("\n🚨 Escalating ticket #43...")
            escalate_result = await session.call_tool(
                "escalate_ticket",
                arguments={
                    "ticket_id": "43",
                    "reason": "Customer unable to access invoice PDF after multiple attempts.",
                },
            )
            escalate_data = json.loads(escalate_result.content[0].text)
            if escalate_data.get("success"):
                print(f"✅ Ticket #43 escalated at {escalate_data['priority']} priority.")
                print(f"   Note: {escalate_data['escalation_note']}")
            else:
                print(f"❌ Error: {escalate_data.get('error')}")
if __name__ == "__main__":
    asyncio.run(run_support_agent())
Run the agent from the client/ directory:
python client.py
You should see the agent connect, discover capabilities, read tickets, close one, and escalate another, with the elicitation and sampling flows printing inline.
MCP Inspector
Before writing a client, you can explore any MCP server interactively using the MCP Inspector:
npx @modelcontextprotocol/inspector python server/server.py
This opens a browser UI where you can:
- Browse all resources and read their contents
- Browse all tools, fill in arguments, and invoke them
- Browse all prompts, supply arguments, and preview the rendered messages
The Inspector is invaluable during development: it lets you verify that your server exposes exactly the primitives you expect before you wire up a client or an LLM.
graph LR
    Inspector["🔍 MCP Inspector\n(browser UI)"] -- "stdio" --> Server["🗄️ MCP Server\n(your server.py)"]
    Inspector --> R["Browse Resources"]
    Inspector --> T["Invoke Tools"]
    Inspector --> P["Preview Prompts"]
Lifecycle of a Request
Putting everything together, here is the full lifecycle of a single agent action from the moment the LLM decides to close a ticket to the moment the result is returned:
sequenceDiagram
participant LLM as LLM (Agent Brain)
participant Host as Host Application
participant Client as MCP Client
participant Server as MCP Server
LLM->>Host: "Call close_ticket_tool with ticket_id=42"
Host->>Client: Translate decision to tool call
Client->>Server: JSON-RPC tools/call\n{name: "close_ticket_tool", arguments: {...}}
Server->>Server: Execute close_ticket() in fake_db
Server-->>Client: JSON-RPC response\n{success: true, closed_ticket_id: "42"}
Client-->>Host: Tool result
Host-->>LLM: "Tool returned: {success: true}"
LLM->>Host: "Ticket closed. Inform the user."
How to Run the Project
Follow these steps to run the example from scratch.
1. Clone or create the project structure
mcp-support-agent/
├── server/
│   ├── server.py
│   └── fake_db.py
├── client/
│   └── client.py
└── requirements.txt
2. Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
3. Install dependencies
pip install -r requirements.txt
Your requirements.txt should contain:
fastmcp
mcp
4. Run the client
The client spawns the server automatically as a subprocess via stdio, so you only need to run one command:
cd client
python client.py
You should see output like this:
โ
Connected to MCP server
๐ง Available tools:
โข close_ticket_tool: Close a support ticket with a resolution message
โข escalate_ticket: Escalate a support ticket...
๐ Available resources:
โข ticket://list: Open tickets list
๐ฌ Available prompts:
โข resolution_prompt: Generate a resolution message for a support ticket
๐ Open tickets:
#42 [open] Cannot log in after password change (alice@example.com)
#43 [open] Invoice PDF not loading (bob@example.com)
๐ Full details for ticket #42: ...
๐ฌ Prompt template from server: ...
๐ง Closing ticket #42...
โ
Ticket #42 closed successfully.
๐จ Escalating ticket #43...
โ [Elicitation] Server asks: 'Ticket #43 needs escalation...'
โ Simulated user input: 'high'
๐ง [Sampling] Server requested LLM completion for: '...'
โ Simulated response: 'Ticket escalated due to login failure; requires urgent review.'
โ
Ticket #43 escalated at high priority.
Note: Ticket escalated due to login failure; requires urgent review.
Summary
MCP is a clean, protocol-level answer to the messy problem of connecting AI models to real-world data and actions. The key ideas to take away:
- Servers expose three primitives: Resources (read-only data), Tools (callable actions), and Prompts (reusable message templates).
- Clients expose three capabilities back: Roots (workspace scope), Sampling (let the server request LLM completions), and Elicitation (let the server ask the user a question).
- All communication is JSON-RPC 2.0 over stdio or HTTP + SSE: simple, well-understood, and easy to debug.
- FastMCP removes most of the protocol boilerplate on the server side, letting you focus on business logic. The client uses the lower-level mcp SDK directly.
- MCP Inspector gives you a browser UI to explore and test any server before writing a single line of client code.
Once you have an MCP server running for a data source, every MCP-compatible host (Claude Desktop, custom agents, IDE extensions) can consume it immediately without any additional integration work. That is the real power of a shared protocol.
