Table of contents
- MCP Servers and Clients: The Protocol That Connects AI to the Real World
- The Problem MCP Solves
- Concepts
- Architecture Overview
- Server & Client Components
- Server Side
- Client Side
- Transport Layer
- JSON-RPC: The Message Format
- Real-World Example: Support Agent
- Project Structure
- Building the Server
- Building the Client
- MCP Inspector
- Lifecycle of a Request
- Summary
MCP Servers and Clients: The Protocol That Connects AI to the Real World
AI models are brilliant at reasoning — but they live in a box. They can't browse your filesystem, query your database, or call your internal APIs out of the box. Every team that wants to connect an LLM to real data ends up writing its own glue code: custom plugins, one-off integrations, and fragile wrappers that break the moment an API changes.
Model Context Protocol (MCP) is the answer to that chaos. It is an open protocol — originally designed by Anthropic and now community-driven — that defines a standard, transport-agnostic way for AI clients (like a coding assistant or an agent framework) to talk to servers that expose data and actions. Think of it as USB-C for AI context: one standard plug that works everywhere.
This article walks through how MCP is structured on both the server and the client side, explains every primitive with a concrete example, and ends with a fully runnable Python implementation of a support-ticket agent.
The Problem MCP Solves
Imagine you are building an AI-powered customer support agent. The agent needs to:
- Read an open ticket from your ticketing system
- Look up the customer's account details from a database
- Decide whether to escalate or close the ticket
- Write back a resolution note
Without a shared protocol every one of those data sources requires a bespoke integration. With MCP, each data source publishes a small server that speaks a well-known language. The agent (the MCP client) connects to those servers and immediately knows how to discover and call everything they offer — no custom glue needed.
Concepts
Before diving into code, here is a plain-English glossary of every term used in this article.
| Concept | Lives On | Purpose |
|---|---|---|
| Resource | Server | Expose read-only data (files, DB records, API snapshots) |
| Tool | Server | Expose callable actions with side effects |
| Prompt | Server | Expose reusable prompt templates |
| Root | Client | Tell the server which workspace paths the client owns |
| Sampling | Client | Let the server request an LLM completion through the client |
| Elicitation | Client | Let the server ask the user a question through the client UI |
MCP Host — the application that embeds an MCP client. Examples: Claude Desktop, VS Code extensions, custom agent frameworks.
Transport — the communication channel that carries messages between client and server. MCP supports stdio (process pipes) and HTTP + SSE (Server-Sent Events) out of the box.
JSON-RPC 2.0 — the message envelope format MCP uses. Every request and response is a tiny JSON object with a method name, parameters, and an ID for pairing replies to requests.
FastMCP — a Python library that dramatically reduces the boilerplate needed to build an MCP server. It auto-generates the protocol scaffolding so you can focus on your business logic.
MCP Inspector — a developer tool (@modelcontextprotocol/inspector) that lets you connect to any MCP server and explore its resources, tools, and prompts interactively without writing a client.
Architecture Overview
graph TD
Host["🖥️ MCP Host\n(e.g. Claude Desktop, Agent Framework)"]
Client["📡 MCP Client\n(embedded in Host)"]
ServerA["🗄️ MCP Server A\nTicketing System"]
ServerB["🗄️ MCP Server B\nCustomer Database"]
Host --> Client
Client -- "stdio / HTTP+SSE (JSON-RPC)" --> ServerA
Client -- "stdio / HTTP+SSE (JSON-RPC)" --> ServerB
ServerA --> R1["📄 Resources\n(ticket data)"]
ServerA --> T1["🔧 Tools\n(close_ticket, escalate)"]
ServerA --> P1["💬 Prompts\n(resolution_template)"]
ServerB --> R2["📄 Resources\n(account records)"]
ServerB --> T2["🔧 Tools\n(update_account)"]
The host embeds one MCP client instance per server it wants to talk to. Each client opens a transport connection, performs a capability handshake, and then exchanges JSON-RPC messages for the lifetime of the session.
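The handshake itself is a single initialize request (answered by the server, then acknowledged with an initialized notification). A sketch of the opening message — the protocol version string and client name here are illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": { "roots": {}, "sampling": {} },
    "clientInfo": { "name": "support-agent", "version": "0.1.0" }
  }
}
```

The server replies with its own capabilities, which is how each side learns what the other supports before any resources or tools are exchanged.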
Server & Client Components
The diagram below shows every primitive that each side of the protocol owns, and the bidirectional arrows that exist between them. Servers push data and actions outward; clients reflect workspace context, LLM access, and user interaction back inward.
graph LR
subgraph SERVER["🗄️ MCP Server"]
direction TB
RES["📄 Resources\nRead-only data exposed via URI\ne.g. ticket://42"]
TOOLS["🔧 Tools\nCallable actions with side effects\ne.g. close_ticket()"]
PROMPTS["💬 Prompts\nReusable message templates\ne.g. resolution_prompt"]
end
subgraph CLIENT["📡 MCP Client"]
direction TB
ROOTS["🌱 Roots\nWorkspace paths the client owns\ne.g. file:///home/user/project"]
SAMPLING["🧠 Sampling\nLet server request LLM completions\nvia the client's model access"]
ELICITATION["❓ Elicitation\nLet server ask the user a question\nthrough the client UI"]
end
CLIENT -- "resources/list\nresources/read" --> RES
CLIENT -- "tools/list\ntools/call" --> TOOLS
CLIENT -- "prompts/list\nprompts/get" --> PROMPTS
SERVER -- "roots/list" --> ROOTS
SERVER -- "sampling/createMessage" --> SAMPLING
SERVER -- "elicitation/create" --> ELICITATION
Notice the directionality: most traffic flows from client → server (the client asks for data and calls tools), but three capabilities flow in reverse — the server can query the client's roots, request a model completion, or ask the user a question directly.
Server Side
An MCP server exposes three kinds of primitives: Resources, Tools, and Prompts.
Resources
A resource is a read-only piece of data identified by a URI. Think of it like a REST GET endpoint — no side effects, just data. Resources are ideal for feeding context into the model: open files, database rows, API snapshots.
# Pseudocode — server declares a resource
@server.resource("ticket://{ticket_id}")
def get_ticket(ticket_id: str) -> str:
ticket = db.query("SELECT * FROM tickets WHERE id = ?", ticket_id)
return format_as_markdown(ticket)
The client can discover concrete resources with resources/list and parameterised resource templates with resources/templates/list, then read a specific URI with resources/read.
sequenceDiagram
participant C as MCP Client
participant S as MCP Server
C->>S: resources/templates/list
S-->>C: [{uriTemplate: "ticket://{ticket_id}", name: "Support Ticket"}]
C->>S: resources/read {uri: "ticket://42"}
S-->>C: {contents: [{text: "# Ticket 42\nUser cannot log in..."}]}
Tools
A tool is a callable action — it can have side effects (write to a database, send an email, call an external API). The model asks the client to invoke a tool; the client calls the server; the server runs the logic and returns a result.
# Pseudocode — server declares a tool
@server.tool()
def close_ticket(ticket_id: str, resolution: str) -> dict:
db.execute(
"UPDATE tickets SET status='closed', resolution=? WHERE id=?",
resolution, ticket_id
)
return {"success": True, "closed_ticket_id": ticket_id}
Tools are described with a JSON Schema so the model always knows exactly what parameters to pass.
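FastMCP derives that schema from the function's type hints and docstring. As a sketch, the descriptor a server might advertise for close_ticket could look like the dict below — the field names follow the MCP tool-listing shape, but the description text is illustrative:

```python
# Sketch of the tool descriptor a server might advertise for close_ticket.
# Built as a plain dict so the shape is explicit; a real server generates
# this automatically from the function signature.
close_ticket_descriptor = {
    "name": "close_ticket",
    "description": "Closes a ticket with a resolution note.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "ticket_id": {"type": "string"},
            "resolution": {"type": "string"},
        },
        # The model reads this to know which arguments are mandatory.
        "required": ["ticket_id", "resolution"],
    },
}
```

Because the schema travels with the tool listing, the model never has to guess parameter names or types.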
Prompts
A prompt is a reusable message template stored on the server. Instead of hard-coding instructions in every client, you centralise them on the server so they can be versioned and updated in one place.
# Pseudocode — server declares a prompt
@server.prompt()
def resolution_template(ticket_id: str, tone: str = "professional") -> list:
return [
{
"role": "user",
"content": f"Write a {tone} resolution note for ticket {ticket_id}."
}
]
The client fetches prompts with prompts/list and prompts/get, then injects the returned messages into the model's conversation.
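On the wire, a prompts/get exchange for the template above might look like this (the ID and the rendered text are illustrative):

Request:

```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "prompts/get",
  "params": {
    "name": "resolution_template",
    "arguments": { "ticket_id": "42", "tone": "professional" }
  }
}
```

Response:

```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "result": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Write a professional resolution note for ticket 42."
        }
      }
    ]
  }
}
```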
Client Side
The client side of MCP receives less attention but is equally important. Clients expose three capabilities back to the server: Roots, Sampling, and Elicitation.
Roots
Roots tell the server which filesystem paths or URIs the client considers its workspace. This lets a file-system server know which directories it is allowed to read from, without the client having to repeat that information on every request.
# Pseudocode — client declares its roots during initialisation
client.roots = [
{"uri": "file:///home/user/project", "name": "Current Project"}
]
When the server calls roots/list, the client returns these entries. The server can then scope its resource URIs accordingly.
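The wire exchange is tiny — a sketch of the server's request and the client's reply (the ID is illustrative):

```json
{ "jsonrpc": "2.0", "id": 2, "method": "roots/list" }
```

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "roots": [
      { "uri": "file:///home/user/project", "name": "Current Project" }
    ]
  }
}
```

Note the reversed direction: here the server is the JSON-RPC caller and the client is the responder.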
Sampling
Sampling allows the server to request an LLM completion through the client. This sounds backwards at first — why would a server need the model? — but it enables powerful patterns like a server that uses the model to auto-summarise retrieved documents before returning them.
# Pseudocode — server requests sampling from the client
response = await session.create_message(
messages=[{"role": "user", "content": "Summarise this ticket in one sentence."}],
max_tokens=100
)
summary = response.content.text
The client controls which model is used and can apply its own safety filters before forwarding the completion result back to the server.
Elicitation
Elicitation lets the server ask the user a direct question through the client's UI when it needs clarification that the model cannot provide on its own. For example, a server that manages calendar events might need to ask "Which calendar should I add this to?" before proceeding.
# Pseudocode — server requests user input via elicitation
answer = await session.elicit(
message="Which priority should this ticket be assigned?",
schema={"type": "string", "enum": ["low", "medium", "high"]}
)
The client surfaces the question in its UI, collects the user's answer, and sends it back to the server as a structured value.
Transport Layer
MCP is transport-agnostic. The two standard options are:
stdio — the client spawns the server as a child process and communicates via standard input/output. This is the simplest option and works great for local tools.
HTTP + SSE — the server runs as an HTTP service. The client sends requests over HTTP POST and receives streaming responses via Server-Sent Events. This is the right choice for remote or shared servers.
graph LR
subgraph Local["Local (stdio)"]
C1[Client] -- "stdin/stdout\nJSON-RPC" --> S1[Server Process]
end
subgraph Remote["Remote (HTTP + SSE)"]
C2[Client] -- "HTTP POST\n/messages" --> S2[HTTP Server]
S2 -- "SSE stream\n/events" --> C2
end
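The example server later in this article uses stdio; switching it to the remote transport is essentially a one-line change. A pseudocode sketch, assuming a FastMCP release that accepts an SSE transport flag (the exact option names and host/port configuration vary between versions):

```python
# Pseudocode — same FastMCP server, remote transport
if __name__ == "__main__":
    # Serve over HTTP + SSE instead of being spawned as a child process;
    # remote clients connect to the configured host and port.
    mcp.run(transport="sse")
```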
JSON-RPC: The Message Format
Every MCP message is a JSON-RPC 2.0 envelope. JSON-RPC is not specific to MCP — it is a tiny, well-understood standard that defines three message shapes:
Request — the caller sends a method name, parameters, and a unique ID.
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "close_ticket",
"arguments": { "ticket_id": "42", "resolution": "Password reset completed." }
}
}
Response — the receiver sends back the result (or an error) with the same ID.
{
"jsonrpc": "2.0",
"id": 1,
"result": { "success": true, "closed_ticket_id": "42" }
}
Notification — a one-way message with no ID and no expected reply (used for events like notifications/resources/updated).
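For completeness, a notification announcing a resource change might look like this — note the absent id field, which is how the receiver knows no reply is expected:

```json
{
  "jsonrpc": "2.0",
  "method": "notifications/resources/updated",
  "params": { "uri": "ticket://42" }
}
```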
That's the entire format. MCP builds its whole vocabulary of methods (resources/list, tools/call, prompts/get, etc.) on top of these three shapes.
Real-World Example: Support Agent
The scenario: a support agent that reads open tickets, fetches the customer's account tier, and closes tickets with an AI-generated resolution note.
sequenceDiagram
participant Agent as Agent (LLM)
participant Client as MCP Client
participant Server as MCP Server (Ticketing)
Agent->>Client: Read ticket 42
Client->>Server: resources/read {uri: "ticket://42"}
Server-->>Client: Ticket content (markdown)
Client-->>Agent: Ticket context injected
Agent->>Client: Call close_ticket(42, "Password reset completed.")
Client->>Server: tools/call {name: "close_ticket", arguments: {...}}
Server-->>Client: {success: true}
Client-->>Agent: Tool result returned
Project Structure
mcp-support-agent/
├── server/
│ ├── server.py # MCP server — exposes ticket resources and tools
│ └── fake_db.py # In-memory fake ticket database
├── client/
│ └── client.py # MCP client — connects to server, runs agent loop
├── requirements.txt # Python dependencies
└── README.md
Building the Server
Install Dependencies
pip install fastmcp
fastmcp wraps the lower-level mcp SDK with a decorator-driven API that eliminates most of the protocol boilerplate.
Server Code
server/fake_db.py — a simple in-memory store so the example is self-contained:
# server/fake_db.py
TICKETS: dict[str, dict] = {
"42": {
"id": "42",
"subject": "Cannot log in after password change",
"customer": "alice@example.com",
"account_tier": "pro",
"status": "open",
"resolution": None,
},
"43": {
"id": "43",
"subject": "Invoice PDF not loading",
"customer": "bob@example.com",
"account_tier": "free",
"status": "open",
"resolution": None,
},
}
def get_ticket(ticket_id: str) -> dict | None:
return TICKETS.get(ticket_id)
def list_open_tickets() -> list[dict]:
return [t for t in TICKETS.values() if t["status"] == "open"]
def close_ticket(ticket_id: str, resolution: str) -> bool:
ticket = TICKETS.get(ticket_id)
if not ticket:
return False
ticket["status"] = "closed"
ticket["resolution"] = resolution
return True
server/server.py — the MCP server built with FastMCP:
# server/server.py
from fastmcp import FastMCP
from fake_db import get_ticket, list_open_tickets, close_ticket
# FastMCP creates the server and handles all protocol scaffolding.
mcp = FastMCP(name="support-ticketing-server")
# ── Resources ────────────────────────────────────────────────────────────────
@mcp.resource("ticket://list")
def resource_list_open_tickets() -> str:
"""Returns a markdown summary of all open tickets."""
tickets = list_open_tickets()
if not tickets:
return "No open tickets."
lines = ["# Open Tickets\n"]
for t in tickets:
lines.append(
f"- **#{t['id']}** [{t['account_tier'].upper()}] {t['subject']} "
f"({t['customer']})"
)
return "\n".join(lines)
@mcp.resource("ticket://{ticket_id}")
def resource_get_ticket(ticket_id: str) -> str:
"""Returns full details for a single ticket as markdown."""
ticket = get_ticket(ticket_id)
if not ticket:
return f"Ticket {ticket_id} not found."
return (
f"# Ticket #{ticket['id']}\n"
f"**Subject:** {ticket['subject']}\n"
f"**Customer:** {ticket['customer']}\n"
f"**Account Tier:** {ticket['account_tier']}\n"
f"**Status:** {ticket['status']}\n"
)
# ── Tools ────────────────────────────────────────────────────────────────────
@mcp.tool()
def close_ticket_tool(ticket_id: str, resolution: str) -> dict:
"""
Closes a ticket with a resolution note.
Args:
ticket_id: The ID of the ticket to close.
resolution: A plain-text description of how the issue was resolved.
"""
success = close_ticket(ticket_id, resolution)
if not success:
return {"success": False, "error": f"Ticket {ticket_id} not found."}
return {"success": True, "closed_ticket_id": ticket_id}
@mcp.tool()
def escalate_ticket(ticket_id: str, reason: str) -> dict:
"""
Flags a ticket for human escalation.
Args:
ticket_id: The ticket to escalate.
reason: Why this ticket needs human attention.
"""
ticket = get_ticket(ticket_id)
if not ticket:
return {"success": False, "error": f"Ticket {ticket_id} not found."}
# In a real system this would notify a human queue.
return {
"success": True,
"message": f"Ticket {ticket_id} escalated. Reason: {reason}",
}
# ── Prompts ──────────────────────────────────────────────────────────────────
@mcp.prompt()
def resolution_prompt(ticket_id: str, account_tier: str) -> list[dict]:
"""
Returns a prompt template for generating a resolution note.
Args:
ticket_id: The ticket being resolved.
account_tier: 'free' or 'pro' — adjust tone accordingly.
"""
tone = "warm and detailed" if account_tier == "pro" else "concise"
return [
{
"role": "user",
"content": (
f"Write a {tone} resolution note for support ticket #{ticket_id}. "
"Keep it under 3 sentences. Start with 'Hi,' and end with "
"'— Support Team'."
),
}
]
if __name__ == "__main__":
# Run with stdio transport — the client will spawn this as a subprocess.
mcp.run(transport="stdio")
Building the Client
Client Code
# client/client.py
import asyncio
import json
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
SERVER_SCRIPT = "../server/server.py"
async def run_support_agent():
"""
A minimal agent loop that:
1. Reads open tickets from the MCP server.
2. Prints them so the user can pick one.
3. Fetches the full ticket details as a resource.
4. Fetches the resolution prompt template from the server.
5. Calls the close_ticket tool with a hard-coded resolution
(in production you would pass this through an LLM first).
"""
server_params = StdioServerParameters(
command="python",
args=[SERVER_SCRIPT],
)
async with stdio_client(server_params) as (read_stream, write_stream):
async with ClientSession(read_stream, write_stream) as session:
# ── Handshake ────────────────────────────────────────────────────
await session.initialize()
print("✅ Connected to MCP server\n")
# ── Discover capabilities ────────────────────────────────────────
tools_result = await session.list_tools()
print("🔧 Available tools:")
for tool in tools_result.tools:
print(f" • {tool.name}: {tool.description}")
resources_result = await session.list_resources()
print("\n📄 Available resources:")
for res in resources_result.resources:
print(f" • {res.uri}: {res.name}")
prompts_result = await session.list_prompts()
print("\n💬 Available prompts:")
for prompt in prompts_result.prompts:
print(f" • {prompt.name}: {prompt.description}")
print()
# ── Step 1: Read the open tickets list resource ──────────────────
list_resource = await session.read_resource("ticket://list")
print("📋 Open tickets:\n")
for content in list_resource.contents:
print(content.text)
# ── Step 2: Read a specific ticket resource ──────────────────────
ticket_id = "42"
ticket_resource = await session.read_resource(f"ticket://{ticket_id}")
print(f"\n📝 Full details for ticket #{ticket_id}:\n")
for content in ticket_resource.contents:
print(content.text)
# ── Step 3: Fetch the resolution prompt template ─────────────────
prompt_result = await session.get_prompt(
"resolution_prompt",
arguments={"ticket_id": ticket_id, "account_tier": "pro"},
)
print("\n💬 Prompt template from server:")
for msg in prompt_result.messages:
print(f" [{msg.role}] {msg.content.text}")
# ── Step 4: Call the close_ticket tool ───────────────────────────
# In a real agent the resolution text would come from an LLM call
# using the prompt template above. Here we use a fixed string.
resolution = (
"Hi, we have reset your authentication token and confirmed "
"your login is working again. Please reach out if you run into "
"any further issues. — Support Team"
)
print(f"\n🔧 Closing ticket #{ticket_id}...")
tool_result = await session.call_tool(
"close_ticket_tool",
arguments={"ticket_id": ticket_id, "resolution": resolution},
)
result_data = json.loads(tool_result.content[0].text)
if result_data.get("success"):
print(f"✅ Ticket #{ticket_id} closed successfully.")
else:
print(f"❌ Error: {result_data.get('error')}")
if __name__ == "__main__":
asyncio.run(run_support_agent())
Run the agent from the client/ directory:
python client.py
You should see the agent connect to the server, list capabilities, read ticket data, fetch the prompt template, and close the ticket — all over stdio JSON-RPC.
MCP Inspector
Before writing a client, you can explore any MCP server interactively using the MCP Inspector:
npx @modelcontextprotocol/inspector python server/server.py
This opens a browser UI where you can:
- Browse all resources and read their contents
- Browse all tools, fill in arguments, and invoke them
- Browse all prompts, supply arguments, and preview the rendered messages
The Inspector is invaluable during development — it lets you verify that your server exposes exactly the primitives you expect before you wire up a client or an LLM.
graph LR
Inspector["🔍 MCP Inspector\n(browser UI)"] -- "stdio" --> Server["🗄️ MCP Server\n(your server.py)"]
Inspector --> R["Browse Resources"]
Inspector --> T["Invoke Tools"]
Inspector --> P["Preview Prompts"]
Lifecycle of a Request
Putting everything together, here is the full lifecycle of a single agent action from the moment the LLM decides to close a ticket to the moment the result is returned:
sequenceDiagram
participant LLM as LLM (Agent Brain)
participant Host as Host Application
participant Client as MCP Client
participant Server as MCP Server
LLM->>Host: "Call close_ticket_tool with ticket_id=42"
Host->>Client: Translate decision to tool call
Client->>Server: JSON-RPC tools/call\n{name: "close_ticket_tool", arguments: {...}}
Server->>Server: Execute close_ticket() in fake_db
Server-->>Client: JSON-RPC response\n{success: true, closed_ticket_id: "42"}
Client-->>Host: Tool result
Host-->>LLM: "Tool returned: {success: true}"
LLM->>Host: "Ticket closed. Inform the user."
Summary
MCP is a clean, protocol-level answer to the messy problem of connecting AI models to real-world data and actions. The key ideas to take away:
- Servers expose three primitives: Resources (read-only data), Tools (callable actions), and Prompts (reusable message templates).
- Clients expose three capabilities back: Roots (workspace scope), Sampling (let the server request LLM completions), and Elicitation (let the server ask the user a question).
- All communication is JSON-RPC 2.0 over stdio or HTTP + SSE — simple, well-understood, and easy to debug.
- FastMCP removes most of the protocol boilerplate in Python, letting you focus on business logic.
- MCP Inspector gives you a browser UI to explore and test any server before writing a single line of client code.
Once you have an MCP server running for a data source, every MCP-compatible host — Claude Desktop, custom agents, IDE extensions — can consume it immediately without any additional integration work. That is the real power of a shared protocol.
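As a concrete illustration, registering the support server with Claude Desktop takes only a small config entry. A sketch — the server label and the absolute path below are placeholders, and the config file's location depends on your platform and Claude Desktop version:

```json
{
  "mcpServers": {
    "support-ticketing": {
      "command": "python",
      "args": ["/absolute/path/to/mcp-support-agent/server/server.py"]
    }
  }
}
```

With that entry in place, the host spawns the server over stdio and its resources, tools, and prompts appear automatically.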
