Build an MCP Server from Scratch: A Complete Developer Guide
A language model cannot query your Postgres instance, call an internal REST endpoint, or read yesterday's runbook from disk — not because the model lacks reasoning, but because it lacks a standardized tool channel. Model Context Protocol (MCP), open-sourced by Anthropic in November 2024, is that channel: Claude, Cursor, Gemini, and custom agents discover and invoke your capabilities through one JSON-RPC surface. This guide is a code-first runbook. By the end you will have built, debugged, and deployed a production-capable MCP Server covering Tools, Resources, and Prompts, with HTTP remote transport and a personal knowledge-base project you can ship today.
1. Three pain points: why AI needs MCP
Before writing code, align on the problems MCP actually solves:
- Tool silos. OpenAI function calling, ChatGPT plugins, and LangChain tool wrappers each define different JSON shapes. Switch model vendors and you rewrite the integration layer — the classic N×M trap.
- Unreachable data. Training cutoffs mean models cannot see your live config, internal docs, or database state unless you expose it through a standard read path.
- No executable actions. Pure chat cannot send HTTP requests, write files, or run SQL. Without a tool layer, agents remain conversational demos.
If you have already read our MCP standard decision guide, this article skips the protocol advocacy and goes straight to implementation. Audience: backend or AI engineers comfortable with Python or TypeScript.
2. What MCP is: protocol background and architecture
2.1 Background: from function calling to an open standard
Tool use evolved in three waves: Function Calling (vendor-private JSON) → Plugins (closed ChatGPT ecosystem) → MCP (November 2024 open protocol). MCP's core promise: standardize communication between AI hosts and external capabilities so you implement once and reuse across Claude Desktop, Cursor, VS Code, Gemini CLI, and OpenClaw.
By mid-2026 the ecosystem reports 10,000+ community servers, adoption by OpenAI (January 2026) and Google Gemini (February 2026), and governance by AAIF under the Linux Foundation. MCP is no longer experimental; it is the default integration surface for agent tooling.
2.2 Architecture: Client ↔ Server ↔ three primitives
┌────────────────────┐         ┌─────────────────────┐
│ MCP Client         │ ◄─────► │ MCP Server          │
│ (Claude / Cursor)  │  JSON   │ (your code)         │
│                    │  -RPC   │                     │
└────────────────────┘         └──────────┬──────────┘
                                          │
                            ┌─────────────┼─────────────┐
                            ▼             ▼             ▼
                          Tools       Resources      Prompts
                        (actions)    (read-only)   (templates)
- Client: the connector inside Claude Desktop, Cursor, or a custom agent runtime.
- Server: your code exposing Tools, Resources, and Prompts.
- Tools: callable functions — search, calculate, query databases, write files.
- Resources: URI-addressed read-only data — configs, profiles, file contents.
- Prompts: parameterized prompt templates the client can inject into conversations.
2.3 JSON-RPC 2.0 and the session lifecycle
MCP runs on JSON-RPC 2.0. Two transports dominate in 2026:
- stdio: the host spawns your server as a subprocess and communicates over stdin/stdout. Latency is typically under 1 ms on the same machine.
- HTTP + SSE / streamable-http: remote service supporting multiple concurrent clients over the network.
Session flow: initialize handshake with capability negotiation → discovery (tools/list, resources/list, prompts/list) → request/response (tools/call) → shutdown. Every method is a named JSON-RPC call with typed parameters, so there are no ad-hoc REST paths per tool.
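To make the session flow concrete, here is a minimal sketch of the JSON-RPC 2.0 envelopes a client might send, using only the standard library. The helper make_request, the client name, and the protocol version string are illustrative assumptions rather than SDK code; check the specification for the exact fields your target version expects.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # JSON-RPC request ids must be unique within a session


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# 1. Handshake: the client announces its protocol version and capabilities.
initialize = make_request("initialize", {
    "protocolVersion": "2025-06-18",  # assumed example value
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1.0"},
})

# 2. Discovery: ask the server which tools it exposes.
list_tools = make_request("tools/list")

# 3. Invocation: call one tool by name with JSON arguments.
call_tool = make_request("tools/call", {
    "name": "say_hello",
    "arguments": {"name": "developer"},
})

# Serialized form that would travel over stdio or HTTP.
wire_format = json.dumps(call_tool)
```

In practice the SDK client builds these messages for you; seeing them once makes Inspector's request/response panes easier to read.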
2.4 MCP vs other integration approaches
| Dimension | MCP | OpenAI Function Calling | LangChain Tools |
|---|---|---|---|
| Standardization | Open protocol spec | Vendor-private format | Framework-bound |
| Transport | stdio / HTTP | HTTP only | HTTP only |
| Cross-model | Yes — Claude, GPT, Gemini, etc. | No | Partial |
| Resources / Prompts | First-class primitives | Not supported | Not supported |
| Ecosystem scale (2026) | 10,000+ servers, AAIF governance | Mature but closed | Mature but Python-centric |
3. Development environment and project layout
3.1 Pick a language
- Python (primary in this guide): official mcp package with FastMCP decorators, the fastest path for backend engineers.
- TypeScript (reference): @modelcontextprotocol/sdk for Node and full-stack teams.
3.2 Environment setup
# Python
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install mcp httpx pydantic
# TypeScript (reference)
npm init -y
npm install @modelcontextprotocol/sdk
3.3 Recommended project structure
my-mcp-server/
├── server.py # entry point
├── tools/
│ ├── __init__.py
│ ├── calculator.py
│ └── web_search.py
├── resources/
│ └── file_reader.py
├── prompts/
│ └── templates.py
├── tests/
│ └── test_tools.py
├── pyproject.toml
└── README.md
3.4 Debugging toolchain
- MCP Inspector: official UI for testing Tools, Resources, and Prompts interactively.
- Claude Desktop: edit claude_desktop_config.json for end-to-end agent testing.
- Cursor: configure via Settings → MCP or project .cursor/mcp.json.
4. Your first MCP Server: Hello World
4.1 Minimal server code
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("my-first-server")
@mcp.tool()
def say_hello(name: str) -> str:
    """Greet a person by name."""
    return f"Hello, {name}! This is your first MCP tool."
if __name__ == "__main__":
    mcp.run()
4.2 Run and verify
python server.py
# Or debug with MCP Inspector
npx @modelcontextprotocol/inspector python server.py
In Inspector, confirm tools/list returns say_hello, then call it with {"name": "developer"} via tools/call.
4.3 Wire into Cursor or Claude Desktop
Cursor (.cursor/mcp.json):
{
  "mcpServers": {
    "my-first-server": {
      "command": "python",
      "args": ["/absolute/path/to/server.py"]
    }
  }
}
Claude Desktop (macOS ~/Library/Application Support/Claude/claude_desktop_config.json): same structure. Restart the client; say_hello should appear in the agent tool list.
5. Tools: five practical tools, async, and error handling
5.1 Tool design basics
Function signatures become JSON Schema for the model. Use clear docstrings, snake_case names, and verb-first identifiers (search_web, read_file).
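The idea that a signature becomes a schema can be illustrated with a small standard-library sketch. This is not how the SDK implements it; describe_tool and its four-type mapping are simplifying assumptions meant only to show why type hints, defaults, and docstrings matter to the model.

```python
import inspect
from typing import get_type_hints

# Simplified mapping from Python annotations to JSON Schema types.
# Assumption: only these four primitives are handled in this sketch.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Derive a tool description from a function's signature and docstring."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def search_web(query: str, max_results: int = 5) -> str:
    """Search the web and return a summary of the top results."""
    return f"{max_results} results for {query}"


schema = describe_tool(search_web)
```

A vague docstring or a missing type hint degrades the description the model sees, which is why the naming advice above is more than style.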
5.2 Typed inputs with Pydantic
from pydantic import BaseModel, Field
class SearchInput(BaseModel):
    query: str = Field(description="Search keywords")
    max_results: int = Field(default=5, description="Maximum results to return")
    language: str = Field(default="en", description="Result language code")
@mcp.tool()
def web_search(input: SearchInput) -> list[dict]:
    """Run a web search and return a list of result objects."""
    ...
5.3 Five tools every production server needs
- Calculator: safe math via a restricted expression parser built on the ast module. Note that ast.literal_eval only parses literals and rejects arithmetic such as 2 + 2, so it is not enough on its own. Never run raw eval() on user input.
- File read/write: restrict to a whitelist root directory to prevent path traversal.
- HTTP fetch: call external APIs with httpx, return JSON or plain text.
- Database query: read-only SQL with parameterized queries; block DDL and data-modifying statements.
- Datetime utility: datetime.now(timezone.utc) plus timezone conversion for audit-friendly timestamps.
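As an illustration of the calculator item, the sketch below walks the parsed AST and allows only numeric literals and four arithmetic operators. The function name safe_calculate and the choice of operators are assumptions for this example; exponentiation is left out on purpose because very large powers can exhaust CPU and memory.

```python
import ast
import operator

# Whitelisted operators; anything else (names, calls, attributes) is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # type() check (not isinstance) so booleans are rejected too.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval"))
```

Registering it as a tool would then be a matter of wrapping the call and returning the result as a string.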
5.4 Async tools for I/O-bound work
import httpx
@mcp.tool()
async def fetch_url(url: str) -> str:
    """Fetch content from a URL."""
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.get(url)
        response.raise_for_status()
        return response.text
5.5 Error handling that agents can reason about
- Return structured JSON for business errors: {"error": "not found", "code": 404} instead of uncaught stack traces.
- Cap external call timeouts at 30 seconds, since clients often abort sooner.
- Apply least-privilege checks on filesystem and database operations.
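One way to apply the structured-error guideline is a decorator that converts exceptions into a JSON payload. This is a sketch under assumptions: ToolError, structured_errors, and the NOTES store are hypothetical names introduced for illustration, and a real server may prefer the SDK's own error signaling.

```python
import functools
import json


class ToolError(Exception):
    """Business error with a machine-readable code (hypothetical class)."""

    def __init__(self, message: str, code: int):
        super().__init__(message)
        self.code = code


def structured_errors(func):
    """Convert exceptions into a JSON string the agent can inspect."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ToolError as exc:
            return json.dumps({"error": str(exc), "code": exc.code})
        except Exception:
            # Hide internals: never leak a stack trace to the model.
            return json.dumps({"error": "internal error", "code": 500})

    return wrapper


NOTES = {"intro": "MCP basics"}  # stand-in data store for the example


@structured_errors
def read_note(note_id: str) -> str:
    """Return a note's text, or a structured 404 payload."""
    if note_id not in NOTES:
        raise ToolError("not found", 404)
    return NOTES[note_id]
```

Because the payload has a stable shape, the agent can branch on the code field instead of parsing free-form error text.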
6. Resources: exposing read-only data to the model
6.1 Resource vs Tool
Resources supply data; Tools execute actions. The model reads a Resource by URI without triggering side effects.
6.2 Static and dynamic resources
import json
@mcp.resource("config://app-settings")
def get_app_settings() -> str:
    """Return application configuration."""
    return json.dumps({"version": "1.0", "env": "production"})
@mcp.resource("user://{user_id}/profile")
def get_user_profile(user_id: str) -> str:
    """Return profile data for a user ID."""
    # db is a placeholder for your own data-access layer; it is not defined here.
    user = db.query_user(user_id)
    return json.dumps(user)
URI schemes include file://, http://, and custom prefixes like config://.
6.3 MIME types and streaming
- Text: text/plain, application/json
- Binary: images, PDFs (Base64 or blob URIs)
- Streaming: real-time feeds via SSE subscriptions where the host supports it
6.4 Filesystem resource server pattern
Typical flow: list directory → read file contents → optionally watch for changes with watchfiles and notify the client. Always configure an allowed root directory whitelist so the model cannot read /etc/passwd or SSH keys.
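The allowed-root check can be sketched with pathlib. The function name resolve_inside_root is an assumption for this example, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside_root(root: Path, requested: str) -> Path:
    """Return the resolved path, or raise if it escapes the allowed root."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below runs against the real target rather than the requested string.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"path outside allowed root: {requested}")
    return candidate
```

Resolving before comparing is the important design choice: a plain string prefix check can be bypassed with ".." segments or a symlink that points outside the root.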
7. Prompts: reusable templates with dynamic parameters
7.1 What MCP Prompts are
Prompts are pre-built message templates the client can fetch and inject. They support dynamic parameters, improving team consistency and reducing copy-paste drift across agents.
7.2 Creating a prompt template
from mcp.types import PromptMessage, TextContent
@mcp.prompt()
def code_review_prompt(language: str, code: str) -> list[PromptMessage]:
    """Code review prompt template."""
    return [
        PromptMessage(
            role="user",
            content=TextContent(
                type="text",
                text=f"""Review the following {language} code for:
1. Code quality and readability
2. Potential bugs and security issues
3. Performance improvements
```{language}
{code}
```"""
            )
        )
    ]
7.3 Multi-turn prompt templates
Define sequences with both user and assistant roles for interview simulations, debugging assistants, or structured onboarding flows. Each PromptMessage specifies role and content; the client injects them in order.
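A multi-turn template is just an ordered list of messages with alternating roles. The sketch below uses plain dictionaries that mirror the role and content fields described above, so it runs without the SDK; in a real server you would return PromptMessage objects instead. The function name and message wording are assumptions for illustration.

```python
def interview_prompt(role_title: str, topic: str) -> list[dict]:
    """Build an ordered user/assistant/user message sequence."""

    def message(role: str, text: str) -> dict:
        # Mirrors the role + text content shape of a prompt message.
        return {"role": role, "content": {"type": "text", "text": text}}

    return [
        message("user", f"Interview me for a {role_title} position."),
        message("assistant", f"Sure. First question: how would you explain {topic}?"),
        message("user", "Give feedback on my answer before moving on."),
    ]


messages = interview_prompt("backend engineer", "JSON-RPC")
```

Seeding an assistant turn this way fixes the tone and format of the conversation before the real user input arrives.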
8. HTTP transport and remote deployment
8.1 stdio vs HTTP+SSE decision matrix
| Property | stdio | HTTP + SSE |
|---|---|---|
| Deployment | Local subprocess | Remote server |
| Latency | Very low (<1 ms) | Network-dependent (10–200 ms) |
| Multi-client | Single host process | Concurrent connections supported |
| Best for | Local dev, personal tools | SaaS, team sharing, 24/7 services |
8.2 Enable streamable-http transport
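from mcp.server.fastmcp import FastMCP
# In the official Python SDK, host and port are server settings passed to the
# constructor, and the transport is selected when calling run().
mcp = FastMCP("remote-server", host="0.0.0.0", port=8000)
if __name__ == "__main__":
    mcp.run(transport="streamable-http")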
8.3 Authentication and hardening
- Bearer Token: clients send Authorization: Bearer <token>.
- API key middleware: validate keys plus optional IP allowlists.
- CORS: restrict to trusted origins only.
- Rate limiting: cap tools/call frequency (e.g., 10 calls per second per client).
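Two of the checks above, bearer-token validation and per-client rate limiting, can be sketched framework-free. The names is_authorized and TokenBucket are assumptions for this example; wiring them into your HTTP middleware depends on the framework you deploy with.

```python
import hmac
import time


def is_authorized(header: str, expected_token: str) -> bool:
    """Check the value of an 'Authorization: Bearer <token>' header."""
    scheme, _, token = header.partition(" ")
    # compare_digest avoids leaking token contents through timing differences.
    return scheme == "Bearer" and hmac.compare_digest(
        token.encode(), expected_token.encode()
    )


class TokenBucket:
    """Allow `rate` calls per second, with bursts of up to `rate` calls."""

    def __init__(self, rate: float, clock=time.monotonic):
        self.rate = rate
        self.clock = clock  # injectable so tests can control time
        self.tokens = rate
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket size.
        self.tokens = min(self.rate, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Keep one bucket per client identifier (for example, per API key) so a single noisy agent cannot starve the others.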
9. Debugging, testing, and common failures
9.1 MCP Inspector workflow
- Launch: npx @modelcontextprotocol/inspector python server.py
- Inspect Tools, Resources, and Prompts lists in the UI.
- Manually trigger tools/call and review request/response JSON.
- Simulate timeout and invalid-parameter cases to validate error paths.
9.2 Unit test example
import pytest  # the asyncio marker below requires the pytest-asyncio plugin
from mcp.client.session import ClientSession
from mcp.client.stdio import StdioServerParameters, stdio_client
@pytest.mark.asyncio
async def test_calculator_tool():
    # Assumes server.py registers a tool named "calculate".
    server_params = StdioServerParameters(
        command="python",
        args=["server.py"]
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("calculate", {"expression": "2 + 2"})
            assert result.content[0].text == "4"
9.3 Common failure matrix
| Symptom | Likely cause | Fix |
|---|---|---|
| Tool missing in AI client | Wrong config path, client not restarted | Verify absolute path in mcp.json; restart Cursor or Claude |
| JSON serialization error | Return type not serializable (datetime, Decimal) | Convert to str or dict before returning |
| Session timeout | Tool exceeds client default timeout | Use async with explicit timeout; split long jobs |
| Permission denied | File path outside whitelist root | Configure allowed directory and validate paths |
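The serialization failure in the matrix above is usually fixed with a fallback passed to json.dumps. A minimal sketch, assuming a hypothetical to_jsonable helper and a sample row:

```python
import json
from datetime import datetime, timezone
from decimal import Decimal


def to_jsonable(value):
    """Fallback for json.dumps: convert types the encoder rejects."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)  # keep exact digits; float() could round
    raise TypeError(f"not JSON serializable: {type(value).__name__}")


# Sample database row containing two non-serializable types.
row = {
    "created_at": datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc),
    "price": Decimal("19.99"),
}
payload = json.dumps(row, default=to_jsonable)
```

Raising TypeError for unknown types keeps unexpected objects from being silently stringified.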
10. Production deployment: Docker and observability
10.1 Docker containerization
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "server.py"]
10.2 Hosting options (2026 reference costs)
- Railway / Render: one-click deploy for side projects — typically $5–20/month.
- AWS Lambda / Cloud Run: serverless, pay-per-invocation; cold starts can add 200–800 ms latency.
- Self-hosted VPS or remote Mac: Nginx reverse proxy plus launchd supervision for Apple-ecosystem agent stacks needing 24/7 uptime.
10.3 Observability checklist
- Structured logs: record tool name, duration, and status on every tools/call.
- Prometheus metrics: call count, P99 latency, error rate.
- Sentry: alert on uncaught exceptions.
- Health endpoint: GET /health returns 200 plus MCP protocol version.
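The structured-log item can be sketched as a decorator that emits one JSON line per call. The logger name mcp.tools and the field names are assumptions for this example; adapt them to whatever your log pipeline expects.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("mcp.tools")  # logger name is an assumption


def logged_tool(func):
    """Emit one JSON log line per call with tool name, duration, and status."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise  # re-raise so callers still see the failure
        finally:
            logger.info(json.dumps({
                "tool": func.__name__,
                "duration_ms": round((time.perf_counter() - start) * 1000, 2),
                "status": status,
            }))

    return wrapper


@logged_tool
def add(a: int, b: int) -> int:
    """Example tool used to demonstrate the decorator."""
    return a + b
```

The same three fields are enough to derive call count, latency percentiles, and error rate downstream, which keeps the metrics and logs consistent.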
10.4 Versioning and backward compatibility
Declare MCP protocol version at initialize. Add optional parameters rather than removing required fields. For breaking changes, run v1 and v2 servers in parallel and let clients negotiate via capability flags.
11. Project walkthrough: personal knowledge-base MCP Server
11.1 Requirements
- Let the agent search local Markdown notes semantically.
- Support vector retrieval, not just keyword grep.
- Allow creating and updating notes within a sandbox directory.
11.2 Stack choices
- Vector store: ChromaDB or Qdrant (embedded, zero external ops).
- Embeddings: text-embedding-3-small (OpenAI) or local nomic-embed-text.
- File watcher: watchfiles to rebuild indexes on save.
11.3 Core tool and resource design
- index_notes: scan the notes directory, chunk, embed, and write to the vector store.
- search_notes: semantic search returning top-K snippets plus source paths.
- write_note: create or append Markdown within the whitelist directory.
- notes://{path} Resource: return full note text for a given path.
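The index-then-search design can be prototyped before choosing a vector store. The sketch below is a toy under stated assumptions: embed uses word counts as a stand-in for a real embedding model, the index is an in-memory list rather than ChromaDB or Qdrant, and the chunk size is arbitrary.

```python
import math
import re
from collections import Counter


def chunk_markdown(text: str, max_chars: int = 200) -> list[str]:
    """Split a note on blank lines, then pack paragraphs into chunks."""
    chunks, current = [], ""
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        if current and len(current) + len(paragraph) > max_chars:
            chunks.append(current)
            current = ""
        current = f"{current}\n\n{paragraph}".strip()
    if current:
        chunks.append(current)
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_notes(index: list[tuple[str, str]], query: str, top_k: int = 3):
    """index holds (source_path, chunk) pairs; return the top_k best matches."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), path, chunk) for path, chunk in index]
    return sorted(scored, reverse=True)[:top_k]
```

Swapping embed for a real model and the list for a vector store changes the quality of the ranking but not the shape of the tool: the agent still receives scored snippets plus source paths.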
11.4 Expected outcome
In Cursor, ask: "What did I write about MCP last week?" The agent calls search_notes, retrieves relevant chunks, and cites source files. Compared to stuffing the entire vault into context, this approach typically saves 90%+ tokens per query while staying current as files change.
12. MCP ecosystem and what comes next
12.1 Reference servers worth studying
- mcp-server-filesystem: directory listing and file I/O.
- mcp-server-github: repos, issues, pull requests.
- mcp-server-brave-search: web search.
- mcp-server-postgres: read-only SQL queries.
- mcp-server-slack: message read and send.
Official spec: modelcontextprotocol.io. Python SDK: github.com/modelcontextprotocol/python-sdk. TypeScript SDK: github.com/modelcontextprotocol/typescript-sdk.
12.2 2026 ecosystem trends
- Major AI clients — Cursor, Claude Desktop, VS Code Copilot, Gemini CLI — ship native MCP support.
- MCP marketplaces and AAIF certification push quality and security baselines.
- Enterprise requirements: OAuth 2.1, granular tool permissions, audit logs.
12.3 Learning path after your first server
- Read the full MCP specification.
- Publish your server to GitHub and register it in community directories.
- Combine MCP with agent skills — see our Cursor Agent Skills guide.
- Contribute Tools or Resources back to open-source reference servers.
13. Summary: from laptop experiments to 24/7 hosting
This guide walked the full MCP Server lifecycle: protocol architecture, environment setup, Hello World, five production Tools with async and structured errors, Resources and Prompts, HTTP remote transport, Inspector debugging, Docker deployment, and a personal knowledge-base project. MCP is the standard wire format for AI tooling in 2026 — building servers is now a core skill for agent engineers.
Local stdio mode has hard limits. Laptop sleep kills subprocesses. Vector indexes and embedding models consume RAM that competes with your IDE. Multiple concurrent servers on one machine trigger CPU throttling during long agent sessions. Knowledge-base MCP servers, HTTP endpoints, and ChromaDB plus embedding API workloads run more reliably on an always-on Apple Silicon node — unified memory helps vector retrieval, and macOS launchd keeps daemons alive without cron hacks.
SFTPMAC remote Mac rental provides a 24/7 Apple Silicon environment tuned for MCP Server and agent workflows: stable HTTP+SSE callbacks, native Python and Node runtimes, and SFTP/rsync for syncing note directories and config — a better production home than a household laptop acting as both dev machine and tool host. Pick one tool to build this week; Hello World takes under ten minutes.