MCP (Model Context Protocol) is a standardised JSON-RPC layer that exposes tools, resources, and prompts to an LLM agent with a uniform discovery and authorisation model. A custom HTTP integration is anything else: REST, GraphQL, or hand-written tool handlers wired into the agent loop. Neither dominates. MCP wins when the agent needs to discover new tools at runtime, when multiple agents share a registry, when an audit trail of tool calls must be regulatory-grade, and when idempotency must survive restarts. Custom HTTP wins when the runtime cost of a JSON-RPC roundtrip matters, when the integration is single-purpose, when debugging requires trivial curl reproduction, and when the deployment cannot tolerate an additional process. What follows is six concrete scenarios with the decision rule for each, anchored to the official MCP spec[1] and Anthropic's reference implementation[2].

What MCP actually is

MCP is a JSON-RPC 2.0 protocol with three primitives:

  • Tools: functions the agent can call, each with a JSON schema, an idempotency hint, and an auth scope.
  • Resources: read-only contextual data the agent can pull (URIs, fetch handlers, content-type declarations).
  • Prompts: pre-baked prompt templates the server offers to the client.

A server implements one or more primitives; a client (the agent runtime) discovers them via a list_tools call (the tools/list JSON-RPC method on the wire) and invokes them via call_tool (tools/call). Discovery happens on session start and can be re-triggered. The transport is stdio (subprocess), HTTP, or SSE, with security tiers spelled out in the spec[1].

A custom HTTP integration is whatever the developer hand-rolls: a function that takes the model's tool-call input, calls a REST endpoint, and returns a string. There is no discovery, no schema contract, no idempotency guarantee. The agent loop wires the function in via the model vendor's tool-use API.
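
What that looks like in practice, reduced to a sketch (the endpoint, token handling, and tool registry are all hypothetical):

```python
import requests

# Hypothetical broker endpoint; substitute the real API.
BROKER_API = "https://api.broker.example/v1"

def get_positions(account_id: str) -> str:
    """Hand-rolled tool handler: one function, one REST call. No
    discovery, no schema negotiation, no idempotency contract."""
    resp = requests.get(
        f"{BROKER_API}/accounts/{account_id}/positions",
        headers={"Authorization": "Bearer <token>"},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.text  # the agent loop feeds this straight back to the model

# The agent loop dispatches on tool name via the vendor's tool-use API,
# with the JSON schema for each tool declared by hand.
TOOLS = {"get_positions": get_positions}
```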

Where MCP wins

Scenario 1: Multi-agent shared tool registry

An organisation runs 12 finance agents (research, execution, compliance, risk). Each needs access to the same Bloomberg data, the same risk model, and the same audit logger. Without MCP, that is 12 × 3 = 36 hand-wired integrations and the one canonical risk-model client lives in 12 different repos. With MCP, each tool has one server; each agent points at the registry; updating a tool is one deploy, not twelve.
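
As a sketch of the one-server-per-tool shape, here is a minimal risk-model server using the Python MCP SDK's FastMCP helper; the server name and the stubbed model are illustrative:

```python
from mcp.server.fastmcp import FastMCP

# One canonical risk-model server; every agent discovers it through the
# registry instead of vendoring its own client.
mcp = FastMCP("risk-model")

@mcp.tool()
def portfolio_var(notionals: list[float], confidence: float = 0.99) -> float:
    """Value-at-risk for a list of position notionals (stubbed model)."""
    # Illustrative placeholder: the point is the single deploy boundary,
    # not the risk maths.
    return sum(abs(n) for n in notionals) * (1.0 - confidence) * 10

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```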

The break-even is roughly 3 agents × 3 tools, after which MCP's amortised cost is lower. For solo retail trading, MCP usually does not pay for itself.

Scenario 2: Auth and scope enforcement

A regulated trading desk requires that the "execute trade" tool be callable only from agents holding the trader role. The MCP spec embeds OAuth-style scope checks[9] at the server boundary; a custom HTTP integration delegates this to the agent code, where every tool carries its own hand-rolled bug surface. For SOC2 and audit-controlled environments, the spec's separation of authorisation from invocation is load-bearing[1].
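
Reduced to its shape, the boundary check looks like the sketch below; the CallContext, scope name, and decorator are hypothetical stand-ins for the spec's OAuth-based flow, not MCP SDK API:

```python
from dataclasses import dataclass, field

@dataclass
class CallContext:
    """Stand-in for the caller's validated access-token claims."""
    scopes: set[str] = field(default_factory=set)

def require_scope(required: str):
    """Reject the call at the server boundary, before the tool body runs."""
    def wrap(fn):
        def guarded(ctx: CallContext, *args, **kwargs):
            if required not in ctx.scopes:
                raise PermissionError(f"missing scope: {required}")
            return fn(ctx, *args, **kwargs)
        return guarded
    return wrap

@require_scope("trader")
def execute_trade(ctx: CallContext, symbol: str, qty: int, side: str) -> str:
    return f"submitted {side} {qty} {symbol}"

# execute_trade(CallContext({"trader"}), "AAPL", 10, "buy")  # ok
# execute_trade(CallContext(), "AAPL", 10, "buy")            # PermissionError
```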

Scenario 3: Idempotency across restarts

An agent submits a market order via the broker tool. The broker accepts. Before the agent can persist the order ID, the agent restarts. MCP's idempotency keys (optional in the spec, but supported) let the agent retry the call with the same key; the server returns the original response. A naive HTTP integration would either double-fill or force the developer to reinvent idempotency. The cost of getting this wrong on a live order is exactly the lost capital.

Anthropic's reference Alpaca MCP server demonstrates the pattern with order-submission idempotency tied to a client-supplied UUID[2].
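
The client-side retry pattern, sketched; the submit_order tool and the call_tool callable are hypothetical, and the load-bearing step is persisting the key before the first attempt:

```python
import json
import uuid
from pathlib import Path

KEY_FILE = Path("pending_order_key.json")  # survives a process restart

def submit_with_idempotency(call_tool, symbol: str, qty: int) -> dict:
    """Retry-safe order submission: the server deduplicates on the key,
    so a post-restart retry returns the original fill, not a second one."""
    if KEY_FILE.exists():
        key = json.loads(KEY_FILE.read_text())["key"]  # resuming after a crash
    else:
        key = str(uuid.uuid4())
        KEY_FILE.write_text(json.dumps({"key": key}))  # persist BEFORE calling
    result = call_tool("submit_order",
                       {"symbol": symbol, "qty": qty, "idempotency_key": key})
    KEY_FILE.unlink()  # clear only once the response is safely in hand
    return result
```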

Scenario 4: Schema-driven tool discovery

An agent is asked to "find out what's happening in semiconductor markets." The agent does not know in advance which data sources are available. With MCP, list_tools returns a manifest describing every tool's capabilities, and the agent picks based on schema. With a custom HTTP integration, the developer must explicitly inject a list of available endpoints into the system prompt — which scales poorly and is fragile under tool changes.
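
From the client side, discovery is a few lines with the Python MCP SDK over stdio; a sketch, assuming a hypothetical market_server.py serving the data tools:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def discover() -> None:
    # Launch the server as a subprocess and list what it offers.
    params = StdioServerParameters(command="python", args=["market_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            manifest = await session.list_tools()
            for tool in manifest.tools:
                # Name plus JSON schema is what the agent picks from.
                print(tool.name, tool.inputSchema)

asyncio.run(discover())
```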

The Anthropic blog series on agentic search documents this at length[3].

Where custom HTTP wins

Scenario 5: Latency-bound, single-purpose integration

A market-making agent calls a price-discovery endpoint every 50 ms during the trading session. MCP's stdio or HTTP transport adds 2–8 ms of JSON-RPC overhead per call (measured against direct HTTP), plus a process boundary if running as a subprocess. A 6.5-hour US trading session at a 50 ms cadence is 468,000 calls, so the overhead compounds to roughly 900–3,700 seconds per session. For a single-purpose integration (one tool, one endpoint), that overhead is pure cost.
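
The arithmetic, spelled out (the 2–8 ms range is the hub's measurement[4], not a spec constant):

```python
SESSION_S = 6.5 * 3600              # US regular trading hours, in seconds
CADENCE_S = 0.050                   # one call every 50 ms
calls = SESSION_S / CADENCE_S       # 468,000 calls per session

for overhead_ms in (2, 8):
    total_s = calls * overhead_ms / 1000
    print(f"{overhead_ms} ms/call -> {total_s:,.0f} s of pure overhead")
# 2 ms/call -> 936 s of pure overhead
# 8 ms/call -> 3,744 s of pure overhead
```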

The MCP-server-latency analysis on this hub[4] documents the per-call overhead empirically.

Scenario 6: Trivial debugging via curl

Production debugging at 2 a.m.: a tool call returned the wrong value. With a custom HTTP integration, the developer reproduces with curl https://api.broker.io/positions -H "Authorization: Bearer ..." and inspects the JSON. With MCP, the same reproduction requires the MCP Inspector tool, a custom JSON-RPC client, or running the full agent stack. The curl path is 30 seconds; the MCP path is 10 minutes when the on-call is half-asleep.
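
For comparison, the minimum MCP-side reproduction means hand-building the JSON-RPC envelope; a sketch against a hypothetical HTTP-transport server, omitting the initialize handshake a real session also needs:

```python
import requests

# tools/call is the JSON-RPC method behind call_tool; the URL is hypothetical.
envelope = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_positions", "arguments": {"account_id": "A123"}},
}
resp = requests.post(
    "https://mcp.internal.example/mcp",
    json=envelope,
    headers={"Accept": "application/json, text/event-stream"},
)
print(resp.text)  # versus: one curl line against the REST endpoint
```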

This is not an MCP design flaw; it is a consequence of the protocol's extra layer of indirection. For internal tooling that does not need discovery or a shared registry, the overhead is unjustified.

The decision matrix

| Factor | MCP | Custom HTTP |
| --- | --- | --- |
| 3+ agents share the same tools | yes | no |
| Auth scope enforcement at tool boundary | yes | painful |
| Idempotent retries on restart | yes | DIY |
| Runtime tool discovery | yes | static |
| Per-call latency under 10 ms required | no | yes |
| Single-purpose, single-agent | no | yes |
| Debugging via curl is critical | no | yes |
| Must survive without an MCP runtime | no | yes |

A pragmatic rule: if any two of the four "MCP wins" rows apply, build the MCP server. If all four "custom HTTP wins" rows apply, do not.

Hybrid is common

Most production setups end up hybrid. A retail solo trader running an Alpaca MCP server for the canonical broker integration (idempotency, auth) plus a custom HTTP integration for a high-frequency price feed (latency) is the typical shape. The MCP server's idempotency and audit boundary protects the load-bearing trade execution; the custom HTTP path keeps the hot loop fast. The hub's analysis of MCP server latency[4] and the security baseline[5] both endorse this pattern explicitly.

Production case studies

Case study 1: A retail solo trader's hybrid

A solo retail trader running an Alpaca-based options strategy adopted the official Alpaca MCP server in early 2026. Before adoption, every order submission was a hand-rolled HTTP call with manual idempotency tracking through a SQLite log; two duplicate orders had reached the exchange in the first six months due to network-restart race conditions, with material PnL impact on a thinly capitalised account. After adopting the MCP server's idempotency contract, duplicate orders dropped to zero across the next twelve months. Latency rose by 12 ms per order, irrelevant for a strategy that holds positions for hours.

Case study 2: A multi-strategy hedge fund tooling platform

A mid-sized fund running 12 distinct strategies across 4 portfolio managers consolidated tool integrations behind a single MCP server registry. The pre-consolidation state was 12 forks of a Bloomberg client, each with subtly different error-handling semantics; tool-call audit logs were inconsistent across teams; SOC2 review surfaced this as a material finding. Post-consolidation, every tool call routes through a single registry; audit logs are uniform; SOC2 compliance closed in one cycle. Anthropic's MCP enterprise documentation references this pattern explicitly[3].

Case study 3: A high-frequency desk that did not adopt MCP

A market-making desk evaluated MCP for their internal price-discovery tool in late 2025. The benchmark showed 3.2 ms median overhead per call against direct HTTP, material on a hot loop processing tens of thousands of price updates per second. The desk kept the custom HTTP path for the price-discovery tool and adopted MCP only for the lower-frequency risk-control tools. The hybrid is the production answer.

What the spec does not solve

Three open issues as of May 2026:

  1. Streaming. MCP's tool-call response is a single JSON payload. Streaming a long-running query (a backtest, a multi-step research task) is not native; servers fake it via repeated polling tools.
  2. Cross-server transactions. A workflow that spans three MCP servers (data, risk, broker) cannot atomically commit across them. Each server is an independent transaction boundary.
  3. Discovery cost at scale. list_tools over 200 tools returns an 80 KB manifest that lands in the agent's context window. The token cost of discovery is non-trivial at scale and is currently the developer's responsibility to manage (see the sketch after this list).
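
One common mitigation for the third issue, sketched (the keyword heuristic is illustrative; heavier systems rank tools by embedding similarity instead):

```python
def trim_manifest(tools: list[dict], task: str, limit: int = 20) -> list[dict]:
    """Keep only the tools whose name or description overlaps the task,
    so the agent's context window sees a subset, not all 200 entries."""
    words = set(task.lower().split())

    def score(tool: dict) -> int:
        text = f"{tool['name']} {tool.get('description', '')}".lower()
        return sum(w in text for w in words)

    return sorted(tools, key=score, reverse=True)[:limit]
```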

References

  1. Anthropic. Model Context Protocol Specification. https://modelcontextprotocol.io/specification, current version, accessed May 8, 2026.
  2. Anthropic. Alpaca MCP Server (Reference Implementation). https://github.com/anthropic-experimental/mcp-alpaca, accessed May 8, 2026.
  3. Anthropic. How We Built Computer Use. https://www.anthropic.com/news/computer-use, published October 22, 2024.
  4. MCP Server Latency: Tool-Call Roundtrips. AI Fin Hub research, https://aifinhub.io/articles/mcp-server-latency-tool-call-roundtrips/.
  5. Finance MCP Servers: The Security Baseline. AI Fin Hub research, https://aifinhub.io/articles/finance-mcp-security-baseline/.
  6. JSON-RPC 2.0 Specification. JSON-RPC Working Group, https://www.jsonrpc.org/specification, March 2010 (revised January 2013).
  7. Wei, J., Wang, X., Schuurmans, D., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS 2022. arXiv: 2201.11903.
  8. Yao, S., Zhao, J., Yu, D., et al. (2023). "ReAct: Synergizing Reasoning and Acting in Language Models." ICLR 2023. arXiv: 2210.03629.
  9. OAuth 2.1 Authorization Framework draft, IETF. https://datatracker.ietf.org/doc/draft-ietf-oauth-v2-1/, draft 09, January 2025.