TL;DR

MCP (Model Context Protocol) and function calling are both bridges between an LLM and external tools. They solve overlapping problems with different tradeoffs. MCP wins for standard data-source integrations where a reusable server already exists: Alpaca has an official MCP server, Polygon has one, FRED and FMP have community ones. A practitioner connecting Claude Desktop to any of those writes zero integration code. Function calling wins for bespoke, latency-sensitive, strategy-specific logic: the model gets a tool schema inline with the API request and the caller owns everything end to end. The right answer for a production finance agent is usually both, layered. MCP handles the data-fetch layer where vendors publish servers; function calling handles decision code, risk checks, and execution where every millisecond and every idempotency guarantee is owned in-house. The Finance MCP Directory catalogues which data sources have usable servers today.

What MCP is, concretely

MCP is an open protocol specified by Anthropic in November 2024[1] and adopted broadly through 2025 across Claude Desktop, Claude Code, Cursor, Zed, and several custom agents. The protocol defines three primitives (tools, callable functions; resources, read-only references like files or URIs; and prompts, reusable templates) exchanged over JSON-RPC 2.0. Transports are stdio (subprocess), HTTP with Server-Sent Events (deprecated in later protocol revisions), and streamable HTTP. An MCP client (the LLM host) connects to one or more MCP servers, fetches their advertised capabilities during a handshake, and then invokes tools by name when the model requests them. Servers are written once and consumed by every compliant client. Alpaca publishes one. Polygon publishes one. FRED, FMP, and Databento have community servers of varying quality.[2]
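The JSON-RPC framing is concrete enough to sketch. Below is a minimal, hedged illustration of the envelope a client sends for a tool invocation: the `tools/call` method name follows the MCP specification, but the tool name and arguments here are invented for illustration, not taken from any real server.

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 tools/call request as an MCP client would frame it."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical invocation of a market-data tool advertised by some server:
request = make_tool_call(1, "get_bars", {"symbol": "SPY", "timeframe": "1D"})
```

The handshake works the same way: the client sends an `initialize` request, then a `tools/list` to discover what the server advertises, and only then frames calls like the one above.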

What function calling is, concretely

Function calling is a model-API feature, not a protocol. The request body to Anthropic's Messages API, OpenAI's Chat Completions, or Gemini's generateContent includes a tools array. Each entry is a JSON schema describing a function name, description, and input shape. During inference the model may return a tool_use content block instead of (or alongside) text. The calling application executes the function locally, appends a tool_result block to the conversation, and sends the next request. Every integration is application-specific code: if two agents need the same market-data tool, both applications implement it independently. Anthropic's documentation covers the mechanics in detail.[3]
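The application side of that loop is small enough to sketch. The dict below mimics the tool_use content-block shape described above; the tool registry and the `get_quote` function are invented stand-ins, and a real build would receive the block from the model API rather than constructing it by hand.

```python
def get_quote(symbol: str) -> dict:
    """Hypothetical local tool; returns stub data instead of a live quote."""
    return {"symbol": symbol, "bid": 99.98, "ask": 100.02}

# The application owns the dispatch table; the model only sees the schemas.
TOOLS = {"get_quote": get_quote}

def handle_tool_use(block: dict) -> dict:
    """Execute a tool_use block locally and build the tool_result reply."""
    result = TOOLS[block["name"]](**block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": str(result),
    }

# A tool_use block shaped like what the model might return:
tool_use = {"type": "tool_use", "id": "toolu_01", "name": "get_quote",
            "input": {"symbol": "SPY"}}
tool_result = handle_tool_use(tool_use)
```

The tool_result block then goes back into the conversation on the next request, and the loop repeats until the model returns plain text.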

The axes that actually matter

Evaluating MCP against function calling as a binary is the wrong frame. They occupy different positions on seven axes, and the choice on each axis depends on the agent's deployment shape.

| Axis | MCP | Function calling |
| --- | --- | --- |
| Reuse across agents | High. One server, many clients. | Low. Per-application code. |
| Latency overhead | Higher. Handshake, extra hop, JSON-RPC framing. | Lower. Direct in-process call after the model returns tool_use. |
| Schema versioning | Server-owned. Clients discover on connect. | Application-owned. Baked into the request body. |
| Auth model | Server boundary. One auth config per server. | Per-call. Application injects credentials at execution. |
| Available today for finance | Alpaca (official), Polygon (official), FRED (community), FMP (community), Databento (community), IBKR CLI (community), Tradier (community). | Everything with a public API. |
| Best for | Standard data fetch, broad-spectrum research agents. | Bespoke logic, tight-latency decision loops, private strategy code. |
| Security surface | One well-defined boundary per server. | Scattered across every tool handler. |

Read the table row by row. Reuse and latency are directly at odds: the property that makes MCP powerful across agents (an explicit transport boundary) is the same property that adds overhead on every call. A research-grade agent running at human speed doesn't notice thirty extra milliseconds. A market-making loop does.

Schema versioning deserves a second look. MCP servers own the schema; when Alpaca's underlying API adds a required field, the official server updates, and every client picks it up on the next connect. Function-calling code has to be updated in every application independently. That asymmetry cuts the other way when the schema is internal: a private risk-check function doesn't need a protocol; it needs to be a function in the same process.

Three real architectures

Three deployment shapes cover most finance-agent builds in 2026. Each has a clear sweet spot.

Pure MCP

Claude Desktop or Claude Code as the host. Alpaca MCP for brokerage, Polygon MCP for historical bars, FRED MCP for macroeconomic series, a filesystem MCP for notes. The practitioner writes zero tool-integration code. Adding a new data source is an edit to the client config file. Latency is not a concern because every query is human-initiated.
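Adding a data source really is a config edit. A hypothetical entry in Claude Desktop's claude_desktop_config.json is sketched below; the `mcpServers` key is the client's real convention, but the package name and key value here are placeholders, not a verified distribution:

```json
{
  "mcpServers": {
    "fred": {
      "command": "npx",
      "args": ["-y", "example-fred-mcp-server"],
      "env": { "FRED_API_KEY": "..." }
    }
  }
}
```

On the next launch the client spawns the server as a subprocess over stdio and its tools appear alongside every other server's.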

This setup is ideal for research, ad-hoc analysis, and one-person quant workflows. It is also the shape recommended in the Finance MCP Security Baseline for operators who want a well-defined scope boundary at each vendor. The cost is that every execution-scope MCP server introduces a supply-chain dependency; see criteria 1, 2, and 5 of the baseline rubric before wiring up a community server with trading authority.

Pure function calling

A custom Python application built on the Anthropic SDK or OpenAI SDK. Alpaca hit directly via its REST or WebSocket client. Polygon, FRED, and any internal data store wrapped in application-local Python functions. Tool schemas assembled at startup from function signatures (via pydantic or a decorator pattern) and sent inline with each request.
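The decorator pattern mentioned above can be sketched in a few lines with the standard library alone. Real builds would lean on pydantic for validation; the type mapping here is deliberately tiny, and the `get_polygon_bars` handler is an invented stub.

```python
import inspect
from typing import get_type_hints

# Minimal Python-type -> JSON-schema-type mapping; a real build covers more.
_PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

TOOL_SCHEMAS = []  # assembled at startup, sent inline with each request

def tool(fn):
    """Register fn and derive its tool schema from the signature."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    params = inspect.signature(fn).parameters
    TOOL_SCHEMAS.append({
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "input_schema": {
            "type": "object",
            "properties": {n: {"type": _PY_TO_JSON[t]} for n, t in hints.items()},
            "required": list(params),
        },
    })
    return fn

@tool
def get_polygon_bars(symbol: str, days: int) -> list:
    """Fetch recent daily bars for a symbol."""
    return []  # stub; a real handler would call the vendor client here
```

Because the schema is derived from the signature, renaming a parameter in git is the whole versioning story: the next request body carries the new shape automatically.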

Everything is in-process after the model returns tool_use. Latency is whatever the underlying API costs plus one Python function call. The application owns schemas, auth, retries, and idempotency end to end. Versioning is whatever the team commits to git.

This shape suits production trading agents where the tool set is small, private, and performance-sensitive. A market-making loop that calls place_order, cancel_order, and get_book doesn't need a protocol. It needs those three functions to work in five milliseconds.

Hybrid

The production-grade default. MCP for the data layer where official vendor servers exist. Function calling for decision code, risk checks, sizing logic, and execution that the operator owns.

A concrete layout: Polygon MCP for historical bars and reference data; FRED MCP for macro; a custom submit_order function that wraps the broker's REST client directly, enforces idempotency keys, and runs pre-trade risk checks in the same process. The LLM sees all of these as tools and doesn't care which are served by MCP and which are function-calling handlers. The operator gets reuse where reuse is cheap (data) and control where control matters (execution).

The latency budget points inside: anything on the hot path of a decision is function calling; anything on the slow path of research is MCP. The Trading System Blueprinter encodes this split as a default pattern.

Security implications

MCP and function calling distribute security surface differently. Neither is inherently safer. The tradeoff is between a concentrated, well-defined boundary and a scattered, implicit one.

An MCP server is a single process with its own auth configuration, network reach, and input validation. The attack surface for an execution-scope server is one codebase to audit: credentials, tool handlers, and idempotency guarantees all sit in the same repository. The Finance MCP Security Baseline grades servers on exactly this rubric, because everything that matters lives at the server boundary.

Function-calling tools distribute the same concerns across every handler. The application's Alpaca client holds credentials; the application's Polygon client holds a different set; each tool function independently validates inputs. If the schemas are loose and the application has twenty tools, there are twenty places a prompt-injected tool call with invented arguments can do damage. Tight schemas and shared validation helpers reduce this, but the boundary is implicit.
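A shared validation helper of the kind mentioned above might look like the sketch below. The allowlist contents are invented; the point is that every one of the twenty handlers funnels model-supplied arguments through the same check before touching a live API.

```python
ALLOWED_SYMBOLS = {"SPY", "QQQ", "IWM"}  # placeholder universe

def validate_symbol(symbol: str) -> str:
    """Reject any symbol the agent has no business querying or trading."""
    s = symbol.strip().upper()
    if s not in ALLOWED_SYMBOLS:
        raise ValueError(f"symbol {s!r} not in allowlist")
    return s
```

One helper, imported by every tool handler, turns twenty implicit boundaries back into something closer to one.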

Neither model stops prompt injection. Both require the countermeasures catalogued in Prompt Injection Defenses for Finance Agents: quarantine untrusted content, validate tool arguments against allowlists, and keep a human in the loop for material actions. What the choice between MCP and function calling changes is where those defenses live. MCP consolidates them at the server; function calling distributes them across the application.

One practical corollary: an execution-scope MCP server needs strong auth and idempotency in one place; a function-calling execution handler needs per-function input validation on every trading call. The total amount of defensive code is similar. The structure is different.

When to switch

A practitioner building a first finance agent almost always starts with pure function calling. The model provider's SDK documentation walks through it in under an hour, and every API call stays in the same process. Adoption friction is minimal.

The switch to hybrid comes when a second or third agent in the same setup needs the same data source. The first copy-paste of a get_polygon_bars handler into a second application is the signal. A rough rule: more than two agents reusing the same data source is the point where an MCP server (official if one exists, carefully audited community otherwise) pays off.

The switch to pure MCP is mostly a deployment-topology decision. Claude Desktop users default there because the host doesn't give them any other option. Custom-agent builders running Claude Code or a scripted client rarely end up at pure MCP, because they nearly always have some private code that doesn't deserve a protocol.

One asymmetry worth naming: migrating from function calling to MCP is cheap because the tool schemas can be ported as-is. Migrating from MCP to function calling is harder because the server may own behavior (retry, caching, multi-step workflows) that wasn't visible through the tool surface. Start with function calling where the logic is yours, and adopt MCP only for vendor-provided servers or for tools that a clear second consumer already needs.

A second asymmetry: observability. Function-calling tool invocations are already inside the application's own logging and tracing setup; adding structured logs to a tool handler is a one-line change. MCP tool invocations cross a process boundary, so the host has to record the request and response, and the server has to record its side independently. The two logs have to be reconciled after the fact, usually by correlation ID. This is solvable, but it adds a line item to the agent's telemetry design that pure function calling does not require. Operators evaluating the switch to MCP should budget for it up front rather than discovering the gap when they need to debug a failed tool call three weeks into production.
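The correlation-ID pattern described above can be sketched briefly. The `invoke` callable is a stand-in for whatever actually crosses the transport; the key move is that the host mints one ID, passes it across the boundary, and logs both sides under it so its log can later be joined against the server's.

```python
import json
import logging
import uuid

log = logging.getLogger("agent.tools")

def call_with_correlation(invoke, tool_name: str, arguments: dict):
    """Wrap one MCP tool call with request/response logs sharing an ID."""
    corr_id = str(uuid.uuid4())
    log.info(json.dumps({"corr_id": corr_id, "event": "tool_request",
                         "tool": tool_name, "args": arguments}))
    result = invoke(tool_name, arguments, corr_id)  # ID crosses the boundary
    log.info(json.dumps({"corr_id": corr_id, "event": "tool_response",
                         "tool": tool_name}))
    return result
```

The server records its own side under the same ID, and the postmortem join becomes a grep instead of a timestamp-matching exercise.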

A worked example of the hybrid pattern

A retail options-selling agent running on Claude Code. The agent's job at open: pull the prior day's IV rank for a synthetic universe, fetch current macro context, size a proposed structure, submit the order.

The data pull (IV history, macro series) goes through MCP: Polygon MCP for options metadata, FRED MCP for the two-year yield. Neither is on the hot path; the agent runs once per session. The schemas are vendor-owned. Reuse is plausible if a second agent later wants the same macro series.

The sizing and submission are function calling. A size_short_strangle function lives in the repo, takes IV rank + account equity, returns a contract count bounded by conviction-scaled Kelly. A submit_order function wraps the brokerage REST client, enforces a client-supplied idempotency key, and rejects any order whose size exceeds a hard cap. Neither function is reusable across agents in any meaningful way (both encode one strategy's risk model), and both need to run in-process for latency and auditability.
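The sizing function can be sketched under loud assumptions: the conviction-scaled Kelly treatment is collapsed to a single fraction, and the margin figure, IV-rank floor, and cap are invented placeholders rather than anyone's real risk model.

```python
def size_short_strangle(iv_rank: float, account_equity: float,
                        margin_per_contract: float = 4_000.0,
                        kelly_fraction: float = 0.25,
                        max_contracts: int = 5) -> int:
    """Contracts to sell, bounded by a fractional-Kelly margin budget."""
    if iv_rank < 30:                       # no edge below this rank: stand down
        return 0
    conviction = min(iv_rank / 100.0, 1.0)  # crude conviction scaling
    budget = account_equity * kelly_fraction * conviction
    return min(int(budget // margin_per_contract), max_contracts)
```

Everything that makes this function dangerous to get wrong (the floor, the fraction, the cap) lives in one in-process file under version control, which is the whole argument for keeping it out of a protocol.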

The LLM sees four tools: polygon__get_option_chain, fred__get_series, size_short_strangle, submit_order. It doesn't know which are MCP and which are function calling. The split is invisible to the model and load-bearing for the operator.

The invisibility cuts both ways. An agent that mistakenly calls submit_order with unsafe arguments fails at the function-calling boundary where input validation runs in-process and can be audited line by line. An agent that mistakenly calls polygon__get_option_chain with a malformed symbol fails at the MCP server, which returns a structured error the host surfaces back to the model. Both failures are recoverable. The difference is where the recovery logic lives, and therefore where the postmortem reads begin when something goes wrong. Architectural choices that seem small at build time become the shape of every incident review later.

A short note on the protocol's trajectory

MCP is young. Server quality varies widely, as the Finance MCP Directory grade distribution shows. A nontrivial fraction of community servers are last-touched more than ninety days ago or ship schemas with loose typing. That is not an argument against MCP; it is an argument for grading what gets wired in. The protocol itself is sound, broadly adopted, and converging on sensible defaults. The ecosystem will harden. A practitioner choosing between MCP and function calling today is not choosing between two finished products; the choice is between a protocol with a maturing ecosystem and a model-API feature that works everywhere but owns nothing.

References

Footnotes

  1. Anthropic. "Introducing the Model Context Protocol." November 25, 2024. Specification at https://spec.modelcontextprotocol.io (accessed 2026-04-22).

  2. Finance MCP server survey conducted for the Finance MCP Directory; grade distribution as of 2026-04. Alpaca and Polygon are the only vendor-official servers in the finance category at time of writing.

  3. Anthropic tool-use documentation, linked above. The tool_use and tool_result content-block shapes are the concrete mechanism; the broader pattern is identical across OpenAI and Gemini with minor naming differences.