Reference · 36 terms

Glossary

Canonical definitions for terms used across AI Fin Hub tools, articles, and benchmarks. Each entry is also emitted as a Schema.org DefinedTerm so AI retrievers and Google AI Overviews can cite them directly.

MCP (Model Context Protocol)

An open protocol, introduced by Anthropic in November 2024, for connecting LLMs to external tools and data sources through a standardized interface. Servers expose tools (actions described by JSON Schema); clients (like Claude) can discover, inspect, and call them. Transports: stdio and Streamable HTTP, which supersedes the earlier HTTP + SSE transport.
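As a rough illustration of what "JSON-schema-described actions" means in practice, the sketch below builds a tool descriptor and a JSON-RPC `tools/call` request as plain Python dicts. The tool name `get_quote`, its fields, and the helper `missing_required` are hypothetical and exist only for this example; a real client would rely on an MCP SDK rather than hand-built messages.

```python
import json

# Hypothetical tool descriptor, shaped like one entry of a tools/list result.
get_quote_tool = {
    "name": "get_quote",
    "description": "Return the latest quote for a ticker symbol.",
    "inputSchema": {
        "type": "object",
        "properties": {"symbol": {"type": "string"}},
        "required": ["symbol"],
    },
}

# JSON-RPC 2.0 request a client could send to invoke that tool.
call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_quote", "arguments": {"symbol": "AAPL"}},
}


def missing_required(tool, arguments):
    """List required argument names that the caller did not supply."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]


if __name__ == "__main__":
    print(json.dumps(call_request, indent=2))
    print(missing_required(get_quote_tool, call_request["params"]["arguments"]))
```

The check here only covers required keys; full validation would run the arguments through a JSON Schema validator.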

PBO (Probability of Backtest Overfitting)

A statistic, defined by Bailey, Borwein, Lopez de Prado and Zhu (2016), that measures how often the in-sample-best strategy from a basket ranks below median out-of-sample under Combinatorially-Symmetric Cross-Validation. PBO above 0.5 is a strong overfitting signal.

CSCV (Combinatorially-Symmetric Cross-Validation)

A cross-validation technique that splits observations into 2·S equal chunks, then for every way to pick S chunks as in-sample (and the rest as out-of-sample), scores how the IS-best strategy performs OOS. Produces PBO.
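The procedure can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: it assumes the per-period Sharpe ratio as the performance metric, takes `n_chunks` as the total (even) number of chunks, drops any trailing observations that do not fill a chunk, and counts a split as overfit when the IS-best strategy's relative OOS rank falls below 0.5 (equivalent to a negative logit). The function names are illustrative.

```python
from itertools import combinations
import statistics


def sharpe(xs):
    """Per-period Sharpe of a return list (0.0 when the series is flat)."""
    sd = statistics.pstdev(xs)
    return statistics.mean(xs) / sd if sd > 0 else 0.0


def pbo_cscv(returns, n_chunks):
    """Estimate PBO. `returns` is a list of equal-length return series,
    one per strategy; `n_chunks` is the even total number of chunks."""
    if n_chunks < 2 or n_chunks % 2:
        raise ValueError("n_chunks must be an even number >= 2")
    size = len(returns[0]) // n_chunks  # trailing observations are dropped
    chunks = [range(i * size, (i + 1) * size) for i in range(n_chunks)]
    n_strats = len(returns)
    overfit = total = 0
    # Every way of choosing half the chunks as in-sample.
    for is_ids in combinations(range(n_chunks), n_chunks // 2):
        is_rows = [t for c in is_ids for t in chunks[c]]
        oos_rows = [t for c in range(n_chunks) if c not in is_ids
                    for t in chunks[c]]
        is_perf = [sharpe([r[t] for t in is_rows]) for r in returns]
        oos_perf = [sharpe([r[t] for t in oos_rows]) for r in returns]
        best = max(range(n_strats), key=is_perf.__getitem__)
        # OOS rank of the IS-best strategy: 1 = worst, n_strats = best.
        rank = sum(p <= oos_perf[best] for p in oos_perf)
        if rank / (n_strats + 1) < 0.5:
            overfit += 1
        total += 1
    return overfit / total
```

The number of splits grows combinatorially with `n_chunks`, so small values (for example 8 to 16) are the practical range for a pure-Python loop like this one.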

DSR (Deflated Sharpe Ratio)

Bailey & Lopez de Prado (2014): a closed-form probability that an observed Sharpe ratio is statistically real after correcting for selection bias (number of strategies tested) and non-normality of the return distribution. DSR above 0.95 = very likely real.
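A minimal sketch of the calculation, using only the standard library. The parameter names are illustrative assumptions: `sr` is the observed per-period (non-annualized) Sharpe ratio, `n_obs` the number of return observations, `skew` and `kurt` the sample skewness and (non-excess) kurtosis, `n_trials` the number of strategies tested, and `sr_variance` the variance of the Sharpe estimates across those trials.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials, sr_variance):
    """Expected maximum Sharpe among n_trials zero-skill strategies."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    z = NormalDist().inv_cdf
    return math.sqrt(sr_variance) * (
        (1 - EULER_GAMMA) * z(1 - 1 / n_trials)
        + EULER_GAMMA * z(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe_ratio(sr, n_obs, skew, kurt, n_trials, sr_variance):
    """Probability that the observed Sharpe exceeds the selection-bias
    benchmark, adjusted for skewness and kurtosis of returns."""
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return NormalDist().cdf((sr - sr0) * math.sqrt(n_obs - 1) / denom)
```

Holding everything else fixed, raising `n_trials` raises the benchmark `sr0` and therefore lowers the DSR, which is the selection-bias correction at work.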

Kelly criterion

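The bet-sizing rule that maximizes long-run log-growth of bankroll given a known win probability and odds. For asymmetric payoffs: f* = (b·p − q) / b, where p is the win probability, q = 1 − p, and b is the net odds (profit per unit staked on a win). Full Kelly is brutal under estimation error, because an over-estimated edge leads to overbetting; fractional (quarter / eighth) Kelly is the standard practical default.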

Fractional Kelly

Bet size = α · Kelly-optimal, where α ∈ (0, 1). Typical values: 0.5 (half), 0.25 (quarter), 0.125 (eighth). Trades modest growth reduction for substantially lower drawdown and resilience to over-estimated edges.
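The Kelly formula and its fractional variant translate directly into code. The sketch below assumes a single binary bet with win probability `p` and net odds `b` (profit per unit staked on a win); the function names are illustrative, and clamping a negative edge to a zero bet is a choice made for this example.

```python
def kelly_fraction(p, b):
    """Kelly-optimal bankroll fraction: f* = (b*p - q) / b with q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if b <= 0.0:
        raise ValueError("b must be positive net odds")
    q = 1.0 - p
    return (b * p - q) / b


def fractional_kelly(p, b, alpha=0.25):
    """Bet alpha times the Kelly fraction; never bet on a negative edge."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    return alpha * max(kelly_fraction(p, b), 0.0)
```

With p = 0.6 at even odds (b = 1), full Kelly stakes 0.2 of bankroll and quarter Kelly stakes 0.05.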

Sharpe ratio

Sharpe (1966): (mean excess return) / (stdev of excess return). Annualized by √252 on daily data. Does not penalize non-normality or drawdown explicitly.
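A minimal sketch of the annualized calculation, assuming a constant per-period risk-free rate and the sample standard deviation; the function and parameter names are illustrative.

```python
import math
import statistics


def annualized_sharpe(returns, rf_per_period=0.0, periods_per_year=252):
    """Mean excess return over its sample stdev, scaled by sqrt(periods)."""
    excess = [r - rf_per_period for r in returns]
    sd = statistics.stdev(excess)
    if sd == 0:
        raise ValueError("excess returns have zero variance")
    return statistics.mean(excess) / sd * math.sqrt(periods_per_year)
```

The √252 scaling assumes returns are independent across days; autocorrelated returns make the annualized figure misleading.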

See also: /tools/risk-adjusted-returns/

Sortino ratio

Variant of Sharpe that uses only downside deviation (computed from negative excess returns) in the denominator. Rewards positively skewed returns: for the same mean return, upside volatility does not lower the score.
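A minimal sketch, under one common convention that is an assumption here: downside deviation is the root mean square of below-target excess returns taken over all observations (some implementations divide by the number of downside observations instead). Names are illustrative.

```python
import math


def sortino_ratio(returns, target=0.0, periods_per_year=252):
    """Mean excess return over downside deviation, annualized."""
    excess = [r - target for r in returns]
    # Only below-target observations contribute; the rest count as zero.
    downside_sq = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    if downside_dev == 0:
        raise ValueError("no below-target returns; ratio is undefined")
    mean_excess = sum(excess) / len(excess)
    return mean_excess / downside_dev * math.sqrt(periods_per_year)
```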

See also: /tools/risk-adjusted-returns/

Calmar ratio

Annualized return divided by max drawdown. Focused on capital preservation. Calmar above 1.5 is strong.
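A minimal sketch computed from an equity curve, assuming compound annual growth as the annualized return and peak-to-trough percentage decline as the drawdown; function names are illustrative.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def calmar_ratio(equity, periods_per_year=252):
    """Compound annual growth rate divided by max drawdown."""
    years = (len(equity) - 1) / periods_per_year
    cagr = (equity[-1] / equity[0]) ** (1 / years) - 1
    mdd = max_drawdown(equity)
    if mdd == 0:
        raise ValueError("no drawdown in the sample; ratio is undefined")
    return cagr / mdd
```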

See also: /tools/risk-adjusted-returns/

Omega ratio

Sum of gains above a threshold (usually 0 or risk-free) divided by sum of absolute losses below it. Captures all moments of the return distribution.
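A minimal sketch of the discrete-sample version described above; the function name and the default threshold of 0 are illustrative.

```python
def omega_ratio(returns, threshold=0.0):
    """Sum of gains above the threshold over sum of shortfalls below it."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    if losses == 0:
        raise ValueError("no returns below the threshold; ratio is undefined")
    return gains / losses
```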

See also: /tools/risk-adjusted-returns/

Information ratio

Active return (strategy minus benchmark) divided by tracking error, annualized. Measures consistent outperformance.

See also: /tools/risk-adjusted-returns/

Walk-forward validation

A time-ordered cross-validation technique: fit / select on an in-sample (IS) window, evaluate on the immediately-following out-of-sample (OOS) window, then slide forward and repeat. Concatenated OOS returns approximate live trading performance. Two modes: rolling (fixed IS) and expanding (IS grows).
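The two windowing modes can be sketched as an index generator. This is an illustrative helper, assuming the window steps forward by one OOS length each time so that OOS segments never overlap; the function name and parameters are not from any particular library.

```python
def walk_forward_splits(n_obs, is_len, oos_len, expanding=False):
    """Return a list of (in_sample, out_of_sample) index ranges."""
    splits = []
    start = 0
    while start + is_len + oos_len <= n_obs:
        is_start = 0 if expanding else start  # expanding keeps all history
        is_end = start + is_len
        splits.append((range(is_start, is_end),
                       range(is_end, is_end + oos_len)))
        start += oos_len  # slide forward by one OOS window
    return splits


if __name__ == "__main__":
    for is_idx, oos_idx in walk_forward_splits(10, 4, 2):
        print(list(is_idx), list(oos_idx))
```

Concatenating the returns on each OOS range, in order, gives the stitched out-of-sample track record.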

Walk-forward efficiency

Ratio of mean OOS Sharpe to mean IS Sharpe across walk-forward windows. Values below 0.4 flag overfitting; above 0.7 indicate a persistent edge.
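A minimal sketch of the ratio and the two thresholds quoted above. The label for the middle band ("inconclusive") is an assumption added for this example, as are the function names.

```python
def walk_forward_efficiency(is_sharpes, oos_sharpes):
    """Mean OOS Sharpe divided by mean IS Sharpe across windows."""
    mean_is = sum(is_sharpes) / len(is_sharpes)
    mean_oos = sum(oos_sharpes) / len(oos_sharpes)
    if mean_is <= 0:
        raise ValueError("mean IS Sharpe must be positive for a usable ratio")
    return mean_oos / mean_is


def classify_wfe(wfe):
    """Apply the 0.4 / 0.7 thresholds from the definition."""
    if wfe < 0.4:
        return "overfit"
    if wfe > 0.7:
        return "persistent edge"
    return "inconclusive"  # middle band label is an assumption
```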

Isotonic regression (PAV)

A non-parametric monotonic transform that maps raw scores to calibrated probabilities. The Pool-Adjacent-Violators algorithm merges adjacent score-outcome bins whenever their observed frequencies violate monotonicity. Preferred over Platt scaling when miscalibration is non-sigmoid.
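The merging step is easiest to see in code. The sketch below is a plain-Python illustration of Pool-Adjacent-Violators on binary outcomes, with one observation per initial bin; the function name is illustrative, and production use would normally go through an established library implementation.

```python
def pav_calibrate(scores, outcomes):
    """Return (score, calibrated probability) pairs sorted by score."""
    pairs = sorted(zip(scores, outcomes))
    blocks = []  # each block is [sum of outcomes, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)  # every member gets the pooled frequency
    return [(x, p) for (x, _), p in zip(pairs, fitted)]
```

For scores 0.1, 0.2, 0.3, 0.4 with outcomes 0, 1, 0, 1, the middle two observations violate monotonicity and are pooled to 0.5, giving fitted values 0, 0.5, 0.5, 1.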

Brier score

Mean squared error between predicted probability and binary outcome. Lower is better. A coin-flip forecaster (always predicting 0.5) scores 0.25; a forecaster with perfect information, who predicts 0 or 1 and is always right, scores 0.
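A minimal sketch of the calculation; the function name is illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecasts and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)
```

Predicting 0.5 for every event gives 0.25 regardless of the outcomes, which is why that value serves as the uninformed baseline.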

Reliability diagram

Plot of predicted probability (x-axis) against actual frequency (y-axis) across binned predictions. A perfectly calibrated forecaster's points fall on the diagonal y = x.
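The points of the diagram can be computed without a plotting library. The sketch below assumes equal-width bins on [0, 1] and skips empty bins; the function name and return shape are illustrative.

```python
def reliability_bins(probs, outcomes, n_bins=10):
    """Return (mean predicted, observed frequency, count) per non-empty bin."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        sums[i][0] += p
        sums[i][1] += y
        sums[i][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in sums if n > 0]
```

Plotting the first element of each tuple on the x-axis against the second on the y-axis reproduces the diagram; the count helps judge how much weight each point deserves.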

Black-Scholes

Closed-form option-pricing model (Black & Scholes 1973; Merton 1973) for European options under log-normal underlying with constant vol and rate. Generalization includes continuous dividend yield. Used here for Greeks visualization.
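A minimal sketch of the pricing formula with a continuous dividend yield, using only the standard library. Parameter names are illustrative: `t` is time to expiry in years, while `rate`, `vol`, and `div_yield` are annualized decimals.

```python
import math
from statistics import NormalDist

_N = NormalDist().cdf  # standard normal CDF


def bs_price(spot, strike, t, rate, vol, div_yield=0.0, call=True):
    """European option price under Black-Scholes-Merton."""
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    disc_spot = spot * math.exp(-div_yield * t)
    disc_strike = strike * math.exp(-rate * t)
    if call:
        return disc_spot * _N(d1) - disc_strike * _N(d2)
    return disc_strike * _N(-d2) - disc_spot * _N(-d1)
```

A quick sanity check on any implementation is put-call parity: call minus put should equal the discounted spot minus the discounted strike.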

Delta

First derivative of option price with respect to spot. Call delta ∈ (0, 1); put delta ∈ (−1, 0). ATM delta ≈ ±0.5.

Gamma

Second derivative of option price with respect to spot. Peaks at-the-money near expiration; near zero for deep ITM/OTM options. Both calls and puts have positive gamma.

Theta

First derivative of option price with respect to the passage of time. Typically negative for long options (value decays). Accelerates near expiration, especially at-the-money.

Vega

First derivative of option price with respect to implied volatility. Peaks at-the-money; falls off in both ITM and OTM wings. Reported per 1% vol point change.
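The four Greeks defined above have closed forms under Black-Scholes. The sketch below covers a European call with no dividends; the function name, the dict layout, and the units (theta per year, vega per 1 vol point) are choices made for this illustration.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def call_greeks(spot, strike, t, rate, vol):
    """Delta, gamma, theta (per year), and vega (per 1 vol point)."""
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol ** 2) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    pdf_d1 = _STD.pdf(d1)
    return {
        "delta": _STD.cdf(d1),
        "gamma": pdf_d1 / (spot * vol * sqrt_t),
        "theta": (-spot * pdf_d1 * vol / (2 * sqrt_t)
                  - rate * strike * math.exp(-rate * t) * _STD.cdf(d2)),
        "vega": spot * pdf_d1 * sqrt_t / 100,  # per 0.01 change in vol
    }
```

Without dividends, the matching put has the same gamma and vega, and its delta is the call delta minus 1.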

Idempotency (order submission)

Property of an execution API: submitting the same logical order twice with the same client-supplied key produces the same result (one fill). Critical for retry-on-error loops in agent trading. Alpaca V2 supports it; some community MCP servers do not.
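The property can be demonstrated with an in-memory stand-in. `InMemoryBroker` below is hypothetical and does not model any real broker's client; it only shows how a client-supplied key makes a retried submission return the original order instead of creating a second one.

```python
import uuid


class InMemoryBroker:
    """Hypothetical execution API that deduplicates on client_order_id."""

    def __init__(self):
        self._orders = {}

    def submit(self, client_order_id, symbol, qty):
        # A repeated key returns the stored order; nothing new is created.
        if client_order_id in self._orders:
            return self._orders[client_order_id]
        order = {
            "id": str(uuid.uuid4()),
            "client_order_id": client_order_id,
            "symbol": symbol,
            "qty": qty,
        }
        self._orders[client_order_id] = order
        return order

    def order_count(self):
        return len(self._orders)
```

In a retry loop, the key should be generated once per logical order, before the first attempt, and reused unchanged on every retry.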

Heartbeat / watchdog / circuit breaker

Three complementary patterns for detecting and halting silent failures in a trading pipeline. Heartbeat = timestamped file written every cycle. Watchdog = independent process reading heartbeat; trips circuit if stale. Circuit breaker = state file every layer honors.
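A minimal file-based sketch of the three pieces. The JSON layout, file paths, and function names are assumptions made for this example; the `now` parameter exists so the staleness logic can be exercised without waiting.

```python
import json
import time
from pathlib import Path


def write_heartbeat(path, now=None):
    """Heartbeat: the pipeline calls this once per cycle."""
    ts = time.time() if now is None else now
    Path(path).write_text(json.dumps({"ts": ts}))


def check_heartbeat(hb_path, breaker_path, max_age_s, now=None):
    """Watchdog: trip the breaker if the heartbeat is missing or stale.

    Returns True when the breaker was tripped by this check."""
    now = time.time() if now is None else now
    try:
        ts = json.loads(Path(hb_path).read_text())["ts"]
    except (FileNotFoundError, ValueError, KeyError):
        ts = None  # unreadable heartbeat is treated as stale
    stale = ts is None or now - ts > max_age_s
    if stale:
        Path(breaker_path).write_text(json.dumps({"tripped": True, "at": now}))
    return stale


def trading_allowed(breaker_path):
    """Circuit breaker: every layer checks this before acting."""
    return not Path(breaker_path).exists()
```

In this sketch the breaker stays tripped until the file is removed, so resuming trading is a deliberate manual step.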

TCO (Total Cost of Ownership)

Sum of all recurring + one-time costs for a given vendor tier across a user's scenario (universe, frequency, history). Applied here to market data API selection.

PFOF (Payment For Order Flow)

Arrangement where a broker routes retail orders to a market maker in exchange for rebates. Often enables zero-commission trading but can degrade execution quality. Common at Robinhood, Alpaca (equities), Tradier.

SIP tape

Securities Information Processor — the consolidated public market data feed covering all US equity exchanges. Full SIP = full-volume visibility. Alternative: IEX-only feeds (partial volume).

Level-2 (L2) market data

Market data that shows the limit order book — multiple price levels on bid and ask with the size at each. Distinct from L1 (top-of-book only) and trades.

Prompt caching (Anthropic)

Anthropic feature where cached input tokens are billed at ~10% of the normal input rate when read back. The default cache lifetime is 5 minutes after write and is refreshed on each cache hit. Dramatically reduces cost for stable-system-prompt research loops.

Token-cost per validated trade

The most-actionable cost metric for an LLM trading loop: (cost per idea) / (validation rate). Converts raw API cost into economic cost per trade worth making.
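The formula is a single division; the sketch below adds input checks. The function name is illustrative, and the figures in the usage note are made-up inputs, not measurements.

```python
def cost_per_validated_trade(cost_per_idea, validation_rate):
    """Cost of one idea divided by the share of ideas that pass validation."""
    if cost_per_idea < 0:
        raise ValueError("cost_per_idea must be non-negative")
    if not 0.0 < validation_rate <= 1.0:
        raise ValueError("validation_rate must be in (0, 1]")
    return cost_per_idea / validation_rate
```

For example, if each idea costs 0.04 in API spend and 5% of ideas survive validation, each validated trade costs about 0.80.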

Hallucination (LLM)

An output claim by an LLM that is not grounded in the provided context or any real source. In financial extraction: invented numbers, wrong period dates, fabricated line items. Numeric hallucinations are more tractable to detect than prose hallucinations.
