TL;DR
Backtests without a market-impact model are optimistic — often by the entire edge of a strategy. Two decades of execution research (Almgren-Chriss 2000, Kissell-Malamut 2006, Gatheral 2010) converge on a tractable framing: permanent impact scales roughly as the square root of trade size relative to daily volume, and temporary impact adds a linear-in-rate term on top. For a 50,000-share buy in a 500,000-share ADV name at 50 bps spread, the all-in cost is roughly 30 bps. A one-line backtest correction captures the dominant term: fill_price = mid * (1 + side * slippage_bps / 10_000), with slippage_bps computed per-trade from the impact model. When the square-root law breaks (thinly-traded names, news-driven moves, pre-market), you need a different model entirely.
Why impact-free backtests lie
A backtest that fills every order at the midprice is doing the trading equivalent of an engineering simulation with no friction. It produces a strategy whose live performance is nowhere near the backtest, and the gap is not noise — it is a systematic, directional loss called implementation shortfall.
Three sources of cost make up the gap between a backtest's reported return and the live fill:
- Half-spread. If the quote is $100.00 / $100.10, a buy market order fills at $100.10; midprice was $100.05. Half-spread = 5 bps.
- Temporary impact. The act of consuming the inside of the book moves the price against you for the duration of the trade. Linear in trading rate.
- Permanent impact. Large trades leak information (intent, size, direction) into the book. Part of the price move persists after the trade ends. Scales as the square root of trade size.
A backtest that uses midprice captures none of the three. A backtest that uses the quoted bid/ask captures only the first. A backtest that applies a flat "X bps slippage" assumption captures something but systematically underprices large trades and overprices small ones. The Almgren-Chriss framework below is the minimum-viable way to do this correctly.
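To make the first term concrete, the half-spread from the quote example works out like this (a minimal sketch; the quote values are the ones from the bullet above):

```python
def half_spread_bps(bid: float, ask: float) -> float:
    """Half the quoted spread, expressed in bps of the midprice."""
    mid = (bid + ask) / 2.0
    return (ask - mid) / mid * 10_000

# The $100.00 / $100.10 quote from the bullet above: a market buy
# at the ask pays about 5 bps over mid.
print(round(half_spread_bps(100.00, 100.10), 1))  # → 5.0
```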
The Almgren-Chriss framework
Almgren and Chriss (2000) decomposed execution cost into permanent and temporary components for a meta-order (a single trading decision executed across many child orders):
- Permanent impact g(v) — the lasting price change caused by the meta-order. Linear in the parent order's share count relative to daily volume.
- Temporary impact h(v) — the transient price change during execution, caused by consuming liquidity faster than the market can replenish it. Linear in the rate v = shares_per_second.
The expected cost per share of executing X shares over time T is:
cost = 0.5 * γ * X + η * (X / T) + ε * sign(X)
= (permanent) + (temporary) + (half-spread)
where γ is the permanent-impact coefficient, η is the temporary-impact coefficient, and ε is the fixed trading cost (half-spread, fees). The linear-in-volume permanent term is the simplest defensible baseline, but empirical work since 2005 has converged on a square-root form.
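The linear decomposition translates directly into code. A sketch for a buy (X > 0); the gamma, eta, and eps values in the usage line are toy placeholders, not calibrated coefficients:

```python
def ac_cost_per_share(X, T, gamma, eta, eps):
    """Expected per-share cost of a buy of X shares executed over T seconds
    under the linear Almgren-Chriss model.

    gamma: permanent-impact coefficient (price change per share traded)
    eta:   temporary-impact coefficient (price change per unit trading rate)
    eps:   fixed cost per share (half-spread plus fees)
    """
    permanent = 0.5 * gamma * X   # average permanent move paid over the order
    temporary = eta * (X / T)     # linear in the trading rate X/T
    return permanent + temporary + eps

# 1,000 shares over 100 seconds with toy (uncalibrated) coefficients.
print(ac_cost_per_share(1_000, 100, gamma=1e-6, eta=1e-3, eps=0.05))
```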
The empirical square-root law
Kissell and Malamut (2006), Almgren, Thum, Hauptmann, and Li (2005), Gatheral (2010), and subsequent work using TAQ data all converged on the same empirical finding: permanent price impact scales roughly as the square root of participation, not linearly.
permanent_impact_bps ≈ η_sqrt * σ_daily * sqrt(Q / ADV)
where:
- σ_daily is the daily volatility of the asset in percent (say 1.5% for a typical US equity).
- Q is the order size in shares.
- ADV is the average daily volume in shares.
- η_sqrt is an asset-class coefficient. Empirical estimates for US equities cluster around η_sqrt ≈ 1.0 when the volatility term is in daily percent. Lower for the most liquid names (large-cap, SPY-correlated), higher for small-cap and illiquid names.
Gatheral (2010) showed the square-root law has a deep theoretical justification: it is the unique no-dynamic-arbitrage form for a transient-impact model under reasonable assumptions. The empirical coefficient is not universal — it depends on asset, regime, and time horizon — but the functional form is remarkably robust across 20 years of data and multiple asset classes.
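In code, the square-root law is a one-liner; a sketch using the symbols above (the raw coefficient still needs per-name, per-horizon calibration):

```python
import math

def sqrt_law_bps(Q, ADV, sigma_daily_pct, eta_sqrt=1.0):
    """Permanent impact in bps: eta_sqrt * sigma * sqrt(Q / ADV),
    with sigma converted from daily percent to bps."""
    return eta_sqrt * sigma_daily_pct * 100 * math.sqrt(Q / ADV)

# 5% of ADV in a 1.5%-daily-vol name, raw coefficient of 1.0:
print(round(sqrt_law_bps(25_000, 500_000, 1.5), 1))  # → 33.5
```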
A calibrated impact model
For US equity large-cap, a defensible starting-point parameterization:
```python
def impact_bps(order_shares, adv_shares, daily_vol_pct, spread_bps,
               eta_sqrt=1.0, eta_linear=0.1, participation_rate=0.10):
    """
    Return expected total cost in bps for an order of size order_shares
    in a name with given ADV, daily vol (percent), and spread (bps).
    Assumes a VWAP-style execution at the specified participation rate.
    """
    vol_bps = daily_vol_pct * 100  # daily percent -> bps

    # Permanent (square-root) impact
    permanent = eta_sqrt * vol_bps * (order_shares / adv_shares) ** 0.5

    # Temporary impact -- linear in trading rate. The order's rate divided
    # by the market's average rate (ADV per session-second) cancels down
    # to the participation rate itself.
    temporary = eta_linear * vol_bps * participation_rate

    # Half-spread
    half_spread = spread_bps / 2.0

    return permanent + temporary + half_spread
```
Plugging in the example from the TL;DR — 50,000 shares of a 500,000 ADV name, daily vol 1.5%, 50 bps spread, 10% participation:
- Participation ratio: 50K / 500K = 10%
- Permanent impact: naively plugging the full order into the square-root term gives 1.0 * 1.5% * 100 * sqrt(0.10) ≈ 47.4 bps, but that treats the entire meta-order as if it hit the book at once. For a single-day execution at 10% participation, the empirical calibrations reported in Almgren et al. (2005) put the permanent cost at roughly 5–7 bps for this trade.
- Temporary impact: ~1.5 bps at the 10% participation rate (eta_linear = 0.1 on 150 bps of daily vol).
- Half-spread: 25 bps.
- Total: ~30 bps, or $0.15 per $50 of notional.
The 30 bps figure is an estimate, not a guarantee. Real fills cluster around the model's central estimate with a standard error of roughly 30–50% of the estimate itself; that is the variance an honest backtest should carry.
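One way for a backtest to carry that variance is to sample realized slippage around the central estimate instead of treating it as deterministic. A sketch assuming a lognormal with 40% relative standard error; the distributional choice is an illustration-level assumption, not a result from the execution literature:

```python
import math
import random

def sampled_slippage_bps(central_bps, rel_stderr=0.40, rng=random):
    """Draw a realized slippage around the model's central estimate.
    A lognormal keeps draws positive; mu is set so the mean equals
    central_bps (mean-preserving parameterization)."""
    sigma = math.sqrt(math.log(1 + rel_stderr ** 2))
    mu = math.log(central_bps) - 0.5 * sigma ** 2
    return math.exp(rng.gauss(mu, sigma))

rng = random.Random(42)
draws = [sampled_slippage_bps(30.0, rng=rng) for _ in range(10_000)]
print(round(sum(draws) / len(draws), 1))  # sample mean lands near 30 bps
```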
When the square-root law breaks
The square-root law is a steady-state equilibrium model. It breaks in four well-known regimes:
- Thinly-traded names. When ADV is below ~100,000 shares, the market maker's inventory dynamics dominate and impact is highly non-linear — often closer to linear-in-size or step-function at round-lot boundaries. For these names, use a per-name empirical impact estimate rather than a model.
- News-driven moves. Within minutes of a scheduled earnings release, an unexpected macro announcement, or an idiosyncratic catalyst, the steady-state impact model is irrelevant. Price moves in response to information, not in response to trading; backtests should either avoid these windows or use a regime-specific model.
- Pre-market and after-hours. Participation-rate-based models assume continuous liquidity. Pre-market and after-hours sessions have 1–10% of regular-session liquidity with wider effective spreads; applying the same model gives wildly optimistic estimates.
- Closing auction / opening cross. The auction mechanism has fundamentally different cost dynamics from continuous trading. A meta-order spanning the close should be modeled as two components — a continuous portion and an auction portion — each with its own cost.
A strategy that trades only mega-cap names during regular-session hours at <5% participation can use the simple model above and be roughly right. A strategy that trades small-cap names around news events cannot and will look dramatically profitable in backtest and fail in production.
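A backtest can enforce those regime exclusions with a simple guard before applying the impact model. A sketch; the thresholds mirror the text (ADV floor of ~100K shares, modest participation) and the session/news checks are simplified to flags:

```python
def impact_model_applies(adv_shares, order_shares, regular_session=True,
                         near_news_event=False, min_adv=100_000,
                         max_participation=0.05):
    """True when the square-root model is defensible for this trade:
    liquid name, regular session, no news window, modest participation."""
    if adv_shares < min_adv:      # thin names: use per-name empirics instead
        return False
    if not regular_session:       # pre/post-market liquidity is a different regime
        return False
    if near_news_event:           # information-driven moves, not impact
        return False
    return order_shares / adv_shares <= max_participation

print(impact_model_applies(500_000, 20_000))    # → True  (4% of ADV)
print(impact_model_applies(500_000, 100_000))   # → False (20% of ADV)
```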
The one-line backtest correction
For the regular-session large-cap case where the square-root model is defensible, the minimum honest correction to a backtest is one line added to the fill function:
```python
def simulate_fill(side, midprice, order_shares, adv_shares, daily_vol_pct, spread_bps):
    slippage = impact_bps(order_shares, adv_shares, daily_vol_pct, spread_bps)
    return midprice * (1 + side * slippage / 10_000)
```
side is +1 for buys, -1 for sells. The slippage is applied directionally against the trade — buys fill higher than midprice, sells fill lower. The bps-to-return conversion is linear for small slippages; accurate enough for retail-sized trades.
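Applied to the running example, the directional correction looks like this; the slippage is fixed at 30 bps here to keep the sketch self-contained:

```python
def apply_slippage(side, midprice, slippage_bps):
    """side = +1 for a buy, -1 for a sell; fills move against the trade."""
    return midprice * (1 + side * slippage_bps / 10_000)

mid = 50.00
print(round(apply_slippage(+1, mid, 30.0), 2))  # → 50.15 (buy fills above mid)
print(round(apply_slippage(-1, mid, 30.0), 2))  # → 49.85 (sell fills below mid)
```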
This single change converts a frictionless backtest into one whose implementation-shortfall estimate is calibrated against two decades of execution research. It is not a perfect execution simulator — the full treatment includes queue-position modeling, order-type optimization (TWAP vs VWAP vs implementation-shortfall vs PoV), and microstructural noise. Those live in the Execution Simulator tool, which replays actual order-book history against simulated order flow to validate the impact estimate per strategy. The Order Book Replay tool provides the underlying L2 data replays, and Risk-Adjusted Returns recomputes Sharpe, Sortino, and Calmar after the impact correction lands — expect all three to drop non-trivially.
Temporary vs permanent impact in practice
The two components have different implications for strategy design:
- Permanent impact is a drag on the strategy's compounded edge. Every trade pays it; it cannot be avoided by changing execution tactics — only by reducing trade size, reducing trade frequency, or picking more liquid names. A strategy with 30 bps permanent impact per round-trip needs 60+ bps of signal per round-trip just to break even before taxes, borrow, and fees.
- Temporary impact is a function of how fast the trade is executed. A meta-order executed over four hours has lower temporary impact per share than the same meta-order executed over 30 minutes. This is the dimension the execution algorithm controls.
The Almgren-Chriss framework explicitly trades these against each other: slow execution reduces temporary impact but exposes the order to more price-variance risk (the price could move for unrelated reasons during the long execution). The framework's "efficient frontier of execution strategies" plots expected impact cost against variance of that cost.
For retail-sized strategies, the variance term is usually second-order — the price-drift risk over a 30-minute vs 4-hour execution window on a 50K-share order is rarely the binding constraint. For institutional sizes (meta-orders that consume >10% of ADV over a day), the variance term dominates and the full Almgren-Chriss optimal-execution calculation becomes relevant.
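The impact-versus-risk trade can be eyeballed numerically. A sketch under the linear-in-rate temporary-impact assumption: stretching the horizon shrinks the temporary term as 1/T while the price-variance term grows as sqrt(T):

```python
import math

def horizon_tradeoff(order_shares, adv_shares, sigma_daily_bps,
                     eta_linear=0.1, session_sec=6.5 * 3600):
    """For several execution horizons, compare temporary impact
    (falls as 1/T) against price-move risk over the window (grows
    as sqrt(T)), under the linear-in-rate temporary-impact model."""
    out = {}
    for minutes in (30, 60, 240):
        T = minutes * 60
        # The order's share of traded flow while it is active.
        flow_participation = (order_shares / adv_shares) * (session_sec / T)
        temp_bps = eta_linear * sigma_daily_bps * flow_participation
        risk_bps = sigma_daily_bps * math.sqrt(T / session_sec)
        out[minutes] = (round(temp_bps, 1), round(risk_bps, 1))
    return out

# 50K shares, 500K ADV, 150 bps daily vol: stretching from 30 min to
# 4 h cuts temporary impact ~8x while roughly tripling the risk term.
print(horizon_tradeoff(50_000, 500_000, 150.0))
```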
How institutional desks calibrate
Retail strategies often import a coefficient like η_sqrt = 1.0 from published research and move on. Institutional desks calibrate per-name using their own trade history — they have millions of trades per symbol per year to regress against. This has two implications for retail:
- Published coefficients are averages. They work on average. For a specific liquid mega-cap they may over-estimate impact; for a specific thin small-cap they may under-estimate by large factors. A retail strategy that trades a small universe should eventually calibrate per-name using its own paper-trading record.
- Published coefficients can be stale. The market microstructure changes (2024–2026 saw continued growth in retail volume share, changes in payment-for-order-flow practices, and continued evolution of ETF liquidity dynamics). Old coefficients may over-estimate impact — liquidity has generally improved — which is the less dangerous direction of error but still a miscalibration.
An honest retail strategy uses a published coefficient as a starting point, accepts a wide confidence band around the estimate (±40% is defensible), and refines the coefficient from paper-trading data before committing real capital at scale.
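The per-name refinement is a one-parameter regression of realized slippage (net of half-spread) on σ * sqrt(Q/ADV). A least-squares-through-the-origin sketch; the fills list is a hypothetical paper-trading record for illustration, not real data:

```python
import math

def fit_eta_sqrt(fills):
    """Least-squares through the origin for the square-root law:
    slippage_bps ≈ eta * sigma_bps * sqrt(Q / ADV).
    fills: iterable of (slippage_bps_net_of_half_spread, sigma_bps, q, adv)."""
    num = den = 0.0
    for slip, sigma_bps, q, adv in fills:
        x = sigma_bps * math.sqrt(q / adv)  # the square-root-law predictor
        num += x * slip
        den += x * x
    return num / den

# Hypothetical paper-trading record (illustrative numbers only).
fills = [(6.0, 150.0, 10_000, 500_000),
         (4.5, 150.0, 5_000, 500_000),
         (9.0, 150.0, 25_000, 500_000)]
print(round(fit_eta_sqrt(fills), 2))
```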
Impact vs commission
Retail strategies often fixate on commission costs (which are zero or near-zero in 2026 for US equities) and ignore impact costs (which dwarf commissions). Order of magnitude:
- Typical US equity commission: $0.
- Typical US equity half-spread on a mid-cap: 2–5 bps.
- Typical US equity impact on a 5% ADV order: 10–30 bps.
- Typical US equity impact on a 20% ADV order: 40–80 bps.
The commission savings from the 2019 industry shift to zero-commission trading are real but small compared to the impact costs a meaningful strategy incurs. A strategy optimized for "minimizing commission" while ignoring impact is optimizing the wrong variable by roughly a factor of ten.
Sanity checks before deploying
Before trusting an impact-corrected backtest, three sanity checks:
- Does the estimated impact exceed the estimated edge? If yes, the strategy has no live edge — back to the drawing board. This happens to more candidate strategies than intuition suggests.
- Is the impact estimate stable across reasonable coefficient ranges? If changing η_sqrt from 0.8 to 1.2 flips profitability, the strategy is over-fit to the impact parameterization. Real edge should be robust to ±20% impact-coefficient error.
- Does the strategy trade in regimes where the square-root law breaks? Small-cap, news-window, or auction-heavy strategies need a regime-specific impact model or an outright exclusion filter.
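The coefficient-stability check automates naturally as a sweep. A sketch assuming the strategy's gross edge per round-trip and the base impact estimate are already known; the numbers in the usage line are illustrative:

```python
def robust_to_impact(edge_bps_per_roundtrip, base_impact_bps,
                     coeff_scalings=(0.8, 1.0, 1.2)):
    """Net edge at each impact-coefficient scaling; the strategy passes
    only if the net stays positive across the whole range."""
    nets = {k: edge_bps_per_roundtrip - k * base_impact_bps
            for k in coeff_scalings}
    return nets, all(v > 0 for v in nets.values())

# 70 bps of gross edge against a 60 bps impact estimate: positive at the
# central coefficient but negative at 1.2x, so the edge is not robust.
print(robust_to_impact(70.0, 60.0))
```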
A candidate strategy that survives all three checks is still not guaranteed to be profitable live — execution modeling is the floor of honesty, not a substitute for paper trading and a graduated production rollout. See From Backtest to Paper to Live: The Three-Stage Deployment Playbook for the staged rollout.
What the model doesn't capture
A few costs worth naming that the Almgren-Chriss + square-root model does not include:
- Adverse selection. When a passive order fills, it tends to fill on the wrong side — the market maker was right to be sitting there. Live passive strategies see fills cluster at times when the market is about to move against them; backtests that assume passive fills at midprice miss this entirely.
- Opportunity cost. Orders that don't fill at all. A backtest that simulates 100% fill rates on limit orders is ignoring the fraction of orders that time out unfilled.
- Clearing and settlement costs. Regulatory fees, SEC Section 31 fees, clearing fees per trade. Small per-trade but cumulative.
- Borrow costs for short positions. Hard-to-borrow names can cost 10–50% annualized in rebate costs. A strategy that shorts small-cap names will look dramatically more profitable in a backtest that ignores borrow.
Each of these can be modeled separately. The square-root impact model is the single largest correction; the others are meaningful but smaller.
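Of the residual items, borrow cost is the most mechanical to add: it pro-rates the annualized rate over the holding period. A sketch; the 25% rate in the example is an illustrative hard-to-borrow figure:

```python
def borrow_cost_bps(annualized_rate_pct, holding_days, trading_days=252):
    """Borrow fee in bps of notional for a short held for holding_days."""
    return annualized_rate_pct * 100 * holding_days / trading_days

# A 5-day short in a 25%-annualized hard-to-borrow name costs ~50 bps
# of notional, on top of spread and impact.
print(round(borrow_cost_bps(25.0, 5)))  # → 50
```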
Connects to
- Execution Simulator — replays order-book history against simulated order flow; produces per-trade impact estimates with empirical dispersion bands.
- Synthetic Market Data for Backtest Scaffolding — when your impact model needs synthetic data to stress-test; how to avoid baking in assumptions the real market violates.
- The Sharpe Ratio Trap — once impact corrections drop the raw Sharpe, which companion metrics actually matter.
- From Backtest to Paper to Live — the staged promotion process that catches remaining execution-model errors before they reach real capital.
References
- Almgren, R., & Chriss, N. (2000). "Optimal Execution of Portfolio Transactions." Journal of Risk 3(2).
- Almgren, R., Thum, C., Hauptmann, E., & Li, H. (2005). "Direct Estimation of Equity Market Impact." Risk 18(7).
- Kissell, R., & Malamut, R. (2006). "Algorithmic Decision-Making Framework." Journal of Trading 1(1).
- Gatheral, J. (2010). "No-Dynamic-Arbitrage and Market Impact." Quantitative Finance 10(7).
- Bouchaud, J-P., Farmer, J. D., & Lillo, F. (2009). "How Markets Slowly Digest Changes in Supply and Demand." Handbook of Financial Markets: Dynamics and Evolution.
- Tóth, B., Lempérière, Y., Deremble, C., de Lataillade, J., Kockelkoren, J., & Bouchaud, J-P. (2011). "Anomalous Price Impact and the Critical Nature of Liquidity in Financial Markets." Physical Review X 1(2).