A single 10bp-depth order-book snapshot at mid 100.02 with 200-share bid top and 180-share ask top returns a spread of 1.9996 bps and top-of-book imbalance of 0.0526 from the Order Book Replay tool. The execution-simulator counterpart prices a hypothetical order against that snapshot using a stylised impact model — useful for fast iteration, but the replay against real snapshots is the only honest test of execution cost. For retail strategies the choice is partly about data availability; the methodology is identical.
TL;DR
- Order book snapshot at mid 100.02: spread 1.9996 bps, top imbalance 0.0526, book imbalance 0.057.
- Total depth: 2,500 shares bid / 2,230 shares ask within 10 bps.
- Simulator: parametric model with stylised market impact, fast.
- Replay: actual book snapshots, true slippage estimates, slower and requires data.
- For backtests of liquidity-sensitive strategies, replay is the honest tool. Simulator is the iteration tool.
The two tools
The Order Book Replay tool returns the depth and imbalance of historical book snapshots. The Execution Simulator prices a hypothetical order using an analytical impact model. Both answer "what would my order cost," but with different assumptions.
Order Book Replay
Input: a sequence of book snapshots (timestamp, bid levels, ask levels). Output for each snapshot:
| Field | Sample value |
|---|---|
| best bid / best ask | 100.01 / 100.03 |
| spread (bps) | 1.9996 |
| top imbalance | 0.0526 |
| book imbalance (10bp depth) | 0.057 |
| bid depth (10bp) | 2,500 |
| ask depth (10bp) | 2,230 |
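The sample values in the table can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not the tool's actual implementation: the `Level` class and `snapshot_metrics` function are hypothetical names, the imbalance convention (bid − ask) / (bid + ask) is an assumption chosen because it matches the numbers shown, and depth is assumed to mean all displayed size priced within `depth_bps` of mid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    price: float
    size: int


def snapshot_metrics(bids, asks, depth_bps=10.0):
    """Summarise one snapshot. bids and asks are lists of Level, best price first."""
    best_bid, best_ask = bids[0], asks[0]
    mid = (best_bid.price + best_ask.price) / 2
    band = mid * depth_bps / 1e4  # price distance covered by depth_bps

    # Assumed depth definition: displayed size priced within the band around mid.
    bid_depth = sum(l.size for l in bids if l.price >= mid - band)
    ask_depth = sum(l.size for l in asks if l.price <= mid + band)

    return {
        "mid": mid,
        "spread_bps": (best_ask.price - best_bid.price) / mid * 1e4,
        "top_imbalance": (best_bid.size - best_ask.size)
        / (best_bid.size + best_ask.size),
        "bid_depth": bid_depth,
        "ask_depth": ask_depth,
        "book_imbalance": (bid_depth - ask_depth) / (bid_depth + ask_depth),
    }


# Levels taken from the snapshot listed under "Verified engine output".
bids = [Level(100.01, 200), Level(100.00, 350), Level(99.99, 500),
        Level(99.98, 650), Level(99.97, 800)]
asks = [Level(100.03, 180), Level(100.04, 280), Level(100.05, 420),
        Level(100.06, 600), Level(100.07, 750)]

metrics = snapshot_metrics(bids, asks)
print(round(metrics["spread_bps"], 4), round(metrics["top_imbalance"], 4))
```

Run against the snapshot in the verified engine output, this should agree with the table to the displayed precision.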
For a market order of size $N$, slippage is the size-weighted average fill price minus mid for a buy (mid minus average fill for a sell), computed directly from the visible book. No model assumptions; the book tells the truth, to the extent that the liquidity is displayed and the order does not need depth beyond the 10bp band.
Execution Simulator
Input: an order spec (side, size, urgency) and a model of expected market state. Output: simulated fills with implementation-shortfall estimates. The model uses a stylised impact curve, typically the square-root law $\Delta P \propto \sigma \sqrt{Q/V}$ [1], where $\sigma$ is daily volatility, $Q$ the order size, and $V$ the average daily volume.
Pros: fast, parameter-driven, works without book data. Cons: model assumptions can break for illiquid names, around the open, or on regime-change days.
When each is right
Replay wins when
- The strategy's slippage estimate has to be defensible against historical reality.
- The book data is available (Databento, Polygon Advanced, or a broker's L2 feed).
- The order sizes are large enough relative to typical book depth that model error matters.
Simulator wins when
- The strategy is in early development and the cost of slippage is bounded by the model.
- Book data is unavailable or expensive.
- The order sizes are small enough (< 5% of typical 10bp depth) that the square-root model is locally accurate.
For most retail strategies on liquid US large-caps, the simulator is sufficient. The book at 10bp depth typically holds thousands of shares; retail orders of 100-500 shares trade through 1-2 ticks of the book and the square-root law's local linearisation is accurate enough.
For strategies on mid-caps, small-caps, or any name with thin book depth, the simulator can under-estimate slippage by 20-50% when the book is thin at the moment of the order. Replay against actual snapshots is the honest tool.
A worked example
Strategy: market-buy 500 shares of a mid-cap at mid 100.02.
Simulator estimate
Square-root law with a unit coefficient, $\sigma = 0.5\%$/day, $V = 100{,}000$ shares/day average, $Q = 500$:
$\Delta P \approx 0.005 \times 100.02 \times \sqrt{500/100{,}000} = 0.005 \times 100.02 \times 0.0707 \approx 0.0354$
Simulated slippage: 3.5 cents, which is about 3.5 bps at a price near 100. Expected fill: 100.055.
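The simulator arithmetic can be written as a small function. This is a sketch under stated assumptions: `sqrt_impact` is a hypothetical name, and `coeff` is an assumed dimensionless constant that is set to 1.0 only because that reproduces the figures in this example; a calibrated model would normally fit it.

```python
import math


def sqrt_impact(price, sigma_daily, order_size, adv, coeff=1.0):
    """Expected price impact per share under a square-root law.

    coeff is an assumed constant; 1.0 matches the worked example's arithmetic.
    """
    if order_size < 0 or adv <= 0:
        raise ValueError("order_size must be non-negative and adv positive")
    return coeff * sigma_daily * price * math.sqrt(order_size / adv)


impact = sqrt_impact(price=100.02, sigma_daily=0.005, order_size=500, adv=100_000)
impact_bps = impact / 100.02 * 1e4
print(f"{impact:.4f} per share, {impact_bps:.2f} bps")
```

Because impact scales with the square root of size, quadrupling the order to 2,000 shares only doubles the estimate, which is one reason the model can look optimistic when the displayed book is thin.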
Replay estimate
Against the snapshot's ask side as listed in the verified engine output (180 shares at the ask top of 100.03, then 280 shares at 100.04 and 420 shares at 100.05):
- 180 shares fill at 100.03.
- 280 shares fill at 100.04.
- The remaining 40 shares fill at 100.05.
- Weighted average fill: (180 × 100.03 + 280 × 100.04 + 40 × 100.05) / 500 = 100.0372.
Replay slippage: 1.72 cents = 1.72 bps. Replay says the simulator over-stated slippage by roughly 2x (3.5 bps vs 1.72 bps).
The difference is informative: the actual book happened to be well-loaded at the relevant depth, so a 500-share order clears within the first three ask levels. On a different snapshot (thinner top, fewer shares in the next band), the replay would have shown 5-10 bps. Replay catches the variability that the parametric simulator cannot.
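The replay side of the comparison is a walk down the displayed ask levels. The sketch below is illustrative only: `walk_asks` is a hypothetical helper, and the levels are the ask rows from the snapshot in the verified engine output (180 at 100.03, 280 at 100.04, 420 at 100.05, and so on).

```python
def walk_asks(asks, size):
    """Fill a market buy against displayed asks, best price first.

    asks is a list of (price, displayed_size) tuples.
    Returns (average_fill_price, shares_filled).
    """
    remaining = size
    cost = 0.0
    for price, displayed in asks:
        take = min(remaining, displayed)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    filled = size - remaining
    if filled == 0:
        raise ValueError("nothing filled: empty book or zero size")
    return cost / filled, filled


asks = [(100.03, 180), (100.04, 280), (100.05, 420), (100.06, 600), (100.07, 750)]
mid = 100.02

avg_fill, filled = walk_asks(asks, 500)
slippage_bps = (avg_fill - mid) / mid * 1e4
print(f"avg fill {avg_fill:.4f}, slippage {slippage_bps:.2f} bps")
```

Re-running the same function on thinner snapshots is what exposes the snapshot-to-snapshot variability that a single parametric estimate hides.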
What replay misses
Order Book Replay against historical snapshots is the most honest available test, but it is not perfect:
- Visible vs hidden liquidity. Real fills come from displayed and hidden liquidity. The replay sees only the displayed book.
- Adverse selection. Filling against an old snapshot does not reflect how the book shifts around large orders: informed traders adjust faster than the snapshot rate [2].
- Market impact on the trader's own subsequent orders. A single replay does not show how the trader's first fill affects subsequent fills.
For a more honest second-order estimate, run the replay against a sequence of snapshots and re-fill subsequent orders against post-fill book states. This catches own-impact but still misses adverse selection.
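One way to approximate post-fill book states is to remember what earlier child orders consumed and subtract it from later snapshots. The sketch below is a simplification under loud assumptions: consumed liquidity is never replenished (a pessimistic choice), prices are matched after rounding to cents, and `replay_with_own_impact` is a hypothetical name rather than part of any tool described here.

```python
def replay_with_own_impact(snapshots, child_sizes):
    """Fill successive market buys, carrying our own consumption forward.

    snapshots: list of ask ladders, each a list of (price, displayed_size),
               best price first, one ladder per child order.
    child_sizes: shares to buy at each snapshot.
    Returns the average fill price per child (None if nothing filled).
    """
    consumed = {}  # price (rounded to cents) -> shares we already took
    averages = []
    for asks, size in zip(snapshots, child_sizes):
        remaining, cost = size, 0.0
        for price, displayed in asks:
            key = round(price, 2)
            # Assumption: liquidity we removed earlier has not come back.
            available = max(0, displayed - consumed.get(key, 0))
            take = min(remaining, available)
            if take:
                cost += take * price
                consumed[key] = consumed.get(key, 0) + take
                remaining -= take
            if remaining == 0:
                break
        filled = size - remaining
        averages.append(cost / filled if filled else None)
    return averages


asks = [(100.03, 180), (100.04, 280), (100.05, 420), (100.06, 600), (100.07, 750)]
# Same ladder reused for both children purely for illustration.
first, second = replay_with_own_impact([asks, asks], [250, 250])
print(f"{first:.4f} {second:.4f}")
```

With the ladder unchanged between the two children, the second 250 shares fill at a worse average than the first, which is the own-impact effect a single-snapshot replay cannot show.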
Data requirements
Replay requires book snapshots at meaningful frequency:
- For end-of-day fills: one snapshot at close suffices.
- For intraday fills: book snapshots every 5-60 seconds depending on activity.
- For tick-precise replay: full L2 / depth-of-book updates.
Databento, Polygon Advanced, and IBKR's TWS API provide depth-of-book data at retail-accessible prices. Alpaca's SIP feed provides top-of-book + some depth but is shallower than the depth-specialist providers. The Data Vendor TCO tool ranks vendors on depth coverage.
The simulator's role in iteration
For early-stage strategy development, the simulator is the right tool because the iteration loop is fast:
- Idea → backtest with simulator's slippage estimate → check Sharpe.
- If interesting, refine the strategy → backtest again.
- After the strategy is candidate-grade, swap simulator for replay → verify slippage estimate.
- If replay confirms simulator within 20%, deploy. If replay shows materially worse slippage, re-size the strategy.
The simulator is the cheap test; the replay is the expensive (data-cost-wise) confirmation.
Production usage
For a deployed strategy, the right pattern is to log every order's actual fill (from the broker), compare it against the simulator's expected fill at submission time, and update the simulator's parameters monthly to track realised slippage. The simulator then becomes calibrated to the strategy's specific impact profile rather than relying on a universal square-root law [3].
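A simple form of that monthly update is to refit the impact coefficient from the fill log. The sketch below assumes the square-root form used earlier and fits a single coefficient by least squares through the origin; the record layout, the `fit_impact_coeff` name, and the synthetic log are all hypothetical, not a real broker schema or real fills.

```python
import math


def fit_impact_coeff(fills):
    """Least-squares fit (through the origin) of coeff in
    slippage = coeff * sigma_daily * price * sqrt(size / adv).

    fills: iterable of dicts with keys price, sigma_daily, size, adv,
    and realised_slippage (currency units per share).
    """
    num = den = 0.0
    for f in fills:
        x = f["sigma_daily"] * f["price"] * math.sqrt(f["size"] / f["adv"])
        num += x * f["realised_slippage"]
        den += x * x
    if den == 0.0:
        raise ValueError("no usable fills to calibrate on")
    return num / den


# Synthetic log generated with a known coefficient of 0.8, to check the fit.
synthetic_log = []
for size in (100, 250, 500, 1000):
    x = 0.005 * 100.02 * math.sqrt(size / 100_000)
    synthetic_log.append({"price": 100.02, "sigma_daily": 0.005, "size": size,
                          "adv": 100_000, "realised_slippage": 0.8 * x})

coeff = fit_impact_coeff(synthetic_log)
print(round(coeff, 6))
```

Real fill logs are noisy, so in practice a fit like this would likely need more data than one month of small orders and some outlier handling, both of which are left out here.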
Failure modes
- Skipping the replay confirmation. A strategy whose Sharpe depends on simulator-modelled slippage may evaporate at replay. Confirm before deploying.
- Replay on stale book data. A book snapshot taken in a calmer volatility regime under-estimates impact in stressed regimes, where market liquidity itself thins out [4].
- Treating simulator output as definitive. It is a parametric model. Real slippage has shape the model cannot capture.
- Ignoring the time-of-day pattern. Slippage at 09:30 open is 2-5x the typical mid-day number. Both tools should be calibrated per time bucket.
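Per-bucket calibration can start from something as simple as grouping realised slippage by time of day. The following is a sketch only: the three-bucket scheme, its boundaries, the function names, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import time
from statistics import median


def bucket_of(t, open_end=time(10, 0), close_start=time(15, 30)):
    """Assign a fill time to an assumed open / midday / close scheme."""
    if t < open_end:
        return "open"
    if t >= close_start:
        return "close"
    return "midday"


def median_slippage_by_bucket(fills):
    """fills: iterable of (datetime.time, slippage_bps). Returns {bucket: median}."""
    grouped = defaultdict(list)
    for t, bps in fills:
        grouped[bucket_of(t)].append(bps)
    return {bucket: median(values) for bucket, values in grouped.items()}


# Hypothetical fills, not real data.
sample_fills = [(time(9, 31), 6.0), (time(9, 45), 4.0), (time(12, 0), 2.0),
                (time(13, 0), 1.5), (time(15, 45), 3.0)]
by_bucket = median_slippage_by_bucket(sample_fills)
print(by_bucket)
```

The per-bucket medians can then feed separate simulator parameters or separate replay windows for each part of the session.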
Connects to
- Execution Simulation: Slippage Impact — extended treatment.
- Stat-Arb Capacity: Half-Life Sets the Ceiling — slippage as the capacity constraint.
- Market Data APIs Compared 2026 — vendors that ship depth-of-book.
- Rate-Limited, Resumable Market Data Ingestion — getting the book history in the first place.
- Order Book Replay — re-run on your snapshots.
- Order Book Replay methodology — full input/output specification.
References
1. Almgren, R., et al. (2005). "Direct Estimation of Equity Market Impact." Risk 18(7). risk.net
2. Bouchaud, J.-P., Bonart, J., Donier, J., & Gould, M. (2018). Trades, Quotes and Prices: Financial Markets Under the Microscope. Cambridge University Press. cambridge.org
3. Lopez de Prado, M. (2018). Advances in Financial Machine Learning. Wiley. Chapter on execution and microstructure.
4. BIS (2020). "Market liquidity in fixed income markets." BIS Quarterly Review. bis.org
Verified engine output
| Input field | Value |
|---|---|
| price | 100.02 |
| size | 5000 |
| depth_bps | 10 |
| snapshots › row 1 › t | 0 |
| snapshots › row 1 › bids › row 1 › price | 100.01 |
| snapshots › row 1 › bids › row 1 › size | 200 |
| snapshots › row 1 › bids › row 2 › price | 100 |
| snapshots › row 1 › bids › row 2 › size | 350 |
| snapshots › row 1 › bids › row 3 › price | 99.99 |
| snapshots › row 1 › bids › row 3 › size | 500 |
| snapshots › row 1 › bids › row 4 › price | 99.98 |
| snapshots › row 1 › bids › row 4 › size | 650 |
| snapshots › row 1 › bids › row 5 › price | 99.97 |
| snapshots › row 1 › bids › row 5 › size | 800 |
| snapshots › row 1 › asks › row 1 › price | 100.03 |
| snapshots › row 1 › asks › row 1 › size | 180 |
| snapshots › row 1 › asks › row 2 › price | 100.04 |
| snapshots › row 1 › asks › row 2 › size | 280 |
| snapshots › row 1 › asks › row 3 › price | 100.05 |
| snapshots › row 1 › asks › row 3 › size | 420 |
| snapshots › row 1 › asks › row 4 › price | 100.06 |
| snapshots › row 1 › asks › row 4 › size | 600 |
| snapshots › row 1 › asks › row 5 › price | 100.07 |
| snapshots › row 1 › asks › row 5 › size | 750 |

| Output field | Value |
|---|---|
| count | 1 |
| depth bps | 10 |
| snapshots › row 1 › index | 0 |
| snapshots › row 1 › t | 0 |
| snapshots › row 1 › best bid | 100.01 |
| snapshots › row 1 › best ask | 100.03 |
| snapshots › row 1 › mid | 100.02000000000001 |
| snapshots › row 1 › spread | 0.01999999999999602 |
| snapshots › row 1 › spread bps | 1.9996000799836051 |
| snapshots › row 1 › bid top size | 200 |
| snapshots › row 1 › ask top size | 180 |
| snapshots › row 1 › top imbalance | 0.05263157894736842 |
| snapshots › row 1 › bid depth | 2500 |
| snapshots › row 1 › ask depth | 2230 |
| snapshots › row 1 › book imbalance | 0.05708245243128964 |
| average spread bps | 1.9996000799836051 |
| average imbalance | 0.05263157894736842 |
Computed live at build time.
Frequently asked questions
- Can I run the simulator without buying book data?
- Yes. The simulator uses parametric estimates derived from publicly available volume and volatility data. For early iteration on liquid US large-caps, that suffices.
- When is replay strictly required?
- For any strategy whose deployed slippage is more than 10-20% of expected per-trade alpha. Below that threshold the simulator's model error is small in absolute terms. Above it, the simulator's model error can flip a strategy's expected P&L.
- What about TWAP and VWAP execution algorithms?
- Both tools can model algorithmic execution by running the order in slices against a sequence of book snapshots. The replay version is closer to honest because the snapshots reflect realised market shape over the execution window.