Methodology · Tool · Last updated 2026-05-08

How the Walk-Forward Validation Visualizer works

How the Walk-Forward Validation Visualizer splits a returns series and computes per-window Sharpe.

Inputs

  • A CSV with at least a strategy returns column. Optional date and benchmark columns are recognised.
  • Number of windows N (2–20).
  • Train fraction per window (0.5–0.9).
  • Mode: rolling (a fixed-length in-sample window slides forward) or anchored (the in-sample window always starts at t=0 and grows with each step).

Window construction

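For N windows on n observations, where train_pct is the train fraction and windows are indexed k = 0, …, N − 1. Each window has an in-sample (IS) range and an out-of-sample (OOS) range, written as half-open row-index intervals: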

window_span  = floor(n / N)
train_span   = floor(window_span · train_pct)
test_span    = window_span − train_span

Rolling window k:
  IS  = [k · test_span,  k · test_span + train_span)
  OOS = [k · test_span + train_span,  k · test_span + train_span + test_span)

Anchored window k:
  IS  = [0,  train_span + k · test_span)
  OOS = [train_span + k · test_span,  train_span + (k+1) · test_span)
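
The index arithmetic above can be sketched in Python as follows. This is an illustrative sketch, not the tool's actual source: the function name build_windows, the zero-based index k, and the guard on minimum span length are assumptions made for the example.

```python
import math
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) row indices


def build_windows(n: int, n_windows: int, train_pct: float,
                  mode: str = "rolling") -> List[Tuple[Range, Range]]:
    """Return (IS, OOS) index ranges for each window k = 0..n_windows-1."""
    window_span = n // n_windows                       # floor(n / N)
    train_span = math.floor(window_span * train_pct)   # floor(window_span * train_pct)
    test_span = window_span - train_span

    # Assumption for this sketch: a Sharpe needs at least 2 rows per range.
    if train_span < 2 or test_span < 2:
        raise ValueError("too few observations per window")

    windows = []
    for k in range(n_windows):
        if mode == "rolling":
            is_start = k * test_span
            is_end = is_start + train_span
        elif mode == "anchored":
            is_start = 0
            is_end = train_span + k * test_span
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        oos = (is_end, is_end + test_span)  # OOS starts where IS ends
        windows.append(((is_start, is_end), oos))
    return windows


rolling = build_windows(1000, 5, 0.75, mode="rolling")
anchored = build_windows(1000, 5, 0.75, mode="anchored")
```

With n = 1000, N = 5 and train_pct = 0.75 this gives window_span = 200, train_span = 150 and test_span = 50, so the last OOS range ends at row 400 in both modes. Under the formulas as written, the windows span train_span + N · test_span rows in total, which can be well short of n.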

Per-window Sharpe

Annualized with a √252 factor, which assumes daily returns (252 trading days per year). No risk-free rate is subtracted:

SR_window = mean(returns) / stdev(returns) · √252
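
A minimal sketch of this calculation using only the standard library. The function name annualized_sharpe is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption; the page does not say which variant the tool uses.

```python
import math
import statistics
from typing import Sequence


def annualized_sharpe(returns: Sequence[float],
                      periods_per_year: int = 252) -> float:
    """mean / stdev * sqrt(periods_per_year), with no risk-free adjustment."""
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    sd = statistics.stdev(returns)  # assumption: sample stdev (ddof = 1)
    if sd == 0:
        return float("nan")  # Sharpe is undefined for a constant series
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


sample_returns = [0.01, -0.01, 0.02, 0.0]
sr = annualized_sharpe(sample_returns)
```

The same function would be applied once to the IS slice and once to the OOS slice of each window.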

IS → OOS drop

The headline ("hero") number is the proportional drop in mean Sharpe from IS to OOS:

drop = (mean(IS_Sharpe) − mean(OOS_Sharpe)) / |mean(IS_Sharpe)|

A drop above 50% is a strong overfitting signal. Drops in the 20–50% range are typical for honest strategies once costs are added; sub-20% drops are uncommon and worth verifying for look-ahead leakage.
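
The drop and the bands described above can be sketched as follows. The function names and label strings are hypothetical, and how the tool treats the exact 20% and 50% boundaries or a zero mean IS Sharpe is not stated on this page, so those choices are assumptions.

```python
from statistics import mean
from typing import Sequence


def sharpe_drop(is_sharpes: Sequence[float],
                oos_sharpes: Sequence[float]) -> float:
    """(mean(IS) - mean(OOS)) / |mean(IS)|, as a fraction (0.5 == 50%)."""
    is_mean = mean(is_sharpes)
    if is_mean == 0:
        # Assumption: the ratio is treated as undefined in this case.
        raise ValueError("mean in-sample Sharpe is zero; drop is undefined")
    return (is_mean - mean(oos_sharpes)) / abs(is_mean)


def classify_drop(drop: float) -> str:
    """Map a drop to the bands in the text (boundary handling is assumed)."""
    if drop > 0.5:
        return "strong overfitting signal"
    if drop >= 0.2:
        return "typical range"
    return "uncommon; check for look-ahead leakage"


drop_a = sharpe_drop([2.0, 1.0], [0.5, 0.25])   # IS mean 1.5, OOS mean 0.375
drop_b = sharpe_drop([2.0, 1.0], [1.5, 1.25])   # IS mean 1.5, OOS mean 1.375
```

Dividing by the absolute value keeps the sign of the drop meaningful when the mean IS Sharpe is negative: a positive drop always means OOS performed worse than IS.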

References

  • Pardo, R. (2008). The Evaluation and Optimization of Trading Strategies, 2nd ed., Wiley. ISBN: 978-0-470-12801-5.
  • Bailey, D. H., Borwein, J., López de Prado, M., Zhu, Q. J. (2017). "The probability of backtest overfitting." Journal of Computational Finance 20(4): 39–69. DOI: 10.21314/JCF.2016.322.

Limitations

  • The tool tests realised Sharpe stability. It does not refit a model — you supply pre-computed strategy returns.
  • Overlapping returns (intraday holding periods, leveraged ETFs) violate i.i.d. assumptions inside Sharpe; consider Newey-West-adjusted variants.
  • Anchored mode reuses early in-sample rows in every window — a single bad early sample biases all in-sample Sharpes.

Planning estimates only — not financial, tax, or investment advice.