
Walk-Forward Validation Visualizer

Paste a strategy returns CSV; see in-sample vs out-of-sample Sharpe per window plus the IS→OOS drop. Browser-only deterministic math.

Inputs: Paste + configure
Runtime: 1–15 s
Privacy: Client-side · no upload
API key: Not required
Methodology: Open →

Education · Not investment advice. BaFin/EU framework. Past performance does not indicate future results.

Inputs

Number of windows: 8
Train fraction per window: 70%
Window mode: rolling

OOS Sharpe drop vs IS: -526%

IS Sharpe -0.20 · OOS Sharpe 0.84 · 8 windows (rolling).
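If the drop is defined as (IS - OOS) / |IS|, an assumption, but one consistent with the displayed figures, the unrounded window averages (IS ≈ -0.196, OOS ≈ 0.836) give (-0.196 - 0.836) / 0.196 ≈ -5.26, i.e. -526%. The negative sign means OOS exceeded IS in this run.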

Per-window Sharpe

# | IS span (obs) | OOS span (obs) | IS Sharpe | OOS Sharpe | Δ (OOS - IS)
1 | 44 | 19 | -2.48 | 1.70 | +4.18
2 | 44 | 19 | -0.55 | -4.28 | -3.73
3 | 44 | 19 | -0.81 | 3.50 | +4.31
4 | 44 | 19 | 0.24 | -2.09 | -2.33
5 | 44 | 19 | 0.51 | -3.91 | -4.42
6 | 44 | 19 | -2.21 | 4.02 | +6.23
7 | 44 | 19 | 0.87 | -0.65 | -1.52
8 | 44 | 19 | 2.86 | 8.40 | +5.53

Spans are observation counts per window (44 train / 19 test ≈ the 70% train fraction).
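For reference, a minimal sketch of how each table cell could be computed, assuming annualized Sharpe on daily returns with a zero risk-free rate and a window that advances by the test length (the tool's exact conventions aren't stated; the return series here is synthetic):

```python
import numpy as np

def sharpe(r: np.ndarray, periods_per_year: int = 252) -> float:
    # Annualized Sharpe with risk-free rate 0 -- an assumed convention,
    # since the tool does not state its annualization.
    return r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year)

returns = np.random.default_rng(1).normal(0.0004, 0.01, 196)  # stand-in daily returns
train, test = 44, 19                     # spans from the table
for i in range(8):                       # eight rolling windows, step = test length
    s = i * test
    is_r = returns[s:s + train]
    oos_r = returns[s + train:s + train + test]
    print(i + 1, round(sharpe(is_r), 2), round(sharpe(oos_r), 2))
```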

Reading the result

A drop above 50% is a red flag: the strategy looks overfit to its in-sample windows. A drop of 20–50% is typical for honest strategies once costs are added. Below 20% is rare and worth verifying. A negative drop, as in this run, means OOS beat IS; when the IS Sharpe is near zero, a ratio-based drop figure becomes unstable, so read the raw IS and OOS Sharpes directly. See the methodology.

How to use

Step-by-step

Full calculator guide →
  1. Upload return data (or strategy backtest results split by parameter combination).

  2. Set the training window (e.g., 3 years) and the testing window (e.g., 1 year). Slide the window forward; a code sketch of this loop follows the list.

  3. Watch the parameter visualization across windows. Stable parameters = robust strategy; swinging parameters = overfit.

  4. Read the OOS Sharpe across all test windows. The aggregate OOS Sharpe is what your strategy would actually have produced.

  5. If parameters swing wildly, simplify the strategy or apply shrinkage to the parameter estimates.
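A minimal sketch of that loop, assuming daily returns, a single lookback parameter picked by grid search, and an illustrative momentum rule; none of these names or choices are the tool's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0003, 0.01, 2000)           # synthetic daily asset returns

def sharpe(r):
    return r.mean() / r.std(ddof=1) * np.sqrt(252)

def strat(r, lookback):
    # Illustrative rule: hold long when the trailing mean return is positive.
    sig = np.array([r[max(0, t - lookback):t].mean() > 0 for t in range(1, len(r))])
    return r[1:] * sig

train, test, grid = 756, 252, [10, 20, 40, 80]     # ~3y train, 1y test
oos_chunks, params = [], []
for start in range(0, len(returns) - train - test + 1, test):
    tr = returns[start:start + train]
    te = returns[start + train:start + train + test]
    best = max(grid, key=lambda lb: sharpe(strat(tr, lb)))  # optimize in-sample
    params.append(best)
    oos_chunks.append(strat(te, best))                      # evaluate out-of-sample
print("chosen lookbacks per window:", params)
print("aggregate OOS Sharpe:", round(sharpe(np.concatenate(oos_chunks)), 2))
```

If the chosen lookbacks jump around from window to window, that is the instability step 5 warns about.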


Questions people ask next

FAQ

What's walk-forward validation?

Iterative out-of-sample validation: optimize on the first N years, test on years N+1 to N+2, then slide the window forward. Repeat. Each test window is genuinely out-of-sample with respect to the optimization. This is much more robust than a single train/test split.
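A sketch of how the window boundaries advance; "anchored" (expanding) versus rolling is an assumption about what the Window mode setting toggles, and the step size equal to the test length is likewise assumed:

```python
def walk_forward_bounds(n, train, test, anchored=False):
    # Yield (train_start, train_end, test_end) index triples; the window
    # advances by the test length.
    start = 0
    while start + train + test <= n:
        yield (0 if anchored else start), start + train, start + train + test
        start += test

for bounds in walk_forward_bounds(n=196, train=44, test=19):
    print(bounds)   # eight rolling 44/19 windows, matching the table above
```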

How is window length chosen?

Two parameters: the training window (how many years for optimization) and the testing window (how many years to test before re-optimizing). The methodology page recommends 3–5 years of training and 1 year of testing for most strategies. Shorter testing windows add re-optimization noise; longer ones let drift accumulate.

What's the visualization showing?

Each window's parameters as colored bars, with the test-window performance below. Stable strategies show consistent parameters across windows and consistent OOS performance. Unstable strategies show parameters that swing wildly window-to-window — that's the visual signature of overfitting.
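One way to quantify that signature, using hypothetical per-window lookback choices rather than output from this tool:

```python
import numpy as np

# Hypothetical optimal lookbacks chosen per walk-forward window (illustrative only).
stable   = np.array([20, 21, 19, 20, 22, 20, 21, 20])
swinging = np.array([5, 60, 12, 90, 8, 45, 110, 15])

for name, p in (("stable", stable), ("swinging", swinging)):
    cv = p.std(ddof=1) / p.mean()   # coefficient of variation across windows
    print(f"{name}: parameter CV = {cv:.2f}")  # low CV ~ robust, high CV ~ overfit
```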

Should I aggregate the test windows for a single Sharpe?

Yes, but report it correctly. Aggregated OOS Sharpe is what your strategy would have produced if you actually ran it through walk-forward. That number is meaningful. In-sample Sharpe across the full period is not — it's an average of optimization results, which is biased upward.
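A sketch of the aggregation, assuming the right procedure is to stitch the OOS return segments together and compute one Sharpe over the combined series, rather than averaging per-window Sharpes:

```python
import numpy as np

def sharpe(r):
    return r.mean() / r.std(ddof=1) * np.sqrt(252)

# Stand-in per-window OOS return segments (the real ones come from the loop above).
oos_chunks = [np.random.default_rng(i).normal(0.0002, 0.01, 252) for i in range(8)]

print("aggregate OOS Sharpe:", round(sharpe(np.concatenate(oos_chunks)), 2))
# Averaging per-window Sharpes is a different, noisier estimator -- not the same number.
print("mean of per-window Sharpes:",
      round(float(np.mean([sharpe(c) for c in oos_chunks])), 2))
```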

How does this compare to k-fold cross-validation?

K-fold is fine for static prediction problems but wrong for time series, because random folds mix past and future data. Walk-forward respects time order. Rolling-origin time-series CV is the same procedure as walk-forward under a different name, and it is what this tool implements.
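For comparison, scikit-learn's TimeSeriesSplit reproduces the same forward-only splits; with max_train_size set it yields rolling windows like this tool's (shown as an assumed equivalent, not part of the tool):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(196).reshape(-1, 1)   # 196 rows, enough for the 8 windows above
tscv = TimeSeriesSplit(n_splits=8, test_size=19, max_train_size=44)
for train_idx, test_idx in tscv.split(X):
    print(train_idx[0], train_idx[-1], test_idx[0], test_idx[-1])
# Each split trains on 44 rows and tests on the next 19: walk-forward by another name.
```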

Planning estimates only — not financial, tax, or investment advice.