Deflated Sharpe Ratio: Worked Examples

The deflated Sharpe ratio shows how quickly a good-looking raw Sharpe loses statistical significance once you account for how many strategies you tried. These scenarios apply the Bailey and López de Prado method: the expected maximum Sharpe under the null (the luck benchmark) rises with trial count and falls with sample length. The probabilistic Sharpe ratio (PSR) is the probability that the true Sharpe exceeds that benchmark, given sample length, skew, and kurtosis. All Sharpe ratios are annualized from monthly returns (twelve periods per year); normally distributed returns have skew 0 and kurtosis 3.
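
For reference, the quantities behind the figures on this page can be written out as follows. This is a sketch of one common statement of the Bailey and López de Prado formulas, not a transcription of the calculator. It assumes per-period (monthly) Sharpe ratios, a null Sharpe standard deviation of 1/sqrt(T - 1), and a benchmark of zero when N = 1; under those assumptions it appears to reproduce the benchmark figures in the scenarios to two decimals. The symbols are notation introduced here: N is the number of trials, T the number of months, γ the Euler-Mascheroni constant, γ3 the skew, γ4 the kurtosis, and Φ the standard normal CDF.

```latex
\begin{aligned}
\mathbb{E}[\max\nolimits_N] &\approx (1-\gamma)\,\Phi^{-1}\!\left(1-\tfrac{1}{N}\right)
  + \gamma\,\Phi^{-1}\!\left(1-\tfrac{1}{N e}\right), \qquad \gamma \approx 0.5772 \\
SR_0 &= \frac{\mathbb{E}[\max\nolimits_N]}{\sqrt{T-1}} \quad (N > 1), \qquad SR_0 = 0 \quad (N = 1) \\
\mathrm{PSR}(SR_0) &= \Phi\!\left(
  \frac{(\widehat{SR}-SR_0)\,\sqrt{T-1}}
       {\sqrt{1-\gamma_3\,\widehat{SR}+\tfrac{\gamma_4-1}{4}\,\widehat{SR}^{2}}}\right) \\
SR^{\mathrm{ann}} &= \sqrt{12}\;SR, \qquad
\text{deflated Sharpe (as reported here)} = \widehat{SR}^{\mathrm{ann}} - SR_0^{\mathrm{ann}}
\end{aligned}
```

In the 2014 paper, the name deflated Sharpe ratio refers to the PSR evaluated at this benchmark. This page reports that probability as PSR and uses deflated Sharpe for the annualized difference between the observed Sharpe and the benchmark.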

4 examples | Published May 26, 2026 | Updated May 27, 2026

By AI Fin Hub Research · AI Fin Hub Team

Worked Examples

See the inputs and outcome together

Each scenario keeps the starting point, the outcome, and the actual lesson in one place so the page reads like a decision notebook, not a data dump.

1. One honest trial

A single pre-registered strategy with a five-year monthly track record and roughly normal returns. You did not search across variants, so there is no selection bias to deflate.

Max expected Sharpe 0.00, PSR 0.999, deflated Sharpe 1.50 (annualized).

Observed Sharpe (annualized): 1.5
Sample length (months): 60
Skew: 0
Kurtosis: 3
Number of trials: 1

With a single pre-registered trial the luck benchmark is zero, so the deflated Sharpe equals the raw 1.5, and the 0.999 PSR puts the probability that the true Sharpe is positive at about 99.9%. Treat this as the reference point: any research that searched across variants must clear a benchmark above zero, so size at full conviction here and discount everything that was not pre-registered.
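
The 0.999 figure can be checked by hand. The arithmetic below is a sketch that assumes the standard PSR formula with a zero benchmark, T - 1 = 59 in the numerator, and the annualized Sharpe converted to a monthly one by dividing by the square root of 12; the symbols are notation introduced here.

```latex
\begin{aligned}
\widehat{SR} &= \frac{1.5}{\sqrt{12}} \approx 0.433 \\
\sqrt{1 - 0 \cdot \widehat{SR} + \tfrac{3-1}{4}\,\widehat{SR}^{2}} &= \sqrt{1 + 0.5 \times 0.1875} \approx 1.046 \\
z &= \frac{(0.433 - 0)\,\sqrt{59}}{1.046} \approx 3.18 \\
\mathrm{PSR} &= \Phi(3.18) \approx 0.999
\end{aligned}
```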

2. Fifty trials behind the winner

Same Sharpe, same track record, but you tried fifty parameter variants and reported the best. Selection bias now inflates the benchmark.

Max expected Sharpe 1.03, PSR 0.84, deflated Sharpe 0.47 (annualized).

Observed Sharpe (annualized): 1.5
Sample length (months): 60
Skew: 0
Kurtosis: 3
Number of trials: 50

Fifty trials lift the luck benchmark to a 1.03 annualized Sharpe, so the identical 1.5 track record now clears it by only 0.47. PSR drops from 0.999 to 0.84, which leaves roughly a one-in-six chance that the true Sharpe does not beat what selection alone would produce. The strategy is still defensible but no longer a layup, so halve the position you would have run at one trial and demand out-of-sample confirmation before adding.
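
The fifty-trial figures can be reproduced with a few lines of standard-library Python. This is a minimal sketch, not the calculator's implementation: the function names are hypothetical, and it assumes the null Sharpe has standard deviation 1/sqrt(T - 1) and that a single trial carries a zero benchmark.

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
STD_NORMAL = NormalDist()         # standard normal: mean 0, sd 1
PERIODS_PER_YEAR = 12             # monthly returns


def luck_benchmark(n_trials: int, n_months: int) -> float:
    """Expected maximum monthly Sharpe across n_trials strategies with no edge.

    Assumption: each null Sharpe has standard deviation 1/sqrt(n_months - 1),
    and a single trial carries a benchmark of zero.
    """
    if n_trials <= 1:
        return 0.0
    expected_max_z = (
        (1 - EULER_GAMMA) * STD_NORMAL.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e))
    )
    return expected_max_z / sqrt(n_months - 1)


def probabilistic_sharpe(sr, benchmark, n_months, skew=0.0, kurtosis=3.0):
    """Probability that the true monthly Sharpe exceeds the benchmark."""
    denominator = sqrt(1 - skew * sr + (kurtosis - 1) / 4 * sr ** 2)
    return STD_NORMAL.cdf((sr - benchmark) * sqrt(n_months - 1) / denominator)


observed_annual = 1.5
benchmark_monthly = luck_benchmark(n_trials=50, n_months=60)
benchmark_annual = benchmark_monthly * sqrt(PERIODS_PER_YEAR)
deflated_annual = observed_annual - benchmark_annual
psr = probabilistic_sharpe(
    observed_annual / sqrt(PERIODS_PER_YEAR), benchmark_monthly, 60
)

print(f"benchmark {benchmark_annual:.2f}, PSR {psr:.2f}, deflated {deflated_annual:.2f}")
```

Under these assumptions the printed values should round to the scenario's 1.03, 0.84, and 0.47.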

3. Higher Sharpe but negative skew and a hundred trials

A longer ten-year record with a higher 2.0 Sharpe, but the returns are negatively skewed and fat-tailed, and the strategy emerged from a hundred-trial search.

Max expected Sharpe 0.80, PSR 0.999, deflated Sharpe 1.20 (annualized).

Observed Sharpe (annualized): 2.0
Sample length (months): 120
Skew: -0.5
Kurtosis: 5
Number of trials: 100

Here the longer ten-year sample does the heavy lifting. A hundred trials raise the benchmark to 0.80, but 120 monthly observations shrink the standard error enough that the 2.0 Sharpe still clears it with a 0.999 PSR, even after the negative skew and fat tails widen the denominator. The lesson is the inverse of the short-sample case: with enough data, a genuinely strong Sharpe survives heavy search. Sample length buys you the right to mine.
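
To see how much the negative skew and fat tails cost in this scenario, the sketch below compares the PSR denominator for the stated skew and kurtosis against the normal case. It takes the 0.80 benchmark quoted above as given and assumes the standard PSR denominator with T - 1 in the numerator; the helper name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def psr_denominator(sr: float, skew: float, kurtosis: float) -> float:
    """Standard-error inflation term in the PSR for a per-period Sharpe."""
    return sqrt(1 - skew * sr + (kurtosis - 1) / 4 * sr ** 2)


n_months = 120
sr_monthly = 2.0 / sqrt(12)          # observed Sharpe, de-annualized
benchmark_monthly = 0.80 / sqrt(12)  # benchmark quoted in the scenario

denom_normal = psr_denominator(sr_monthly, skew=0.0, kurtosis=3.0)
denom_actual = psr_denominator(sr_monthly, skew=-0.5, kurtosis=5.0)

excess = (sr_monthly - benchmark_monthly) * sqrt(n_months - 1)
z_normal = excess / denom_normal
z_actual = excess / denom_actual

print(f"denominator: normal {denom_normal:.3f}, skewed and fat-tailed {denom_actual:.3f}")
print(f"z-score:     normal {z_normal:.2f}, skewed and fat-tailed {z_actual:.2f}")
print(f"PSR:         normal {STD_NORMAL.cdf(z_normal):.4f}, "
      f"skewed and fat-tailed {STD_NORMAL.cdf(z_actual):.4f}")
```

The wider denominator pulls the z-score down by roughly half a standard deviation, yet with 119 degrees of sample behind it the PSR stays above 0.99 in both cases.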

4. Short sample, modest search

A three-year record with a Sharpe of 1.0 found after twenty trials. Both the short history and the search count work against it.

Max expected Sharpe 1.11, PSR 0.43, deflated Sharpe -0.11 (annualized).

Observed Sharpe (annualized): 1.0
Sample length (months): 36
Skew: 0
Kurtosis: 3
Number of trials: 20

This is the trap. Twenty trials on a thin 36-month record push the luck benchmark to a 1.11 annualized Sharpe, just above the observed 1.0, so the deflated value tips negative and PSR sits at 0.43, slightly worse than a coin flip on whether the edge exists at all. Do not allocate to a short-history strategy with this profile until the sample doubles or the trial count drops, because right now you cannot distinguish it from the best of twenty noise draws.
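
The two escape routes named above can be tried numerically. The sketch below recomputes PSR for the same 1.0 Sharpe with the history doubled to 72 months, and separately with the search cut to five trials. The five-trial case is a hypothetical chosen for illustration, the function names are invented here, and the formulas assume a null Sharpe standard deviation of 1/sqrt(T - 1).

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
STD_NORMAL = NormalDist()


def luck_benchmark(n_trials: int, n_months: int) -> float:
    """Expected maximum monthly Sharpe under the null (zero for one trial)."""
    if n_trials <= 1:
        return 0.0
    expected_max_z = (
        (1 - EULER_GAMMA) * STD_NORMAL.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e))
    )
    return expected_max_z / sqrt(n_months - 1)


def psr_from_annual(sr_annual, n_trials, n_months, skew=0.0, kurtosis=3.0):
    """PSR against the luck benchmark, starting from an annualized Sharpe."""
    sr = sr_annual / sqrt(12)  # monthly Sharpe
    benchmark = luck_benchmark(n_trials, n_months)
    denominator = sqrt(1 - skew * sr + (kurtosis - 1) / 4 * sr ** 2)
    return STD_NORMAL.cdf((sr - benchmark) * sqrt(n_months - 1) / denominator)


psr_base = psr_from_annual(1.0, n_trials=20, n_months=36)
psr_doubled_sample = psr_from_annual(1.0, n_trials=20, n_months=72)
psr_fewer_trials = psr_from_annual(1.0, n_trials=5, n_months=36)  # hypothetical

print(f"as found (20 trials, 36 months): {psr_base:.2f}")
print(f"sample doubled (72 months):      {psr_doubled_sample:.2f}")
print(f"search cut to 5 trials:          {psr_fewer_trials:.2f}")
```

Under these assumptions either change lifts PSR from below one half to roughly 0.7, which is better but still well short of the one-trial reference case.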

Patterns

At one trial the benchmark is zero and the deflated Sharpe equals the observed Sharpe; every extra trial raises the bar, because the benchmark is the expected maximum of that many null Sharpe draws, scaled by the standard error of your Sharpe estimate.

Sample length, not just trial count, decides survival. A 2.0 Sharpe on ten years clears a hundred-trial benchmark, while a 1.0 Sharpe on three years fails after only twenty trials.

The benchmark is the expected-max Sharpe times one over the square root of sample length, so quadrupling the history roughly halves the haircut a given trial count imposes; doubling it trims the haircut by only about 30%.

Negative skew and excess kurtosis widen the denominator of the probabilistic Sharpe, lowering PSR for the same observed Sharpe even when the benchmark itself is unchanged.
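
The benchmark patterns above can be tabulated across trial counts and sample lengths. This is an illustrative sketch with a hypothetical function name, under the assumption that the null Sharpe has standard deviation 1/sqrt(T - 1) and that results are annualized by the square root of 12.

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
STD_NORMAL = NormalDist()


def annual_benchmark(n_trials: int, n_months: int, periods: int = 12) -> float:
    """Annualized expected-max Sharpe under the null; zero for a single trial."""
    if n_trials <= 1:
        return 0.0
    expected_max_z = (
        (1 - EULER_GAMMA) * STD_NORMAL.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e))
    )
    return expected_max_z / sqrt(n_months - 1) * sqrt(periods)


trial_counts = (1, 20, 50, 100)
sample_lengths = (36, 60, 120)  # months

# Benchmark for every (trials, months) pair used on this page.
grid = {
    (n, t): annual_benchmark(n, t) for n in trial_counts for t in sample_lengths
}

print("trials " + "".join(f"{t:>8}m" for t in sample_lengths))
for n in trial_counts:
    row = "".join(f"{grid[(n, t)]:>9.2f}" for t in sample_lengths)
    print(f"{n:>6} {row}")
```

Reading across a row shows the benchmark falling with sample length; reading down a column shows it rising with the number of trials.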

Sources & References

Bailey, D. H. and López de Prado, M. (2014). The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting, and Non-Normality. Journal of Portfolio Management.
Bailey, Borwein, López de Prado and Zhu (2017). The Probability of Backtest Overfitting.
