Deflated Sharpe Ratio: Worked Examples

The deflated Sharpe shows how quickly a good-looking raw Sharpe becomes statistically insignificant once you account for how many strategies you tried. These scenarios apply the Bailey and López de Prado method: the maximum expected Sharpe under the null (the luck benchmark) rises with trial count and falls with sample length. The probabilistic Sharpe ratio (PSR) is the probability that the true Sharpe exceeds that benchmark, given sample length, skew, and kurtosis. The deflated Sharpe reported here is the observed Sharpe minus the benchmark. All Sharpe ratios are computed on monthly returns and annualized by the square root of twelve; normal returns use skew 0 and kurtosis 3.
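The page does not print the formulas behind these quantities. A compact statement, in the form usually attributed to Bailey and López de Prado, might look like the following. The symbols are assumptions introduced here for illustration: N is the number of trials, T the number of monthly observations, the SR with a hat is the observed per-month Sharpe, gamma is the Euler-Mascheroni constant, gamma_3 and gamma_4 are skew and kurtosis, and Phi is the standard normal CDF. Scaling the benchmark by 1/sqrt(T-1) is also an assumption, chosen because it matches the "standard error" description in the Patterns section; the original paper scales by the dispersion of Sharpe ratios across trials instead.

```latex
% Luck benchmark: expected maximum Sharpe across N trials under the null.
% For N = 1 the benchmark is taken to be zero.
\[
SR_0 \;=\; \frac{1}{\sqrt{T-1}}
\left[(1-\gamma)\,\Phi^{-1}\!\left(1-\frac{1}{N}\right)
      + \gamma\,\Phi^{-1}\!\left(1-\frac{1}{N e}\right)\right]
\]

% Probabilistic Sharpe ratio against that benchmark.
\[
\mathrm{PSR}(SR_0) \;=\;
\Phi\!\left(
\frac{\left(\widehat{SR}-SR_0\right)\sqrt{T-1}}
     {\sqrt{1-\gamma_3\,\widehat{SR}+\dfrac{\gamma_4-1}{4}\,\widehat{SR}^{2}}}
\right)
\]

% Deflated Sharpe as reported on this page, annualized from monthly figures.
\[
SR_{\text{deflated}} \;=\; \sqrt{12}\,\left(\widehat{SR}-SR_0\right)
\]
```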

By AI Fin Hub Research · AI Fin Hub Team

Worked Examples

See the inputs and outcome together

Each scenario keeps the starting point, the outcome, and the actual lesson in one place so the page reads like a decision notebook, not a data dump.

  1. One honest trial

    A single pre-registered strategy with a five-year monthly track record and roughly normal returns. You did not search across variants, so there is no selection bias to deflate.

    Max expected Sharpe 0.00, PSR 0.999, deflated Sharpe 1.50 (annualized).

    Observed Sharpe (annualized): 1.5
    Sample length (months): 60
    Skew: 0
    Kurtosis: 3
    Number of trials: 1

    A pre-registered single trial carries no luck benchmark, so the deflated Sharpe equals the raw 1.5 and the 0.999 PSR says the true Sharpe is above zero with near certainty. Treat this as the reference point: any strategy that came out of a search over variants faces a benchmark above zero, so reserve full-conviction sizing for pre-registered results like this one and discount everything that was not pre-registered.

  2. Fifty trials behind the winner

    Same Sharpe, same track record, but you tried fifty parameter variants and reported the best. Selection bias now inflates the benchmark.

    Max expected Sharpe 1.03, PSR 0.84, deflated Sharpe 0.47 (annualized).

    Observed Sharpe (annualized): 1.5
    Sample length (months): 60
    Skew: 0
    Kurtosis: 3
    Number of trials: 50

    Fifty trials lift the luck benchmark to a 1.03 annualized Sharpe, so the identical 1.5 track record now only clears it by 0.47. PSR drops from 0.999 to 0.84, meaning a one-in-six chance the edge is pure selection. The strategy is still defensible but no longer a layup, so halve the position you would have run at one trial and demand out-of-sample confirmation before adding.

  3. Higher Sharpe but negative skew and a hundred trials

    A longer ten-year record with a higher 2.0 Sharpe, but the returns are negatively skewed and fat-tailed, and the strategy emerged from a hundred-trial search.

    Max expected Sharpe 0.80, PSR 0.999, deflated Sharpe 1.20 (annualized).

    Observed Sharpe (annualized): 2.0
    Sample length (months): 120
    Skew: -0.5
    Kurtosis: 5
    Number of trials: 100

    Here the longer ten-year sample does the heavy lifting. A hundred trials raise the benchmark to 0.80, but 120 monthly observations shrink the standard error enough that the 2.0 Sharpe still clears it with a 0.999 PSR, even after the negative skew and fat tails widen the denominator. The lesson is the inverse of the short-sample case: with enough data, a genuinely strong Sharpe survives heavy search. Sample length buys you the right to mine.

  4. Short sample, modest search

    A three-year record with a Sharpe of 1.0 found after twenty trials. Both the short history and the search count work against it.

    Max expected Sharpe 1.11, PSR 0.43, deflated Sharpe -0.11 (annualized).

    Observed Sharpe (annualized): 1.0
    Sample length (months): 36
    Skew: 0
    Kurtosis: 3
    Number of trials: 20

    This is the trap. Twenty trials on a thin 36-month record push the luck benchmark to a 1.11 annualized Sharpe, just above the observed 1.0, so the deflated value tips negative and PSR sits at 0.43, slightly worse than a coin flip on whether the edge exists at all. Do not allocate to a short-history strategy with this profile until the sample doubles or the trial count drops, because right now you cannot distinguish it from the best of twenty noise draws.
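The four scenarios above can be recomputed with a short script. This is a minimal sketch, not the calculator's actual implementation: the function names `expected_max_z` and `deflated_sharpe` are hypothetical, and the benchmark is assumed to be the expected maximum of N standard normal draws scaled by 1/sqrt(T-1). It uses only the Python standard library.

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
STD_NORMAL = NormalDist()         # mean 0, standard deviation 1


def expected_max_z(n_trials: int) -> float:
    """Approximate expected maximum of n_trials independent standard normal draws."""
    if n_trials <= 1:
        return 0.0  # a single trial has no selection effect
    return ((1 - EULER_GAMMA) * STD_NORMAL.inv_cdf(1 - 1 / n_trials)
            + EULER_GAMMA * STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e)))


def deflated_sharpe(sr_annual, months, skew, kurtosis, n_trials, periods=12):
    """Return (annualized benchmark, PSR, annualized deflated Sharpe)."""
    sr = sr_annual / sqrt(periods)  # per-month Sharpe
    # Assumption: the benchmark is the expected max scaled by 1/sqrt(T - 1).
    sr0 = expected_max_z(n_trials) / sqrt(months - 1)
    # Skew and kurtosis enter through the PSR denominator.
    denom = sqrt(1 - skew * sr + (kurtosis - 1) / 4 * sr ** 2)
    psr = STD_NORMAL.cdf((sr - sr0) * sqrt(months - 1) / denom)
    benchmark = sr0 * sqrt(periods)
    return benchmark, psr, sr_annual - benchmark


# Inputs copied from the four scenarios:
# (observed Sharpe, months, skew, kurtosis, trials)
scenarios = {
    "One honest trial": (1.5, 60, 0.0, 3.0, 1),
    "Fifty trials": (1.5, 60, 0.0, 3.0, 50),
    "Negative skew, hundred trials": (2.0, 120, -0.5, 5.0, 100),
    "Short sample, modest search": (1.0, 36, 0.0, 3.0, 20),
}

for name, args in scenarios.items():
    bench, psr, dsr = deflated_sharpe(*args)
    print(f"{name}: benchmark {bench:.2f}, PSR {psr:.3f}, deflated {dsr:.2f}")
```

Under these assumptions the output should land close to the figures quoted in each scenario; small differences in the last digit can come from rounding or from a slightly different standard-error convention.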

Patterns

At one trial the benchmark is zero and the deflated Sharpe equals the observed Sharpe. Every extra trial raises the expected maximum of the null Sharpe draws, and the benchmark is that expected maximum scaled by the standard error of your Sharpe estimate.
Sample length, not just trial count, decides survival. A 2.0 Sharpe on ten years clears a hundred-trial benchmark, while a 1.0 Sharpe on three years fails after only twenty trials.
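The benchmark is the expected maximum times one over the square root of sample length, so doubling the history shrinks the haircut a given trial count imposes by about 29 percent (a factor of one over root two); halving the haircut takes roughly four times the history.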
Negative skew and excess kurtosis widen the denominator of the probabilistic Sharpe, lowering PSR for the same observed Sharpe even when the benchmark itself is unchanged.
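Two of these patterns are easy to check numerically: how the benchmark responds to sample length, and how skew and kurtosis move PSR when everything else is held fixed. The sketch below is illustrative only; `benchmark_annual` and `psr` are hypothetical helper names, and the 1/sqrt(T-1) standard error is an assumption rather than something the page states.

```python
from math import e, sqrt
from statistics import NormalDist

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
Z = NormalDist()            # standard normal


def benchmark_annual(n_trials, months, periods=12):
    """Annualized luck benchmark, assuming a 1/sqrt(T - 1) standard error."""
    if n_trials <= 1:
        return 0.0
    emax = ((1 - GAMMA) * Z.inv_cdf(1 - 1 / n_trials)
            + GAMMA * Z.inv_cdf(1 - 1 / (n_trials * e)))
    return emax / sqrt(months - 1) * sqrt(periods)


def psr(sr_annual, months, skew, kurtosis, n_trials, periods=12):
    """Probability that the true Sharpe exceeds the luck benchmark."""
    sr = sr_annual / sqrt(periods)
    sr0 = benchmark_annual(n_trials, months, periods) / sqrt(periods)
    denom = sqrt(1 - skew * sr + (kurtosis - 1) / 4 * sr ** 2)
    return Z.cdf((sr - sr0) * sqrt(months - 1) / denom)


# Sample length: same twenty trials, progressively longer histories.
base = benchmark_annual(20, 36)
doubled = benchmark_annual(20, 72)
quadrupled = benchmark_annual(20, 144)
print(f"doubled history keeps {doubled / base:.0%} of the benchmark")
print(f"quadrupled history keeps {quadrupled / base:.0%} of the benchmark")

# Skew and kurtosis: same Sharpe, length, and trial count, so the same benchmark.
psr_normal = psr(1.5, 60, 0.0, 3.0, 50)
psr_skewed = psr(1.5, 60, -0.5, 5.0, 50)
print(f"PSR normal {psr_normal:.2f} vs negative skew, fat tails {psr_skewed:.2f}")
```

With a square-root dependence on sample length, doubling the history leaves roughly 70 percent of the benchmark in place, and it takes about four times the history to cut it in half. The second comparison holds the benchmark fixed, so any drop in PSR comes entirely from the wider denominator.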

Planning estimates only — not financial, tax, or investment advice.