Deflated Sharpe Ratio Formula

The deflated Sharpe ratio is the probability that an observed Sharpe ratio exceeds a benchmark threshold once you correct for the number of strategy variants tried, the non-normality of returns, and the length of the sample. It turns a raw Sharpe into a calibrated confidence number.

By AI Fin Hub Research · AI Fin Hub Team
Formula

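The deflated Sharpe ratio is built from two expressions, worked through step by step below.

DSR = Phi( ((SR_hat - SR_0) x sqrt(T - 1)) / sqrt(1 - g3 x SR_hat + ((g4 - 1) / 4) x SR_hat^2) )

SR_0 = sqrt(Var(SR)) x ( (1 - gamma) x Z^-1(1 - 1/N) + gamma x Z^-1(1 - 1/(N x e)) )

Here Var(SR) is the variance of the Sharpe estimates across the N trials, Z^-1 is the inverse of the standard normal CDF Phi, gamma is the Euler-Mascheroni constant (about 0.5772), and e is Euler's number.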

Variables

SR_hat

Observed Sharpe ratio

The per-period Sharpe ratio estimated from the backtest, before any annualization.

SR_0

Expected maximum Sharpe under the null

The Sharpe you would expect to see by chance as the best of N independent trials. This is the benchmark the observed Sharpe must beat, and it grows with the number of variants tested.

T

Number of return observations

Sample length. Longer samples shrink the variance of the Sharpe estimate and make a high score harder to dismiss as luck.

g3

Skewness of returns

Third standardized moment. For a positive Sharpe, negative skew (occasional large losses) makes the -g3 x SR_hat term positive, which inflates the denominator and lowers the deflated score.

g4

Kurtosis of returns

Fourth standardized moment, in raw form (3 for a normal distribution), not excess kurtosis. Fat tails raise the variance of the Sharpe estimate, so heavy-tailed strategies need a higher raw Sharpe to clear the bar.

N

Number of independent trials

How many strategy configurations were tested before selecting this one. Honest accounting of N is the whole point: undercounting it lowers SR_0 and overstates the deflated Sharpe.

Phi

Standard normal CDF

Maps the standardized statistic to a probability between 0 and 1. The output is the deflated Sharpe ratio.
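
As a minimal sketch, the two expressions and the variables above can be written with Python's standard library alone. The function and parameter names (`expected_max_sharpe`, `deflated_sharpe`, `var_sr`, `n_obs`) are illustrative choices, not part of any published API, and `kurt` is assumed to be raw kurtosis (3 for a normal distribution).

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (gamma in SR_0)
_STD_NORMAL = NormalDist()        # standard normal: cdf is Phi, inv_cdf is Z^-1


def expected_max_sharpe(var_sr: float, n_trials: int) -> float:
    """SR_0: expected best Sharpe among n_trials independent null strategies."""
    if n_trials < 2:
        # Z^-1(1 - 1/N) is undefined at N = 1, so the expression needs N >= 2.
        raise ValueError("n_trials must be at least 2")
    z_a = _STD_NORMAL.inv_cdf(1 - 1 / n_trials)
    z_b = _STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e))
    return sqrt(var_sr) * ((1 - EULER_GAMMA) * z_a + EULER_GAMMA * z_b)


def deflated_sharpe(sr_hat: float, sr_0: float, n_obs: int,
                    skew: float, kurt: float) -> float:
    """DSR: Phi of the standardized gap between SR_hat and SR_0.

    sr_hat and sr_0 are per-period (not annualized); kurt is raw kurtosis.
    """
    denom = sqrt(1 - skew * sr_hat + (kurt - 1) / 4 * sr_hat ** 2)
    z = (sr_hat - sr_0) * sqrt(n_obs - 1) / denom
    return _STD_NORMAL.cdf(z)
```

A quick sanity check on the design: when `sr_hat` equals `sr_0` the standardized statistic is zero and the function returns 0.5, and `expected_max_sharpe` rises as `n_trials` grows, which is the selection-bias penalty described above.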

Step By Step

  1. Estimate the per-period Sharpe ratio, skewness, and kurtosis from the strategy's return series.

    Per-period Sharpe 0.12, skew -0.6, kurtosis 5.5 over T = 1000 daily returns.

  2. Count the number of independent trials N: every variant, parameter grid point, or configuration evaluated before picking this one.

    A parameter sweep over 50 lookback windows and 4 entry rules is N = 200 trials.

  3. Compute the expected maximum Sharpe SR_0 under the null using N and Var(SR), the variance of the Sharpe estimates across the trials.

    With N = 200 the expected best-of-null Sharpe is materially above zero, so a small positive raw Sharpe is unremarkable.

  4. Plug SR_hat, SR_0, T, skew, and kurtosis into the deflated Sharpe expression and apply the normal CDF.

    A raw Sharpe that looked strong can deflate to a probability below 0.95, meaning it is not distinguishable from the best of the random trials.

  5. Read the output as a confidence: a deflated Sharpe of 0.95 or higher is the conventional bar for treating the strategy as genuinely skilled rather than selected by luck.

    DSR = 0.62 says there is a substantial chance the result is a product of the search, not a real edge.
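
The first two steps above are the ones that touch raw data, so here is a hedged sketch of how they might look in plain Python. The helper name `sample_moments`, the grid values, and the entry-rule labels are hypothetical. The returns are assumed to be per-period and already in excess of the risk-free rate.

```python
from itertools import product
from math import sqrt


def sample_moments(returns):
    """Return (per-period Sharpe, skewness, raw kurtosis) of a return series."""
    n = len(returns)
    mean = sum(returns) / n
    # Population (1/n) central moments, used consistently in every ratio below.
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    std = sqrt(m2)
    sharpe = mean / std        # per-period, not annualized
    skew = m3 / std ** 3       # third standardized moment
    kurt = m4 / std ** 4       # raw kurtosis: 3 for a normal distribution
    return sharpe, skew, kurt


# Step 2: every grid point that was evaluated counts as a trial.
lookbacks = range(10, 260, 5)                                     # 50 hypothetical windows
entry_rules = ["breakout", "pullback", "crossover", "reversal"]   # 4 hypothetical rules
n_trials = len(list(product(lookbacks, entry_rules)))
```

Counting the full Cartesian product treats every grid point as its own trial. Neighbouring lookback windows are usually highly correlated, so the number of truly independent trials may be smaller than this count. The raw count is therefore the conservative choice of N.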

Worked Example

Parameter-swept strategy reviewed for overfitting

Per-period Sharpe (SR_hat)

0.12

Observations (T)

1000

Trials (N)

200

Skewness / kurtosis

-0.6 / 5.5

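The 200-trial search lifts the expected best-of-null Sharpe SR_0 well above zero, so only part of the per-period SR_hat of 0.12 counts as excess over the benchmark; exactly how much depends on Var(SR), which the inputs above do not state. The negative skew and the kurtosis of 5.5 also widen the variance of the Sharpe estimate. Both effects push the standardized statistic down before the normal CDF is applied.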

The deflated Sharpe lands below the 0.95 confidence bar. The annualized Sharpe looked attractive (about 1.9 for daily data, 0.12 x sqrt(252) assuming 252 trading days), but once the trial count and tail shape are priced in, the result is not separable from selection luck. Re-test on a fresh window before allocating.
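
To make the arithmetic concrete, the sketch below runs the example inputs through both expressions. The example does not give Var(SR), so the code assumes it equals the null sampling variance of a single Sharpe estimate, 1 / (T - 1). That choice is an illustration only, and a different cross-trial variance would move SR_0 and the final probability.

```python
from math import e, sqrt
from statistics import NormalDist

norm = NormalDist()
EULER_GAMMA = 0.5772156649015329

# Inputs from the worked example.
sr_hat, n_obs, n_trials = 0.12, 1000, 200
skew, kurt = -0.6, 5.5

# ASSUMPTION (not given in the example): cross-trial variance of the Sharpe
# estimates equals the null sampling variance of one estimate, 1 / (T - 1).
var_sr = 1 / (n_obs - 1)

# SR_0: expected maximum of n_trials null Sharpe ratios.
expected_max_z = ((1 - EULER_GAMMA) * norm.inv_cdf(1 - 1 / n_trials)
                  + EULER_GAMMA * norm.inv_cdf(1 - 1 / (n_trials * e)))
sr_0 = sqrt(var_sr) * expected_max_z

# DSR: standardized gap between SR_hat and SR_0, mapped through Phi.
denominator = sqrt(1 - skew * sr_hat + (kurt - 1) / 4 * sr_hat ** 2)
z_stat = (sr_hat - sr_0) * sqrt(n_obs - 1) / denominator
dsr = norm.cdf(z_stat)

print(f"SR_0 = {sr_0:.4f}, z = {z_stat:.3f}, DSR = {dsr:.3f}")
```

Under this particular assumption SR_0 comes out at roughly 0.09 per period and the deflated Sharpe at roughly 0.84, short of the 0.95 bar. A larger cross-trial variance would raise SR_0 and lower the result further.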

Common Variations

Probabilistic Sharpe ratio: the single-trial case, where the benchmark is a fixed threshold (often zero) rather than the expected maximum SR_0, reporting the probability the true Sharpe exceeds that benchmark.
Minimum track record length: inverts the formula to solve for the sample length needed to reach a target confidence.
Combinatorially symmetric cross-validation: estimates the probability of backtest overfitting directly from in-sample versus out-of-sample rank degradation.
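
The first two variations follow directly from the DSR expression, so a short sketch may help. The function names are illustrative, and the track-record formula is obtained by setting the standardized statistic equal to the normal quantile for the target confidence and solving for T. Combinatorially symmetric cross-validation needs the full matrix of trial returns and is not sketched here.

```python
from math import ceil, sqrt
from statistics import NormalDist

_norm = NormalDist()


def probabilistic_sharpe(sr_hat, sr_benchmark, n_obs, skew, kurt):
    """PSR: the DSR expression with a fixed benchmark in place of SR_0."""
    denom = sqrt(1 - skew * sr_hat + (kurt - 1) / 4 * sr_hat ** 2)
    return _norm.cdf((sr_hat - sr_benchmark) * sqrt(n_obs - 1) / denom)


def min_track_record_length(sr_hat, sr_benchmark, skew, kurt, confidence=0.95):
    """Smallest (real-valued) T at which PSR reaches `confidence`."""
    if sr_hat <= sr_benchmark:
        raise ValueError("no finite sample suffices when sr_hat <= sr_benchmark")
    var_term = 1 - skew * sr_hat + (kurt - 1) / 4 * sr_hat ** 2
    z_alpha = _norm.inv_cdf(confidence)
    return 1 + var_term * (z_alpha / (sr_hat - sr_benchmark)) ** 2


# Example inputs from this page, tested against a zero benchmark.
t_min = min_track_record_length(0.12, 0.0, skew=-0.6, kurt=5.5)
psr_at_t_min = probabilistic_sharpe(0.12, 0.0, ceil(t_min), skew=-0.6, kurt=5.5)
```

Rounding `t_min` up to a whole number of observations gives a sample length at which the probabilistic Sharpe just clears the chosen confidence. Passing SR_0 as `sr_benchmark` instead of zero gives the multiple-trial version, which needs a longer track record.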

Planning estimates only — not financial, tax, or investment advice.