How to Compute a Deflated Sharpe Ratio
A raw Sharpe ratio tells you how good a backtest looks. The deflated Sharpe ratio tells you whether that number means anything once you account for how many strategies you tried and how non-normal the returns are. It is the single most useful guard against fooling yourself with a backtest, and it is computable from five inputs you should already have. Each of the five inputs is covered below, along with how to interpret the result.
Guide Steps
Move through it in order
Each step focuses on one decision so you can keep momentum without losing the thread.
- 1
Gather the five inputs
The deflated Sharpe ratio needs five numbers: the observed Sharpe, the sample length in periods, the return skewness, the return excess kurtosis, and the trial count. The first four describe the backtest you ran; the fifth describes the search that produced it. Missing the trial count is the most common failure, and without it the deflation cannot be done honestly. Assemble all five before computing anything.
Use the Sharpe at the same frequency as your return observations. Mixing an annualized Sharpe with a daily sample length corrupts the standard error.
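The first four inputs can be read directly off the return series. The following is a minimal sketch, assuming `returns` is a plain list of per-period excess returns (risk-free rate already subtracted) and that population moments are an acceptable convention for skewness and kurtosis. The names `sharpe_inputs` and `deannualize` are illustrative and do not come from any library. The fifth input, the trial count, has to come from your research log, not from the data.

```python
import math

def sharpe_inputs(returns):
    """Return (sharpe, n_periods, skewness, excess_kurtosis) at the data's own frequency."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two return observations")
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    # Population central moments (assumed convention for skew and kurtosis).
    m2 = sum(d ** 2 for d in dev) / n
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    std = math.sqrt(m2 * n / (n - 1))   # sample standard deviation
    sharpe = mean / std                 # per-period Sharpe, NOT annualized
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0    # 0.0 for a normal distribution
    return sharpe, n, skew, excess_kurt

def deannualize(annual_sharpe, periods_per_year=252):
    """Convert an annualized Sharpe back to per-period units (252 is an assumed default)."""
    return annual_sharpe / math.sqrt(periods_per_year)
```

If you only have an annualized Sharpe on file, `deannualize` puts it back on the same footing as a daily sample length before any standard error is computed.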
- 2
Count the trials, including informal ones
The trial count is the number of distinct strategies you evaluated against the data: every grid point, every entry rule, every universe filter, every threshold you nudged after seeing a result. Informal tuning counts. If you changed a parameter because the backtest looked better, that is a trial. The deflated Sharpe uses the trial count to estimate the maximum Sharpe you would expect from pure luck across that many attempts, which becomes the bar your real Sharpe must clear.
When unsure, overcount rather than undercount. A higher trial count raises the luck benchmark and makes the deflation stricter, which is the safe direction.
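One way to keep the count honest is to tally it mechanically. The sketch below assumes a hypothetical parameter grid and a hypothetical number of manual tweaks; the parameter names and values are invented for illustration, and it treats each manual tweak as one extra strategy evaluated.

```python
import math

# Hypothetical search space: every combination is one trial.
grid = {
    "lookback": [10, 20, 50, 100],
    "entry_z": [1.0, 1.5, 2.0],
    "universe": ["large_cap", "all"],
}
grid_trials = math.prod(len(values) for values in grid.values())

# Hypothetical: manual changes made after looking at a backtest result.
informal_tweaks = 7

n_trials = grid_trials + informal_tweaks
print(f"grid trials: {grid_trials}, total trials: {n_trials}")
```

If a manual tweak led to rerunning the whole grid, the rerun adds another full grid's worth of trials, not one.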
- 3
Compute the expected maximum Sharpe from luck
Given the trial count, the formula estimates the expected maximum Sharpe ratio you would see by selecting the best of that many independent random strategies. This benchmark grows with the number of trials: search a thousand variants and the best one will have a respectable Sharpe by chance alone. The deflated Sharpe measures how far your observed Sharpe sits above this luck benchmark, not above zero.
The luck benchmark rises fast at first and then slowly. Going from 1 to 100 trials matters far more than from 100 to 200, which is why the first few sweeps are the most damaging.
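The benchmark can be sketched with the approximation from Bailey and López de Prado (2014) for the expected maximum of N independent normal draws. Note one assumption: the formula scales with the standard deviation of the Sharpe estimates across trials, passed here as `sharpe_std`, which is not one of the five inputs listed above and has to be supplied or approximated. The function name and the special case for a single trial are choices made for this sketch.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def expected_max_sharpe(n_trials, sharpe_std):
    """Approximate expected best Sharpe among n_trials zero-skill strategies.

    sharpe_std: standard deviation of Sharpe estimates across trials
    (an assumed extra input, in the same per-period units as the Sharpe).
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    if n_trials == 1:
        return 0.0  # sketch's choice: no selection, so the benchmark is zero
    z = NormalDist()
    a = z.inv_cdf(1.0 - 1.0 / n_trials)
    b = z.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return sharpe_std * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)

# Benchmark in units of sharpe_std for a few search sizes.
for n in (1, 10, 100, 200, 1000):
    print(n, round(expected_max_sharpe(n, 1.0), 3))
```

Printing the values for 1, 100, and 200 trials shows the shape described above: most of the rise happens early in the search.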
- 4
Adjust for skew and kurtosis
The standard error of a Sharpe ratio is not the textbook value when returns are skewed or fat-tailed, which financial returns almost always are. Negative skew and high kurtosis inflate the true uncertainty of the Sharpe estimate. The deflated Sharpe plugs the third and fourth moments into the standard error so the test reflects the actual shape of your returns rather than assuming a normal distribution that does not hold.
Strategies that sell tail risk often show high Sharpe with strong negative skew. The skew adjustment is exactly what catches these flattering-but-fragile backtests.
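The moment adjustment amounts to one extra term each for skewness and kurtosis in the variance of the Sharpe estimate. A minimal sketch, assuming a per-period Sharpe and excess kurtosis as inputs (so a normal distribution has `excess_kurt = 0`); the function name is illustrative:

```python
import math

def sharpe_std_error(sharpe, n_periods, skew=0.0, excess_kurt=0.0):
    """Standard error of a per-period Sharpe estimate with skew and kurtosis terms.

    With skew=0 and excess_kurt=0 this reduces to the normal-returns value,
    sqrt((1 + sharpe**2 / 2) / (n_periods - 1)).
    """
    var_term = 1.0 - skew * sharpe + (excess_kurt + 2.0) / 4.0 * sharpe ** 2
    return math.sqrt(var_term / (n_periods - 1))

# Hypothetical daily strategy: same Sharpe and sample, different return shape.
normal_se = sharpe_std_error(0.10, 1000)
tail_seller_se = sharpe_std_error(0.10, 1000, skew=-1.5, excess_kurt=8.0)
print(normal_se, tail_seller_se)
```

For a positive Sharpe, negative skew and positive excess kurtosis both enlarge `var_term`, so the tail-selling profile gets a wider standard error than the normal case.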
- 5
Read the probability and decide
The output is a probability between zero and one: the chance the true Sharpe is positive given the search you performed. The conventional threshold to consider a strategy worth capital is 0.95. A backtest with a raw Sharpe of 1.5 can deflate well below 0.95 once a few hundred trials and fat tails are priced in. If you clear the bar, the edge is plausibly real; if you do not, the result is most likely a selection artifact.
Treat a marginal value near 0.95 as a fail, not a pass. The cheapest way to raise it is more data or fewer trials, never a higher raw Sharpe found by searching harder.
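Putting the steps together, the probability can be sketched end to end. This is an illustration under stated assumptions, not a reference implementation: the inputs in the example are hypothetical, and when the cross-trial standard deviation of Sharpe estimates (`sharpe_std`) is not supplied, the sketch falls back to `1 / sqrt(T - 1)`, the standard error of a zero-skill Sharpe under roughly normal returns. That fallback is an assumption of this sketch; measure the dispersion across your actual trials if you can.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def deflated_sharpe(sharpe, n_periods, skew, excess_kurt, n_trials, sharpe_std=None):
    """Probability that the observed per-period Sharpe clears the luck benchmark."""
    z = NormalDist()
    if sharpe_std is None:
        # Assumed fallback: dispersion of zero-skill Sharpe estimates.
        sharpe_std = 1.0 / math.sqrt(n_periods - 1)
    if n_trials > 1:
        benchmark = sharpe_std * (
            (1.0 - EULER_GAMMA) * z.inv_cdf(1.0 - 1.0 / n_trials)
            + EULER_GAMMA * z.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
        )
    else:
        benchmark = 0.0  # a single trial involves no selection
    var_term = 1.0 - skew * sharpe + (excess_kurt + 2.0) / 4.0 * sharpe ** 2
    std_err = math.sqrt(var_term / (n_periods - 1))
    return z.cdf((sharpe - benchmark) / std_err)

# Hypothetical backtest: annualized Sharpe 1.5 on five years of daily returns.
daily_sharpe = 1.5 / math.sqrt(252)
dsr_searched = deflated_sharpe(daily_sharpe, 1260, skew=-1.0, excess_kurt=6.0, n_trials=300)
dsr_single = deflated_sharpe(daily_sharpe, 1260, skew=-1.0, excess_kurt=6.0, n_trials=1)

THRESHOLD = 0.95
print("300 trials:", round(dsr_searched, 3), "pass" if dsr_searched >= THRESHOLD else "fail")
print("1 trial:   ", round(dsr_single, 3), "pass" if dsr_single >= THRESHOLD else "fail")
```

Under these assumed inputs the same backtest passes when recorded as a single trial and falls below 0.95 once 300 trials are counted, which is why the recorded trial count matters as much as the Sharpe itself.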
Common Mistakes
The misses that undo good inputs
Computing the deflated Sharpe with a trial count of one
If you searched many variants but record only one trial, the deflation does nothing and the number is as misleading as the raw Sharpe. The trial count is the whole point of the adjustment.
Assuming normal returns and skipping the moment adjustment
Financial returns are skewed and fat-tailed. Using the textbook Sharpe standard error understates uncertainty and inflates the deflated value, defeating the purpose of the test.
Treating 0.95 as a soft target to negotiate down
The threshold exists to keep selection luck out of your capital allocation. Lowering it for a favored strategy reintroduces exactly the bias the test is meant to remove.
Sources & References
- Bailey and López de Prado (2014). The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting and Non-Normality. Journal of Portfolio Management.
- Bailey and López de Prado (2012). The Sharpe Ratio Efficient Frontier. Journal of Risk.