How many trials do I really have if I never ran a formal grid?

Count every parameter you set by looking at backtest results, every rule you added or removed because performance improved, and every universe or window you tried. Informal, manual tuning produces trials just as a grid search does. Practitioners routinely find their honest trial count is in the dozens or hundreds even without an automated sweep.

Can the deflated Sharpe ratio be negative or undefined?

The output is a probability, so it stays between zero and one and cannot be negative. It equals 0.5 when your observed Sharpe exactly matches the luck benchmark for your trial count, and it falls toward zero as the observed Sharpe drops further below that benchmark. A value at or below 0.5 is the signal that the result is indistinguishable from selecting the best of many random strategies. It is defined whenever the sample length is greater than one.

Does a high deflated Sharpe guarantee the strategy will work live?

No. It indicates that the in-sample result is unlikely to be pure selection luck, which is necessary but not sufficient. Live performance also depends on regime change, capacity, slippage, and whether the data leaked information. Pair the deflated Sharpe with walk-forward validation and a realistic cost and capacity check before committing capital.

Backtesting & Validation Guide

How to Compute a Deflated Sharpe Ratio

A raw Sharpe ratio tells you how good a backtest looks. The deflated Sharpe ratio tells you whether that number means anything once you account for how many strategies you tried and how non-normal the returns are. It is the single most useful guard against fooling yourself with a backtest, and it is computable from five inputs you should already have. Each of the five inputs is covered below, along with how to interpret the result.

8 min read · Published May 26, 2026

By AI Fin Hub Research · AI Fin Hub Team

Before You Start

Set up the inputs that make the next steps easier

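An observed Sharpe ratio from a backtest with costs already subtracted, expressed at the same frequency as the return observations. If you only have an annualized figure, convert it to per-period before using it.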

The number of return observations the Sharpe was estimated from.

The skewness and excess kurtosis of the return series, or the raw returns to compute them.

An honest count of how many strategy configurations were tried before this one was selected.

Guide Steps

Move through it in order

Each step focuses on one decision so you can keep momentum without losing the thread.

1. Gather the five inputs

The deflated Sharpe ratio needs five numbers: the observed Sharpe, the sample length in periods, the return skewness, the return excess kurtosis, and the trial count. The first four describe the backtest you ran; the fifth describes the search that produced it. Missing the trial count is the most common failure, and without it the deflation cannot be done honestly. Assemble all five before computing anything.

Use the Sharpe at the same frequency as your return observations. Mixing an annualized Sharpe with a daily sample length corrupts the standard error.
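
As a rough illustration of assembling the first four inputs, the sketch below computes them from a list of per-period returns using only the Python standard library. The function names sharpe_inputs and per_period_sharpe are hypothetical, the moments use the population (divide-by-n) form, and the square-root-of-time conversion assumes returns are independent across periods.

```python
import math


def sharpe_inputs(returns, risk_free=0.0):
    """Per-period Sharpe, sample length, skewness, and excess kurtosis.

    Assumes `returns` holds net-of-cost returns at one fixed frequency
    and that they are not all identical (otherwise the variance is zero).
    """
    n = len(returns)
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / n
    # Central moments in population form (divide by n).
    m2 = sum((x - mean) ** 2 for x in excess) / n
    m3 = sum((x - mean) ** 3 for x in excess) / n
    m4 = sum((x - mean) ** 4 for x in excess) / n
    return {
        "sharpe": mean / math.sqrt(m2),
        "n_obs": n,
        "skew": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


def per_period_sharpe(annualized_sharpe, periods_per_year):
    """Convert an annualized Sharpe to the frequency of the observations."""
    return annualized_sharpe / math.sqrt(periods_per_year)


# Made-up daily returns, purely for illustration.
daily = [0.004, -0.002, 0.001, 0.003, -0.005, 0.002]
inputs = sharpe_inputs(daily)
daily_sharpe = per_period_sharpe(1.5, 252)  # annualized 1.5 on daily data
```

The fifth input, the trial count, cannot be computed from the returns; it has to come from your own record of the search.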

Use the tool: Deflated Sharpe Ratio Calculator
Bailey & López de Prado deflated Sharpe: corrects observed Sharpe for selection bias across K trials. Reports deflated Sharpe, PSR (probability of skill).

2. Count the trials, including informal ones

The trial count is the number of distinct strategies you evaluated against the data: every grid point, every entry rule, every universe filter, every threshold you nudged after seeing a result. Informal tuning counts. If you changed a parameter because the backtest looked better, that is a trial. The deflated Sharpe uses the trial count to estimate the maximum Sharpe you would expect from pure luck across that many attempts, which becomes the bar your real Sharpe must clear.

When unsure, overcount rather than undercount. A conservative trial count makes the deflation stricter, which is the safe direction.
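
One simple way to keep the count honest is to tally it explicitly. The sketch below is only an assumption about how such a tally might look: the function count_trials and its parameters are hypothetical, and it treats every grid point as run on every universe and window, then adds each manual tweak as one more trial.

```python
import math


def count_trials(grid_sizes, universes=1, windows=1, informal_tweaks=0):
    """Tally distinct configurations evaluated against the data.

    grid_sizes: number of values tried for each swept parameter.
    informal_tweaks: manual changes kept or dropped after seeing results.
    """
    grid_points = math.prod(grid_sizes)  # an empty grid counts as one baseline
    return grid_points * universes * windows + informal_tweaks


# 3 lookbacks x 4 thresholds, run on 2 universes, plus 5 manual nudges.
n_trials = count_trials([3, 4], universes=2, informal_tweaks=5)
```

A tally like this errs on the high side when some combinations were never actually run, which is consistent with the advice to overcount.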

Use the tool: Backtest Overfitting Score
Upload a backtest trade log and compute Probability of Backtest Overfitting (PBO), Deflated Sharpe Ratio, and the odds your edge survives live trading.

3. Compute the expected maximum Sharpe from luck

Given the trial count, the formula estimates the expected maximum Sharpe ratio you would see by selecting the best of that many independent random strategies, each with a true Sharpe of zero. This benchmark grows with the number of trials: search a thousand variants and the best one will have a respectable Sharpe by chance alone. It also scales with how widely the Sharpe estimates are dispersed across the trials, a quantity that must be estimated from the trial results or assumed. The deflated Sharpe measures how far your observed Sharpe sits above this luck benchmark, not above zero.

The luck benchmark rises fast at first and then slowly. Going from 1 to 100 trials matters far more than from 100 to 200, which is why the first few sweeps are the most damaging.
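
The sketch below shows one way to compute the luck benchmark, using the extreme-value approximation from the Bailey and López de Prado paper listed under Sources & References. The function name expected_max_sharpe is hypothetical, and sharpe_std_across_trials (the standard deviation of the Sharpe estimates across your trials) is an extra assumption you must supply yourself.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials, sharpe_std_across_trials):
    """Approximate expected maximum Sharpe among n_trials zero-skill trials.

    sharpe_std_across_trials is an assumed input: the dispersion of the
    Sharpe estimates across trials, at the same frequency as the Sharpe.
    """
    if n_trials < 2:
        return 0.0  # a single trial involves no selection
    z = NormalDist().inv_cdf
    return sharpe_std_across_trials * (
        (1 - EULER_GAMMA) * z(1 - 1 / n_trials)
        + EULER_GAMMA * z(1 - 1 / (n_trials * math.e))
    )


# Benchmark in units of the cross-trial standard deviation.
for n in (2, 10, 100, 200, 1000):
    print(n, round(expected_max_sharpe(n, 1.0), 3))
```

Printing the benchmark for a few trial counts makes the flattening visible: the step from 100 to 200 trials adds much less than the step from 2 to 100.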

4. Adjust for skew and kurtosis

The standard error of a Sharpe ratio is not the textbook value when returns are skewed or fat-tailed, which financial returns almost always are. Negative skew and high kurtosis inflate the true uncertainty of the Sharpe estimate. The deflated Sharpe plugs the third and fourth moments into the standard error so the test reflects the actual shape of your returns rather than assuming a normal distribution that does not hold.

Strategies that sell tail risk often show high Sharpe with strong negative skew. The skew adjustment is exactly what catches these flattering-but-fragile backtests.
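
A minimal sketch of the moment-adjusted standard error follows, assuming a per-period Sharpe and excess kurtosis as inputs (so the normal case has skew 0 and excess kurtosis 0). The function name sharpe_standard_error is hypothetical; the expression is the one used in the probabilistic Sharpe ratio of Bailey and López de Prado.

```python
import math


def sharpe_standard_error(sharpe, n_obs, skew=0.0, excess_kurtosis=0.0):
    """Standard error of a per-period Sharpe estimate.

    With skew = 0 and excess_kurtosis = 0 this reduces to the textbook
    value sqrt((1 + sharpe**2 / 2) / (n_obs - 1)).
    """
    variance = (
        1.0
        - skew * sharpe
        + (excess_kurtosis + 2.0) / 4.0 * sharpe ** 2
    ) / (n_obs - 1)
    return math.sqrt(variance)


# Same Sharpe and sample length, with and without the moment adjustment.
textbook = sharpe_standard_error(0.1, 1001)
adjusted = sharpe_standard_error(0.1, 1001, skew=-1.0, excess_kurtosis=5.0)
```

For a positive Sharpe, negative skew and positive excess kurtosis both enlarge the variance term, so the adjusted value here is larger than the textbook one.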

5. Read the probability and decide

The output is a probability between zero and one: the confidence that the true Sharpe exceeds the luck benchmark, that is, that it is positive once the search you performed is accounted for. The conventional threshold to consider a strategy worth capital is 0.95. A backtest with a raw Sharpe of 1.5 can deflate well below 0.95 once a few hundred trials and fat tails are priced in. If you clear the bar, the edge is plausibly real; if you do not, the result is most likely a selection artifact.

Treat a marginal value near 0.95 as a fail, not a pass. The cheapest way to raise it is more data or fewer trials, never a higher raw Sharpe found by searching harder.
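
Putting the steps together, a self-contained sketch of the whole calculation might look like the following. The function name deflated_sharpe, the example numbers, and especially sharpe_std_across_trials (the dispersion of Sharpe estimates across trials) are assumptions for illustration; all Sharpe values are per-period.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def deflated_sharpe(sharpe, n_obs, skew, excess_kurtosis,
                    n_trials, sharpe_std_across_trials):
    """Probability that the true Sharpe exceeds the luck benchmark."""
    norm = NormalDist()
    if n_trials > 1:
        # Expected maximum Sharpe among n_trials zero-skill strategies.
        benchmark = sharpe_std_across_trials * (
            (1 - EULER_GAMMA) * norm.inv_cdf(1 - 1 / n_trials)
            + EULER_GAMMA * norm.inv_cdf(1 - 1 / (n_trials * math.e))
        )
    else:
        benchmark = 0.0  # no selection, so nothing to deflate
    # Standard error adjusted for skew and excess kurtosis.
    std_err = math.sqrt(
        (1 - skew * sharpe + (excess_kurtosis + 2) / 4 * sharpe ** 2)
        / (n_obs - 1)
    )
    return norm.cdf((sharpe - benchmark) / std_err)


# Hypothetical case: annualized Sharpe 1.5 on five years of daily data.
daily_sharpe = 1.5 / math.sqrt(252)
one_trial = deflated_sharpe(daily_sharpe, 1260, -0.5, 4.0, 1, 0.03)
many_trials = deflated_sharpe(daily_sharpe, 1260, -0.5, 4.0, 300, 0.03)
passes = many_trials >= 0.95
```

Under these assumed inputs the same backtest clears 0.95 when recorded as a single trial and falls short of it when 300 trials are acknowledged, which is the gap the deflation exists to expose.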

Common Mistakes

The misses that undo good inputs

Computing the deflated Sharpe with a trial count of one

If you searched many variants but record only one trial, the deflation does nothing and the number is as misleading as the raw Sharpe. The trial count is the whole point of the adjustment.

Assuming normal returns and skipping the moment adjustment

Financial returns are skewed and fat-tailed. Using the textbook Sharpe standard error understates uncertainty and inflates the deflated value, defeating the purpose of the test.

Treating 0.95 as a soft target to negotiate down

The threshold exists to keep selection luck out of your capital allocation. Lowering it for a favored strategy reintroduces exactly the bias the test is meant to remove.

Try These Tools

Run the numbers next

Walk-Forward Validator

Upload a returns CSV. Rolling or expanding IS/OOS windows, per-window Sharpe, walk-forward efficiency, and a concatenated OOS equity curve. Catches regime.

Returns Distribution Analyzer

Paste a returns CSV. Histogram, normal-overlay, QQ plot, skewness, excess kurtosis, Jarque-Bera test, tail-weight index. See why Sharpe alone misleads.

FAQ

Questions people ask next

The short answers readers usually want after the first pass.

What is the difference between the probabilistic and the deflated Sharpe ratio?

The probabilistic Sharpe ratio gives the probability that a strategy's true Sharpe exceeds a benchmark, accounting for sample length, skew, and kurtosis, but only for a single strategy evaluated in isolation. The deflated Sharpe ratio extends it by also accounting for the number of trials you ran, setting the benchmark to the expected maximum Sharpe from that many attempts. Deflated is the version to use whenever you selected a strategy from a search.
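
In the notation of the Bailey and López de Prado papers listed under Sources & References, the relationship can be summarized as below. Here T is the sample length, N the trial count, the hatted gamma terms are the skewness and the kurtosis (excess kurtosis plus 3), the unhatted gamma is the Euler-Mascheroni constant, and V is the variance of the Sharpe estimates across trials. Treat this as a compact restatement, not a substitute for the papers.

```latex
\widehat{PSR}(SR^{*}) = \Phi\!\left(
  \frac{\left(\widehat{SR} - SR^{*}\right)\sqrt{T-1}}
       {\sqrt{1 - \hat{\gamma}_3\,\widehat{SR}
              + \frac{\hat{\gamma}_4 - 1}{4}\,\widehat{SR}^{2}}}
\right)

\widehat{DSR} = \widehat{PSR}(SR_0),
\qquad
SR_0 = \sqrt{V\!\left[\widehat{SR}_n\right]}
\left( (1-\gamma)\,\Phi^{-1}\!\left[1 - \frac{1}{N}\right]
     + \gamma\,\Phi^{-1}\!\left[1 - \frac{1}{N e}\right] \right)
```

The only difference between the two is the benchmark: the probabilistic version takes it as given, while the deflated version derives it from the trial count.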

Sources & References

Bailey and López de Prado (2014). The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting and Non-Normality. Journal of Portfolio Management.
Bailey and López de Prado (2012). The Sharpe Ratio Efficient Frontier. Journal of Risk.

