Conviction-Scaled Kelly Bet Sizing

TL;DR

Full Kelly maximizes long-run log-growth when your probability estimate is exact. Your probability estimate is never exact. Quarter-Kelly (scaling the Kelly fraction by 0.25) absorbs typical miscalibration while still retaining roughly 44% of full Kelly's growth rate at about a quarter of the volatility. The conviction-tier pattern extends this: map calibrated probability to a discrete conviction bucket (LOW / MEDIUM / HIGH / SUPREME), then apply a tier-specific Kelly fraction with a per-trade cap. Below: the math, why quarter-Kelly is the right default, and the three-line audit check that catches sizing drift before it blows the account.

The Kelly formula

For a bet with known win probability p, payoff ratio b = avg_win / avg_loss, loss probability q = 1 − p:

f_full = (b·p − q) / b

For p = 0.55, b = 1.5: f_full = (1.5·0.55 − 0.45) / 1.5 = 0.25, so full Kelly says bet 25% of bankroll on this trade.
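The arithmetic above can be wrapped in a small helper. This is a minimal sketch: the function name kelly_fraction and its input checks are illustrative assumptions, not something defined elsewhere in this article.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Full-Kelly fraction f = (b*p - q) / b, floored at zero.

    p: estimated win probability; b: payoff ratio avg_win / avg_loss.
    The name and the input checks are illustrative assumptions.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if b <= 0.0:
        raise ValueError("b must be positive")
    q = 1.0 - p
    # A negative raw value means there is no edge, so the bet is skipped.
    return max(0.0, (b * p - q) / b)


print(round(kelly_fraction(0.55, 1.5), 4))  # the worked example: 0.25
print(round(kelly_fraction(0.52, 1.5), 4))  # the same payoff with a weaker edge
```

Flooring at zero encodes "no edge, no bet"; a caller that wants to see how negative the edge is would drop the max().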

Why that's the wrong bet

f_full = 0.25 assumes p = 0.55 is exact. If your real edge is p = 0.52, the correct full-Kelly fraction is f = (1.5·0.52 − 0.48) / 1.5 = 0.20, so betting 0.25 over-bets by 25%; at a true p of 0.475 the over-bet reaches 2×. Kelly is unforgiving of over-estimating p: over-betting compounds into drawdown faster than under-betting compounds into lost return.

A key result from the Kelly literature: any bet fraction above full Kelly delivers lower long-run growth, with higher variance, than full Kelly itself, and beyond roughly twice full Kelly the growth rate turns negative. There is no regime where over-betting is correct. All of the ambiguity is on the "how much less than full" side.
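The shape of the growth curve can be checked numerically. The sketch below assumes a simple binary bet (win b·f of bankroll with probability p, lose f otherwise) and evaluates expected log-growth per bet at several multiples of full Kelly; log_growth is a hypothetical helper name.

```python
import math


def log_growth(f: float, p: float, b: float) -> float:
    """Expected log-growth per bet when staking fraction f of bankroll."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


p, b = 0.55, 1.5
f_full = (b * p - (1.0 - p)) / b  # 0.25 for this edge

# Growth at several multiples of full Kelly; the peak sits at 1.0x.
for multiple in (0.25, 0.5, 1.0, 1.5, 2.0):
    g = log_growth(multiple * f_full, p, b)
    print(f"{multiple:4.2f}x Kelly  f={multiple * f_full:.4f}  growth/bet={g:+.5f}")
```

For this edge the 1.5x row grows more slowly than the 1.0x row, and the 2.0x row is slightly negative.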

Why quarter-Kelly

Quarter-Kelly (f = 0.25 · f_full) retains roughly 44% of full Kelly's long-run growth rate (the standard continuous approximation gives c·(2 − c) for a Kelly multiple c, so half-Kelly retains about 75%) while cutting volatility to about a quarter and absorbing material miscalibration in p. Fractional Kelly is a common recommendation in practitioner writeups from Thorp onward.

Half-Kelly is also defensible when your p estimate has been validated against hundreds of live dated outcomes. For most retail setups whose p comes from an LLM or a backtest, quarter-Kelly is the right starting point.
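How much growth a given Kelly multiple keeps can be computed directly for the running example. The sketch below again assumes a binary bet and compares the exact ratio with the continuous approximation c·(2 − c); growth_retained is a hypothetical helper name.

```python
import math


def growth_retained(c: float, p: float, b: float) -> float:
    """Share of full-Kelly log-growth kept when betting c times full Kelly."""
    q = 1.0 - p
    f_full = (b * p - q) / b
    if f_full <= 0.0:
        raise ValueError("no positive edge, so there is no growth to retain")

    def g(f: float) -> float:
        # Expected log-growth per bet at stake fraction f.
        return p * math.log(1.0 + b * f) + q * math.log(1.0 - f)

    return g(c * f_full) / g(f_full)


for c in (0.25, 0.5, 1.0):
    exact = growth_retained(c, 0.55, 1.5)
    approx = c * (2.0 - c)  # small-edge (continuous) approximation
    print(f"c={c:.2f}  retained={exact:.3f}  c(2-c)={approx:.3f}")
```

The approximation is derived for small edges, so the exact and approximate columns differ slightly for discrete bets like this one.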

Conviction-tier mapping

Rather than bet a smooth function of p, map p to a conviction bucket, each with its own Kelly multiplier:

def conviction_tier(p: float) -> str:
    if p >= 0.85:
        return "SUPREME"
    if p >= 0.70:
        return "HIGH"
    if p >= 0.55:
        return "MEDIUM"
    return "LOW"

def conviction_fraction(tier: str) -> float:
    # Fraction of Kelly to bet for each tier.
    return {
        "SUPREME": 0.25,  # quarter-Kelly; only for validated calibrated edges
        "HIGH": 0.15,     # eighth-to-sixth Kelly
        "MEDIUM": 0.05,   # tiny bet; this tier is mostly noise
        "LOW": 0.0,       # skip
    }[tier]

def sized_fraction(p: float, b: float, tier_cap: float = 0.04) -> float:
    tier = conviction_tier(p)
    kelly = max(0.0, (b * p - (1 - p)) / b)
    return min(kelly * conviction_fraction(tier), tier_cap)

The tier_cap is a hard per-trade percentage-of-bankroll ceiling (4% is a typical retail-algo default). It catches the edge case where the model returns p = 0.99, Kelly computes a large fraction, and even quarter-Kelly is too large.
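To see where the cap binds, it helps to run the sizing rule over a few probabilities. The sketch below is a self-contained, table-driven restatement of the same thresholds and multipliers; the names TIERS and size_bet are hypothetical, and it assumes p lies in [0, 1].

```python
# (minimum p, tier name, Kelly multiplier), checked from highest to lowest.
TIERS = [
    (0.85, "SUPREME", 0.25),
    (0.70, "HIGH", 0.15),
    (0.55, "MEDIUM", 0.05),
    (0.00, "LOW", 0.0),
]


def size_bet(p: float, b: float, tier_cap: float = 0.04) -> tuple[str, float, bool]:
    """Return (tier, bankroll fraction, whether the cap was binding)."""
    # First row whose floor p clears; assumes 0 <= p <= 1.
    tier, multiplier = next((name, m) for floor, name, m in TIERS if p >= floor)
    kelly = max(0.0, (b * p - (1.0 - p)) / b)
    uncapped = kelly * multiplier
    return tier, min(uncapped, tier_cap), uncapped > tier_cap


for p in (0.50, 0.55, 0.70, 0.85, 0.99):
    tier, fraction, capped = size_bet(p, b=1.5)
    print(f"p={p:.2f}  {tier:<7}  bet={fraction:.4f}  capped={capped}")
```

With b = 1.5 and the default 4% cap, every HIGH and SUPREME bet lands on the cap, while MEDIUM bets stay below it.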

The three-line audit

Every day, compute and log:

import collections

# n, total_risked, max_single, tiers, and tier_cap come from the day's bet log.
print(f"today_bets: n={n}, total_risked_pct={total_risked:.3%}")
print(f"max_single_bet_pct={max_single:.3%}  (cap={tier_cap:.3%})")
print(f"tier_distribution: {collections.Counter(tiers)}")

If total_risked_pct exceeds ~30% of bankroll on any day, you are over-betting the ensemble. If max_single_bet_pct == tier_cap for many days in a row, your sized bets keep hitting the cap, so the cap, not the Kelly formula, is the binding constraint. That's fine but worth knowing.
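The audit quantities have to come from somewhere. The sketch below assumes the day's bets are stored as (tier, fraction of bankroll) pairs; that record layout, the sample values, and the audit function are hypothetical, while the 30% limit is the threshold quoted above.

```python
import collections

# Hypothetical record of one day's bets: (tier, fraction of bankroll risked).
bets = [("MEDIUM", 0.0125), ("HIGH", 0.04), ("HIGH", 0.04), ("SUPREME", 0.04)]
tier_cap = 0.04
daily_limit = 0.30  # the ~30% threshold from the text


def audit(bets, tier_cap, daily_limit):
    """Compute the three audit quantities plus two warning flags."""
    fractions = [fraction for _, fraction in bets]
    tiers = [tier for tier, _ in bets]
    total_risked = sum(fractions)
    max_single = max(fractions, default=0.0)
    return {
        "n": len(bets),
        "total_risked": total_risked,
        "max_single": max_single,
        "tier_distribution": collections.Counter(tiers),
        "over_daily_limit": total_risked > daily_limit,
        "cap_binding": max_single >= tier_cap,
    }


report = audit(bets, tier_cap, daily_limit)
print(f"today_bets: n={report['n']}, total_risked_pct={report['total_risked']:.3%}")
print(f"max_single_bet_pct={report['max_single']:.3%}  (cap={tier_cap:.3%})")
print(f"tier_distribution: {report['tier_distribution']}")
```

Logging the two flags alongside the three lines makes it easy to count how many consecutive days the cap has been binding.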

Drawdown envelope

Even quarter-Kelly with a per-trade cap produces meaningful drawdowns. The Kelly Sizer runs Monte Carlo paths for a given (p, b, fraction, cap) set and shows the 5–95% band of drawdowns. Rule of thumb for quarter-Kelly on a p = 0.55, b = 1.5 edge: expect a 30% drawdown at some point in any 1,000-trade sequence.

If a 30% drawdown would force you to shut the strategy down, your survivable bet size is smaller than quarter-Kelly. Size to the drawdown you can actually survive, not the one the formula tolerates.
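A drawdown band of the kind described above can be approximated with a few lines of simulation. This is a rough stand-in, not the Kelly Sizer's implementation: it assumes independent binary trades with a fixed p and b, a constant bet fraction, and hypothetical function names and defaults.

```python
import random


def max_drawdown(p: float, b: float, bet: float, n_trades: int, rng: random.Random) -> float:
    """Largest peak-to-trough loss (as a fraction of the peak) on one path."""
    wealth = peak = 1.0
    worst = 0.0
    for _ in range(n_trades):
        # Win: gain b * bet of bankroll. Loss: lose bet of bankroll.
        wealth *= (1.0 + b * bet) if rng.random() < p else (1.0 - bet)
        peak = max(peak, wealth)
        worst = max(worst, 1.0 - wealth / peak)
    return worst


def drawdown_band(p, b, kelly_multiple, cap, n_trades=1000, n_paths=500, seed=0):
    """Return the 5th, 50th, and 95th percentile of max drawdown across paths."""
    rng = random.Random(seed)
    full = max(0.0, (b * p - (1.0 - p)) / b)
    bet = min(full * kelly_multiple, cap)
    draws = sorted(max_drawdown(p, b, bet, n_trades, rng) for _ in range(n_paths))

    def pick(q: float) -> float:
        # Nearest-rank percentile on the sorted sample.
        return draws[min(n_paths - 1, int(q * n_paths))]

    return pick(0.05), pick(0.50), pick(0.95)


# Uncapped quarter-Kelly (cap=1.0), then with the 4% per-trade cap.
print(drawdown_band(0.55, 1.5, kelly_multiple=0.25, cap=1.0))
print(drawdown_band(0.55, 1.5, kelly_multiple=0.25, cap=0.04))
```

Rerunning with a smaller kelly_multiple or cap until the 95th-percentile drawdown is one you could live through is one way to size to the survivable drawdown.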

When not to use Kelly

Strategies without repeatability. Kelly assumes the same bet is available many times. A once-in-a-decade event does not qualify.
Strategies with hidden fat tails. If a "loss" can exceed the Kelly-assumed maximum, the formula breaks. Options selling and illiquid products fall here.
Strategies whose outcomes are correlated across trades. Kelly assumes independence. Correlated trades tend to lose together, so realized drawdowns run deeper than per-trade sizing implies.
Strategies with trade-level constraints (position limits, venue limits, capital requirements). The Kelly-optimal bet may not be executable.
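The correlation point in the list above can be made concrete with a standard equal-correlation approximation, which is not taken from this article: n equally sized bets with common pairwise correlation rho diversify about as well as n / (1 + (n − 1)·rho) independent bets. The helper name effective_bets is hypothetical.

```python
def effective_bets(n: int, rho: float) -> float:
    """Effective number of independent bets among n equally sized bets
    sharing a common pairwise correlation rho (0 <= rho <= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be in [0, 1]")
    return n / (1.0 + (n - 1) * rho)


for rho in (0.0, 0.3, 0.6, 1.0):
    print(f"rho={rho:.1f}  10 bets behave like {effective_bets(10, rho):.2f} independent bets")
```

When the effective count is far below the nominal count, per-trade Kelly sizing understates the risk of the book as a whole.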

Connects to

Backtest Overfitting Score — p from a backtest with high PBO is over-stated; size down.
Calibration Dojo — train your probabilistic intuition on binary questions so your p estimates don't start mis-calibrated.
Price-Blind LLM Research Harness — p from a price-informed LLM is mostly confabulation; fix the input before sizing the output.

References

Kelly, J. L. (1956). "A New Interpretation of Information Rate." Bell System Technical Journal 35(4).
Thorp, E. O. (2006). "The Kelly Criterion in Blackjack, Sports Betting, and the Stock Market." Handbook of Asset and Liability Management.
MacLean, L. C., Thorp, E. O., & Ziemba, W. T. (2011). The Kelly Capital Growth Investment Criterion: Theory and Practice.
Poundstone, W. (2005). Fortune's Formula.