Methodology: Returns Distribution Analyzer

Worked example

Running the shipped returns-distribution-analyzer engine on the input below produces exactly this output. Continuous integration recomputes it against the engine bundle on every build, so these numbers cannot drift from the code.

Input

{
  "tool": "returns-distribution-analyzer",
  "bins": 30,
  "returns": [
    0.01,
    0.02,
    -0.005,
    0.015,
    -0.01
  ]
}

Output

{
  "n": 5,
  "mean": 0.005999999999999998,
  "stdev": 0.012942179105544785,
  "median": 0.01,
  "skewness": -0.17436912281206235,
  "excessKurtosis": -2.106010247271107,
  "jbStat": 0.949353651160813,
  "jbPValue": 0.6220860662860637,
  "tailExcessRatio": 0,
  "negTailMass": 0,
  "posTailMass": 0,
  "histogram": [
    {
      "binLow": -0.01,
      "binHigh": -0.009000000000000001,
      "count": 1
    },
    {
      "binLow": -0.009000000000000001,
      "binHigh": -0.008,
      "count": 0
    },
    {
      "binLow": -0.008,
      "binHigh": -0.007,
      "count": 0
    },
    {
      "binLow": -0.007,
      "binHigh": -0.006,
      "count": 0
    },
    {
      "binLow": -0.006,
      "binHigh": -0.005,
      "count": 0
    },
    {
      "binLow": -0.005,
      "binHigh": -0.004,
      "count": 1
    },
    {
      "binLow": -0.004,
      "binHigh": -0.003,
      "count": 0
    },
    {
      "binLow": -0.003,
      "binHigh": -0.002,
      "count": 0
    },
    {
      "binLow": -0.002,
      "binHigh": -0.0009999999999999992,
      "count": 0
    },
    {
      "binLow": -0.0009999999999999992,
      "binHigh": 0,
      "count": 0
    },
    {
      "binLow": 0,
      "binHigh": 0.0009999999999999992,
      "count": 0
    },
    {
      "binLow": 0.0009999999999999992,
      "binHigh": 0.002,
      "count": 0
    },
    {
      "binLow": 0.002,
      "binHigh": 0.003000000000000001,
      "count": 0
    },
    {
      "binLow": 0.003000000000000001,
      "binHigh": 0.004,
      "count": 0
    },
    {
      "binLow": 0.004,
      "binHigh": 0.004999999999999999,
      "count": 0
    },
    {
      "binLow": 0.004999999999999999,
      "binHigh": 0.006,
      "count": 0
    },
    {
      "binLow": 0.006,
      "binHigh": 0.007000000000000001,
      "count": 0
    },
    {
      "binLow": 0.007000000000000001,
      "binHigh": 0.008000000000000002,
      "count": 0
    },
    {
      "binLow": 0.008000000000000002,
      "binHigh": 0.009,
      "count": 0
    },
    {
      "binLow": 0.009,
      "binHigh": 0.01,
      "count": 0
    },
    {
      "binLow": 0.01,
      "binHigh": 0.011000000000000001,
      "count": 1
    },
    {
      "binLow": 0.011000000000000001,
      "binHigh": 0.011999999999999999,
      "count": 0
    },
    {
      "binLow": 0.011999999999999999,
      "binHigh": 0.013,
      "count": 0
    },
    {
      "binLow": 0.013,
      "binHigh": 0.014,
      "count": 0
    },
    {
      "binLow": 0.014,
      "binHigh": 0.015000000000000001,
      "count": 0
    },
    {
      "binLow": 0.015000000000000001,
      "binHigh": 0.016,
      "count": 1
    },
    {
      "binLow": 0.016,
      "binHigh": 0.017,
      "count": 0
    },
    {
      "binLow": 0.017,
      "binHigh": 0.018000000000000002,
      "count": 0
    },
    {
      "binLow": 0.018000000000000002,
      "binHigh": 0.019000000000000003,
      "count": 0
    },
    {
      "binLow": 0.019000000000000003,
      "binHigh": 0.019999999999999997,
      "count": 1
    }
  ],
  "qqPairs": [
    {
      "theoretical": -1.2815515641401563,
      "observed": -1.2362678548580093
    },
    {
      "theoretical": -0.5244005132792953,
      "observed": -0.8499341502148813
    },
    {
      "theoretical": 0,
      "observed": 0.3090669637145023
    },
    {
      "theoretical": 0.5244005132792952,
      "observed": 0.6954006683576301
    },
    {
      "theoretical": 1.2815515641401563,
      "observed": 1.081734373000758
    }
  ]
}

Frequently asked questions

What does the Returns Distribution Analyzer methodology page document?

It documents skewness and excess kurtosis, the Jarque-Bera test, QQ plot construction, and tail-mass diagnostics for the Returns Distribution Analyzer, a tool in the Finance category. It states the formulas, assumptions, data sources, limitations, and reproducibility steps behind the tool.

When was the Returns Distribution Analyzer methodology last reviewed?

This methodology was last reviewed on 2026-04-20. The matching tool is at https://aifinhub.io/returns-distribution-analyzer/.

Are the Returns Distribution Analyzer numbers reproducible?

Yes. This page embeds a worked example whose output is the verbatim result of running the shipped returns-distribution-analyzer engine on a fixed input; the embedded JSON is recomputed and diffed against the engine in CI, so the numbers cannot drift from the code.

Scope

Quantifies how far a univariate return series deviates from a normal distribution. Normality is the assumption behind Sharpe, parametric VaR, and most mean-variance reasoning. Outputs both visual diagnostics (histogram, normal QQ plot) and numeric tests (moments, Jarque-Bera, tail mass at ±3σ).

Input format

date,returns
2024-01-02,0.0012
2024-01-03,-0.0005
...

Any CSV with a header row is accepted. The first non-date numeric column is analyzed; a date, timestamp, or time column is optional and ignored for computation. Both simple and log returns are accepted: skewness, excess kurtosis, and therefore the Jarque-Bera statistic are invariant to shifting and rescaling the series, while the mean and standard deviation are reported in the units of the input.
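The column-selection rule can be sketched as follows. This is an illustration only, not the tool's parser: the function name `first_numeric_column`, the set of recognised date headers, and the rule that a column counts as numeric only when every value parses as a float are all assumptions.

```python
import csv
import io

# Assumption: header names treated as date-like and skipped.
DATE_HEADERS = {"date", "timestamp", "time"}

def first_numeric_column(csv_text):
    """Return (header, values) for the first non-date column that is fully numeric."""
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    header, body = rows[0], rows[1:]
    for col, name in enumerate(header):
        if name.strip().lower() in DATE_HEADERS:
            continue
        try:
            return name, [float(row[col]) for row in body]
        except (ValueError, IndexError):
            continue  # column is not fully numeric; try the next one
    raise ValueError("no numeric column found")

sample = """date,returns
2024-01-02,0.0012
2024-01-03,-0.0005
"""
name, values = first_numeric_column(sample)
print(name, values)  # returns [0.0012, -0.0005]
```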

Algorithms

Sample moments

μ = (1/n) · Σ x_i
σ = √[(1 / (n − 1)) · Σ (x_i − μ)²]
skew = (1/n) · Σ ((x_i − μ) / σ)³
excess_kurt = (1/n) · Σ ((x_i − μ) / σ)⁴ − 3

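Skewness and excess kurtosis average the standardized powers over n (not n−1), in line with the moment definitions in the Jarque-Bera derivation below. Note that σ itself is the n−1 sample standard deviation, so for small n the values differ slightly from moments standardized by the divide-by-n standard deviation. Excess kurtosis subtracts 3 so that a normal distribution reads 0.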
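A minimal Python sketch of these four formulas, applied to the five returns from the worked example, is shown here. It is not the shipped engine; the function name `sample_moments` is hypothetical.

```python
import math

def sample_moments(xs):
    """Return (mean, stdev, skewness, excess kurtosis) using the formulas above."""
    n = len(xs)
    mu = sum(xs) / n
    # Sample standard deviation: divide by n - 1.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / (n - 1))
    # Standardized deviations; their third and fourth powers are averaged over n.
    z = [(x - mu) / sigma for x in xs]
    skew = sum(v ** 3 for v in z) / n
    excess_kurt = sum(v ** 4 for v in z) / n - 3
    return mu, sigma, skew, excess_kurt

returns = [0.01, 0.02, -0.005, 0.015, -0.01]  # input from the worked example
mu, sigma, skew, excess_kurt = sample_moments(returns)
print(mu, sigma, skew, excess_kurt)
```

Up to floating-point rounding, the four values agree with the mean, stdev, skewness, and excessKurtosis fields of the worked example output.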

Jarque-Bera normality test

JB = (n / 6) · (skew² + excess_kurt² / 4)

Under H₀ (returns are normally distributed), JB is asymptotically χ²-distributed with 2 degrees of freedom. The p-value uses the closed-form χ²(2) survival function, P(X > x) = exp(−x / 2). The test converges slowly to this asymptotic distribution; for small samples (n < 200), interpret borderline p-values with caution.
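The statistic and its p-value can be sketched in a few lines. The function name `jarque_bera` is hypothetical; the inputs are the n, skewness, and excessKurtosis values from the worked example.

```python
import math

def jarque_bera(n, skew, excess_kurt):
    """Return (JB statistic, p-value) from precomputed moments."""
    jb = (n / 6) * (skew ** 2 + excess_kurt ** 2 / 4)
    # Chi-squared with 2 degrees of freedom: P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return jb, p_value

jb, p = jarque_bera(5, -0.17436912281206235, -2.106010247271107)
print(jb, p)
```

Up to floating-point rounding, this reproduces the jbStat and jbPValue fields of the worked example.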

Normal QQ plot

For each sorted observation at rank i (0-indexed), the empirical quantile position is p = (i + 0.5) / n. The theoretical quantile is Φ⁻¹(p), computed via the Beasley-Springer-Moro rational approximation. The plotted point is (Φ⁻¹(p), (x_(i) − μ) / σ). Points on the 45° line indicate normality; S-shaped deviation at the extremes is the classic fat-tail signature.
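The construction can be sketched as follows. One deliberate substitution: the standard library's `statistics.NormalDist().inv_cdf` stands in for the Beasley-Springer-Moro approximation, so theoretical quantiles may differ from the tool's in the last few digits. The function name `qq_pairs` is hypothetical.

```python
import math
from statistics import NormalDist

def qq_pairs(xs):
    """Return (theoretical, observed) pairs for a normal QQ plot."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / (n - 1))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    pairs = []
    for i, x in enumerate(sorted(xs)):
        p = (i + 0.5) / n  # plotting position for 0-indexed rank i
        # Stand-in for the Beasley-Springer-Moro inverse normal CDF.
        pairs.append((std_normal.inv_cdf(p), (x - mu) / sigma))
    return pairs

returns = [0.01, 0.02, -0.005, 0.015, -0.01]  # input from the worked example
pairs = qq_pairs(returns)
for theoretical, observed in pairs:
    print(round(theoretical, 6), round(observed, 6))
```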

Tail mass

Tail mass is the fraction of observations more than 3σ from the sample mean, reported separately for the negative and positive tails. A normal distribution has ≈0.27% of its mass beyond ±3σ. The tail-excess ratio is the combined observed tail mass divided by 0.27%; a ratio above 2 flags the regime where Sharpe materially understates realised risk.
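A rough sketch of the tail-mass calculation follows. The function name `tail_mass`, the strict inequalities at the ±3σ boundary, and the rounded reference constant 0.0027 are assumptions; the tool may use a more precise constant or different boundary handling.

```python
import math

NORMAL_TAIL_MASS = 0.0027  # assumption: rounded two-sided normal mass beyond ±3σ

def tail_mass(xs, k=3.0):
    """Return (negative tail mass, positive tail mass, tail-excess ratio)."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / (n - 1))
    # Assumption: observations exactly at the boundary are not counted.
    neg = sum(1 for x in xs if x < mu - k * sigma) / n
    pos = sum(1 for x in xs if x > mu + k * sigma) / n
    return neg, pos, (neg + pos) / NORMAL_TAIL_MASS

returns = [0.01, 0.02, -0.005, 0.015, -0.01]  # input from the worked example
neg, pos, ratio = tail_mass(returns)
print(neg, pos, ratio)
```

None of the five returns in the worked example lies beyond ±3σ, so all three values are zero, matching negTailMass, posTailMass, and tailExcessRatio in the output above.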

Assumptions + limitations

Stationarity. All tests assume the returns are drawn from a single fixed distribution. Regime shifts (e.g. pre- and post-Fed pivots) bias the moments. Split by regime for cleaner diagnostics.
Independence. Serial correlation reduces the effective sample size below the nominal n and weakens JB's asymptotic validity. For strongly autocorrelated series, deflate n heuristically or use a block-bootstrap p-value (not implemented here).
Sample size. A minimum of 30 observations is enforced. JB is asymptotic; tail-mass estimates are especially noisy below 200 observations.
Normal reference. The tool measures deviation from normality. Fat tails are a fact of financial returns — failing the test is usually correct, not surprising.
Single series. Multivariate non-normality (copula asymmetry, tail dependence) is not addressed. See the Correlation Matrix Visualizer for cross-asset structure.

Privacy

All parsing, moment computation, QQ-plot construction, and tail-mass measurement run in the browser. Nothing is uploaded. No cookies or tracking scripts.

References

Jarque, C. M., & Bera, A. K. (1987). "A Test for Normality of Observations and Regression Residuals." International Statistical Review 55(2).
Beasley, J. D., & Springer, S. G. (1977). "Algorithm AS 111: The Percentage Points of the Normal Distribution." Applied Statistics 26(1).
Cont, R. (2001). "Empirical Properties of Asset Returns: Stylized Facts and Statistical Issues." Quantitative Finance 1(2).
Taleb, N. N. (2020). Statistical Consequences of Fat Tails. STEM Academic Press.

Connects to

Risk-Adjusted Returns Calculator — a high Sharpe on a fat-tailed series is a warning, not a win.
Backtest Overfitting Score — Deflated Sharpe assumes moments; verify them first.
Correlation Matrix Visualizer — cross-series dependence after univariate diagnostics.

Changelog

2026-04-20 — Initial release with histogram, QQ plot, moments, Jarque-Bera, and 3σ tail-mass diagnostics.