aifinhub

SEC Filing Chunk Optimizer

Chunk-size planner and embedding-cost calculator for SEC filings such as the 10-K. Pick a filing archetype, chunk size, overlap, chunking strategy, and embedding model. Runs entirely in the browser. Free.

Inputs: Configuration
Runtime: Instant
Privacy: Client-side · no upload
API key: Not required

1 · Configure chunk strategy

Chunk size: 1,024 tok
Overlap: 15%
Queries: 100

Total chunks: 138
Avg tokens/chunk: 1,021 (min 1,024 · max 1,024)
Ingest cost (once): $0.002818 (text-embedding-3-small)
Query cost (100 re-embeds): $0.000080
Tokens embedded: 140,898

2 · Results

Strategy note · Respects Items / section headers / speaker turns. Preserves table blocks by keeping heading+table together. Chunk sizes are uneven but semantically clean.
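To make the structural idea concrete, here is a minimal Python sketch of splitting at Item headings while keeping each heading attached to the text (and any table) that follows it. It is not the tool's implementation: the heading pattern `ITEM_HEADING` and the function `split_on_items` are hypothetical names, and real filings would also need section-header and speaker-turn handling.

```python
import re

# Hypothetical heading pattern (an assumption, not taken from the tool):
# lines that start with "Item 1.", "Item 1A.", "Item 7.", and so on.
ITEM_HEADING = re.compile(r"^Item\s+\d+[A-C]?\.", re.IGNORECASE | re.MULTILINE)


def split_on_items(text: str) -> list[str]:
    """Split filing text at Item headings, keeping each heading with its body."""
    starts = [m.start() for m in ITEM_HEADING.finditer(text)]
    if not starts:
        return [text.strip()] if text.strip() else []
    # Text before the first heading (cover page, etc.) becomes its own chunk.
    bounds = ([0] if starts[0] > 0 else []) + starts + [len(text)]
    chunks = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [c for c in chunks if c]


sample = (
    "Cover page\n"
    "Item 1. Business\nWe sell things.\n"
    "Item 1A. Risk Factors\nRisks.\n"
    "Item 7. MD&A\n| a | b |\n| 1 | 2 |\n"
)
chunks = split_on_items(sample)
for chunk in chunks:
    print(repr(chunk))
```

Because the split points are the headings themselves, a table that follows "Item 7." stays in the same chunk as its heading, which is the property the strategy note describes. The resulting chunks are uneven in length.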

No structural warnings at these settings. Still run a retrieval eval before production — heuristics can't replace ground truth.

Archetype reference: Form 10-K (business, risk factors, MD&A, financials). About 12 Items, with dense tables in Items 7 and 8.

3 · Compare strategies (same archetype + chunk size)

Strategy               Chunks  Avg tok  Min / Max      Ingest cost  Tradeoff
structural (selected)  138     1,021    1,024 / 1,024  $0.002818    Highest fidelity; uneven chunk sizes.
recursive              138     1,021    614 / 1,024    $0.002818    Cheap + deterministic; blind to tables.
semantic               132     1,061    409 / 1,433    $0.002801    Coherent prose groups; variable sizes, higher compute.

How the estimate works

stride        = chunk_size × (1 − overlap_pct)
base_count    = ceil(total_tokens / stride)
structural    → max(boundary_count, base_count)
recursive     → base_count
semantic      → ceil(base_count × 0.95), wider size variance
ingest_cost   = tokens_embedded × $/M_tokens
query_cost    = (40 × n_queries) × $/M_tokens

Pricing verified 2026-04-23. See methodology for archetype sources and the table-splitting pitfall.
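
The formulas above can be sketched in Python as a sanity check on the displayed figures. Several inputs are assumptions rather than values published on this page: `total_tokens=120_000` and `boundary_count=12` are illustrative, the price of $0.02 per million tokens is the rate implied by the displayed ingest cost, and the constant 40 is read here as an assumed token count per query. The function name `estimate` is hypothetical.

```python
import math


def estimate(total_tokens, chunk_size, overlap_pct, boundary_count,
             tokens_embedded, n_queries, usd_per_million_tokens):
    """Apply the page's estimate formulas; returns (counts, ingest, query)."""
    stride = chunk_size * (1 - overlap_pct)
    base_count = math.ceil(total_tokens / stride)
    counts = {
        "structural": max(boundary_count, base_count),
        "recursive": base_count,
        "semantic": math.ceil(base_count * 0.95),
    }
    # Prices are quoted per million tokens, so convert to a per-token rate.
    usd_per_token = usd_per_million_tokens / 1_000_000
    ingest_cost = tokens_embedded * usd_per_token
    query_cost = (40 * n_queries) * usd_per_token
    return counts, ingest_cost, query_cost


counts, ingest, query = estimate(
    total_tokens=120_000,         # assumed; not shown on the page
    chunk_size=1024,
    overlap_pct=0.15,
    boundary_count=12,            # assumed from "~12 Items"
    tokens_embedded=140_898,      # taken from the results above
    n_queries=100,
    usd_per_million_tokens=0.02,  # assumed rate, consistent with $0.002818
)
print(counts)            # {'structural': 138, 'recursive': 138, 'semantic': 132}
print(f"${ingest:.6f}")  # $0.002818
print(f"${query:.6f}")   # $0.000080
```

With these inputs the sketch reproduces the chunk counts in the comparison table and both cost figures. `tokens_embedded` is passed in directly because the page reports it but does not give a formula for it.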

Planning estimates only — not financial, tax, or investment advice.