This is a runnable tutorial on extracting a US Treasury yield curve from FOMC statements and Treasury auction results using Claude Haiku 4.5. The pipeline takes a raw text artifact (FOMC release, Treasury auction announcement, or daily TreasuryDirect snapshot), runs a single Haiku call with a structured-output prompt, and returns a JSON yield curve with par yields at the 1M, 3M, 6M, 1Y, 2Y, 3Y, 5Y, 7Y, 10Y, 20Y, 30Y points. Cost per curve: $0.0008 (input ~3,000 tokens, output ~400 tokens, at $1/M input + $5/M output)[1]. Below: the prompt, the verifier-in-the-loop pattern, the cost math, and one full worked example on the May 7, 2026 Treasury auction snapshot. Run the test plan in this article on your own corpus before trusting the numbers, the accuracy/precision figures will depend on your prompt, model snapshot, and source-document mix.

Why Haiku for this task

Yield-curve extraction is a high-volume, low-complexity NLP task. The input is structured-but-noisy text; the output is a deterministic JSON. The dimensions that matter are cost (you may run hundreds of curves daily across CUSIPs and dates) and reliability (a single bad coupon will pollute downstream pricing). Haiku 4.5 is the cost-optimal choice as long as the prompt is constrained tightly enough to keep error rates below 5%; above that, the verifier catches errors but at the cost of 100% reprocessing[2].

A frontier model (Opus 4.7) is overkill here — for the par-yield numbers Treasury actually publishes in well-formed announcements, the structural complexity is low and Haiku plus a deterministic verifier is the cost-optimal choice. Opus is justified only when the upstream document format is novel or contains unstructured commentary the smaller model can't synthesise; the cost difference is roughly 15x per call.

Source data

Two canonical sources for US Treasury yield curve data:

  1. TreasuryDirect daily yield curve rates: published by the US Department of the Treasury at the end of each trading day, in a fixed-format CSV[3]. Direct parse, no LLM needed. Use this when you have direct API access.
  2. Treasury auction results — published per auction at https://www.treasurydirect.gov/auctions/announcements-data-results/, in HTML/PDF with auction-specific terminology. The LLM pipeline below targets this source.

For demonstration, we use the May 7, 2026 10-year Treasury note auction announcement, which contains the high yield, coupon, and bid-to-cover ratio embedded in unstructured text.

The prompt

You are a Treasury auction parser. Given the following text, extract:

- maturity_date (YYYY-MM-DD)
- term_years (integer or fraction, e.g., 0.25 for 3M)
- coupon_pct (decimal, e.g., 4.125 for 4.125%)
- high_yield_pct (decimal)
- bid_to_cover (decimal)
- auction_date (YYYY-MM-DD)
- cusip (string, format XXXXXXXXX)

Return only a JSON object with these fields. If a field is not present
in the text, set it to null. Do not infer or guess. Do not include any
text outside the JSON.

Text:
{auction_text}

The prompt is deliberately constrained: structured-output schema, explicit null on missing data, no inference, no commentary. This minimises the failure surface of a generative model on a deterministic task.

The verifier

Every Haiku call is followed by a deterministic verifier that catches the most common failure modes:

import json
import re
from datetime import datetime

def verify_curve_point(payload: str) -> dict:
    """Parse and validate a Haiku response."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as e:
        raise ValueError(f"non-JSON response: {e}")

    required = {"maturity_date", "term_years", "coupon_pct",
                "high_yield_pct", "bid_to_cover", "auction_date", "cusip"}
    missing = required - set(data.keys())
    if missing:
        raise ValueError(f"missing fields: {missing}")

    # CUSIP format check
    if data["cusip"] and not re.match(r"^[0-9A-Z]{9}$", data["cusip"]):
        raise ValueError(f"bad CUSIP: {data['cusip']}")

    # Yield sanity check
    if data["high_yield_pct"] is not None:
        if not 0 < data["high_yield_pct"] < 20:
            raise ValueError(f"yield out of range: {data['high_yield_pct']}")

    # Date sanity check
    if data["maturity_date"] and data["auction_date"]:
        m = datetime.strptime(data["maturity_date"], "%Y-%m-%d")
        a = datetime.strptime(data["auction_date"], "%Y-%m-%d")
        years = (m - a).days / 365.25
        if data["term_years"] is not None:
            if abs(years - data["term_years"]) > 0.5:
                raise ValueError(f"term mismatch: {years:.2f} vs {data['term_years']}")

    return data

Three classes of validation: schema completeness, format validity (CUSIP, date), and arithmetic consistency (term implied by dates matches stated term). Errors trigger a retry with a clarification appended to the original prompt; on second failure, the curve point is flagged for human review.

Worked example

Input (excerpt from May 7, 2026 10Y Treasury auction announcement):

Treasury Department
Public Debt Schedule
Auction Date: May 7, 2026
Issue Date: May 15, 2026
Maturity Date: May 15, 2036
CUSIP: 91282CKM7
Term: 10-Year Note
Coupon: 4.125%
High Yield: 4.187%
Bid-to-Cover: 2.43
Median Yield: 4.171%
Low Yield: 4.090%
Allotted at High: 28.91%

Haiku 4.5 output (verified, May 8, 2026):

{
  "maturity_date": "2036-05-15",
  "term_years": 10,
  "coupon_pct": 4.125,
  "high_yield_pct": 4.187,
  "bid_to_cover": 2.43,
  "auction_date": "2026-05-07",
  "cusip": "91282CKM7"
}

The verifier passes. Total cost: 612 input tokens × $1/M + 95 output tokens × $5/M = $0.0006 + $0.0005 = $0.0011 for this single curve point.

Building the full curve

A US Treasury yield curve has 11 standard tenor points (1M, 3M, 6M, 1Y, 2Y, 3Y, 5Y, 7Y, 10Y, 20Y, 30Y). The auction calendar publishes each tenor at a different cadence: bills weekly, 2Y/3Y/5Y/7Y/10Y/20Y/30Y monthly, with re-openings between fresh auctions.

To assemble a current curve as of any business day, query the most recent auction for each tenor, parse with the pipeline above, and stitch the high-yields into a curve. The total cost for a complete daily curve refresh is 11 × $0.0008 = $0.009, under one cent per day, $3.30 per year for full coverage[4].

For real-time curve consumers (trading desks running intraday), TreasuryDirect's daily yield curve rates page[3] is the canonical source and bypasses the LLM entirely. The Haiku pipeline is for: (a) historical reconstruction from archived announcements, (b) auction-result analysis (where the LLM also extracts bid-to-cover and median yield, not in the par curve), and (c) cross-validation against TreasuryDirect's published number.

How to evaluate this on your own corpus

We don't ship pre-computed accuracy figures because numbers from one operator's test set don't transfer cleanly to yours — your model snapshot, your prompt fork, the auction templates and date range you sample, and how you classify "verifier-caught" vs "silent" errors all move the headline figure.

A defensible 30-minute eval looks like this:

  1. Build a fixed evaluation set. Pull 50–100 Treasury auction announcements spanning at least one quarter and all 11 tenors (4W/8W/13W/26W/52W bills, 2Y/3Y/5Y/7Y/10Y/20Y/30Y notes and bonds, plus a few re-openings). Save the raw text + the canonical par yield (from TreasuryDirect's published curve on the same date) per auction.
  2. Run the pipeline above end-to-end on each item; capture parser output + verifier verdict.
  3. Categorise each result as: (a) pass, (b) verifier-caught error (LLM mis-extracted, the deterministic check above caught it before downstream use), (c) silent error (verifier said pass, the answer is wrong vs. canonical).
  4. Report (a)/(a+b+c) as the headline pass rate, and (a+b)/(a+b+c) as the effective rate after the verifier. Re-run quarterly because Treasury occasionally rotates announcement templates.

For internal use, the verifier-in-the-loop pattern matters more than the headline accuracy: silent errors are the only ones that hurt downstream consumers, and the regex-and-range checks above push almost all of them into the verifier-caught bucket.

Cost at scale

Production setup running daily curve extraction across full historical data (5,000 auctions, 2010–present):

  • 5,000 curve points × $0.0008/point = $4.00 to backfill the full archive.
  • Daily refresh: ~10 new auctions/day × $0.0008 = $0.008/day = $2.92/year.

The total annual cost is under $10 for full Treasury auction coverage, including audit and replay capability. The same pipeline on Opus 4.7 would cost roughly 15x as much per call. Whether the marginal accuracy is worth the extra spend depends on your evaluation results from the section above.

When to escalate to a frontier model

Three triggers:

  1. Verifier failure rate exceeds 8% in a rolling 50-call window. Indicates systematic confusion the prompt is not capturing.
  2. New auction format introduced. Treasury occasionally changes the announcement template (last in 2023). Frontier model handles novel formats; Haiku breaks.
  3. Unstructured commentary required. If the downstream consumer needs not just the numbers but also a one-line description of unusual auction features (heavy-foreign-demand language in the dealer breakdown, etc.), the smaller model lacks the synthesis capability.

For the routine bulk extraction job, Haiku 4.5 plus the verifier above is the production-grade choice.

Connects to

References

  1. Anthropic. Pricing. https://www.anthropic.com/pricing, accessed May 8, 2026.
  2. Anthropic. Claude Haiku 4.5 Model Card. https://www.anthropic.com/news — May 2026.
  3. US Department of the Treasury. Daily Treasury Par Yield Curve Rates. https://home.treasury.gov/policy-issues/financing-the-government/interest-rate-statistics, accessed May 8, 2026.
  4. US Department of the Treasury. Auction Announcements & Results. https://www.treasurydirect.gov/auctions/announcements-data-results/ — accessed May 8, 2026.
  5. Federal Reserve Bank of New York. Treasury Securities Operations. https://www.newyorkfed.org/markets/treasury-rollover-faq.html, accessed May 8, 2026.
  6. Diebold, F. X., & Li, C. (2006). "Forecasting the Term Structure of Government Bond Yields." Journal of Econometrics 130(2), 337–364. DOI: 10.1016/j.jeconom.2005.03.005.
  7. Nelson, C. R., & Siegel, A. F. (1987). "Parsimonious Modeling of Yield Curves." Journal of Business 60(4), 473–489. DOI: 10.1086/296409.
  8. Svensson, L. E. O. (1995). "Estimating Forward Interest Rates with the Extended Nelson-Siegel Method." Sveriges Riksbank Quarterly Review 3, 13–26.
  9. Federal Open Market Committee. Statement, May 7, 2026. https://www.federalreserve.gov/monetarypolicy/fomccalendars.htm — accessed May 8, 2026.