aifinhub

Agent Skill Tester for Markets

Test Anthropic Agent Skills for market extraction. Paste SKILL.md + sample 10-K excerpt + your key. See output, token cost, latency. Browser-only. Free.

Inputs: Prompt / input + API key
Runtime: 2–15 s per model call
Privacy: Client-side · no upload
API key: BYO key (Anthropic · OpenAI · Google)

Education · Not investment advice. BaFin/EU framework. Past performance does not indicate future results.

Trust model · two boundaries

  • The Anthropic call goes directly from your browser to api.anthropic.com. The API key never reaches aifinhub.
  • Scoring imports the /engines/agent-skill-tester.js module and scores the prompt + response in your browser. No key, no model id, no network call. Deterministic checks, no LLM.
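
The first boundary can be sketched as a plain request builder. This is an illustration, not the tool's own code: the function name `buildAnthropicRequest`, the parameter names, and the `max_tokens` default are assumptions, and the model id is left for the caller to supply. The endpoint and header names follow Anthropic's public Messages API, which expects the `anthropic-dangerous-direct-browser-access` header on calls made directly from a browser.

```javascript
// Sketch: build the request a browser could send straight to Anthropic.
// Nothing here touches a third-party server; the key only appears in a header.
function buildAnthropicRequest({ apiKey, model, skillMd, inputDoc, maxTokens = 1024 }) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        // Opt-in header for requests made directly from a browser.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({
        model,                 // caller-supplied model id (assumption: any valid id)
        max_tokens: maxTokens, // placeholder default, not taken from the tool
        system: skillMd,       // assumption: the skill text is sent as the system prompt
        messages: [{ role: "user", content: inputDoc }],
      }),
    },
  };
}

// Usage (performs a real, billed API call):
// const { url, init } = buildAnthropicRequest({ apiKey, model, skillMd, inputDoc });
// const data = await fetch(url, init).then((r) => r.json());
```

Keeping the builder separate from `fetch` makes it easy to inspect exactly what leaves the browser before any call is made.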

1 · Config

2 · Skill definition (SKILL.md)

3 · Sample input document

4 · Pass/fail rubric (one criterion per line)

Recognised: valid_json, has_field:<path>, field_type:<path>:<type>, contains:<text>, regex:<pattern>, no_apology, numbers_grounded_in_prompt, and more.
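
To make the rubric format concrete, here is a minimal sketch of how a few of the listed criteria could be checked deterministically. It is not the tester's engine: the helper names (`getPath`, `checkCriterion`, `scoreRubric`), the dotted-path convention, and the apology pattern are all assumptions, and `numbers_grounded_in_prompt` is left out.

```javascript
// Resolve a dotted path such as "filing.year" against a parsed object.
function getPath(obj, path) {
  return path.split(".").reduce(
    (cur, key) => (cur !== null && typeof cur === "object" ? cur[key] : undefined),
    obj
  );
}

// typeof with "array" and "null" split out, so rubric types read naturally.
function typeOf(value) {
  if (Array.isArray(value)) return "array";
  if (value === null) return "null";
  return typeof value;
}

// Evaluate one rubric line against the raw model response text.
function checkCriterion(line, responseText) {
  let parsed;
  try { parsed = JSON.parse(responseText); } catch { parsed = undefined; }
  const [name, ...rest] = line.trim().split(":");
  const arg = rest.join(":"); // keep colons inside regexes and text intact
  switch (name) {
    case "valid_json":
      return parsed !== undefined;
    case "has_field":
      return parsed !== undefined && getPath(parsed, arg) !== undefined;
    case "field_type": {
      const cut = arg.lastIndexOf(":");
      if (cut < 0 || parsed === undefined) return false;
      return typeOf(getPath(parsed, arg.slice(0, cut))) === arg.slice(cut + 1);
    }
    case "contains":
      return responseText.includes(arg);
    case "regex":
      return new RegExp(arg).test(responseText);
    case "no_apology": // assumed pattern; the real check may differ
      return !/\b(sorry|apolog(y|ies|i[sz]e))\b/i.test(responseText);
    default:
      throw new Error(`Unrecognised criterion: ${name}`);
  }
}

// One criterion per line; blank lines are ignored.
function scoreRubric(rubricText, responseText) {
  return rubricText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => ({ criterion: line.trim(), pass: checkCriterion(line, responseText) }));
}
```

Because every check is a pure function of the response text, the same rubric always gives the same verdict for the same output.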

How to use

Step-by-step

  1. Paste your SKILL.md definition into the editor. The schema it declares must be valid JSON Schema for strict-mode validation to pass.

  2. Paste sample input that matches your skill's input schema. Use a realistic example, not a minimal one.

  3. Enter your Anthropic API key. The key stays in browser memory only; it is neither persisted nor logged.

  4. Click Run. Watch the structured output, token cost (input and output token counts × current pricing), and end-to-end latency.

  5. Re-run several times. Variance in outputs is informative: high variance suggests the prompt is under-constrained.
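
One way to put a number on the run-to-run variance from the last step is an agreement rate: the share of runs that produced the most common output. The sketch below is an assumption about how you might measure it yourself, not something the tester reports. `canonicalize` and `agreementRate` are hypothetical names, and JSON outputs are compared after sorting object keys so that key order alone does not count as a difference.

```javascript
// Recursively sort object keys so equivalent JSON compares equal as text.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.keys(value).sort().map((key) => [key, canonicalize(value[key])])
    );
  }
  return value;
}

// Fraction of runs that match the most frequent output (1 = fully stable).
function agreementRate(outputs) {
  if (outputs.length === 0) throw new Error("need at least one output");
  const counts = new Map();
  for (const text of outputs) {
    let key;
    try {
      key = JSON.stringify(canonicalize(JSON.parse(text)));
    } catch {
      key = `invalid:${text}`; // non-JSON outputs are compared verbatim
    }
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Math.max(...counts.values()) / outputs.length;
}
```

A rate well below 1 across identical inputs points to the same diagnosis as the step above: the prompt leaves too much open.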

For agents

Use in an agent

Same math, same result shape as the UI above — as a static ES module. No HTTP request, no auth, no rate limit.

import { compute } from "https://aifinhub.io/engines/agent-skill-tester.js";

Contract: /contracts/agent-skill-tester.json

Questions people ask next

FAQ

What's a SKILL.md?

Anthropic's structured-skill spec: a markdown file with a name, description, input/output schema, and worked examples. Skills bundle a small repeatable agent capability (e.g., 'extract a 10-K risk factor') in a portable format. The tester loads your SKILL.md and runs it against sample input.

Why does the tester need my own API key?

Calls go to Anthropic's API directly from your browser. The tool never sees or proxies the key. This keeps cost on your account, makes rate limits predictable, and avoids the privacy issue of routing finance prompts through a third-party proxy. Your key stays in browser memory only — it's not persisted to localStorage.

What does the tester measure?

Three things: structured-output compliance (does the model return valid JSON matching the schema?), token cost (input and output token counts × current pricing), and end-to-end latency. Repeated runs show variance, which is useful for diagnosing flaky outputs.
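
The token-cost arithmetic can be sketched as follows. The `input_tokens` and `output_tokens` field names match the `usage` object in Anthropic's Messages API responses; the function name, the pricing field names, and the prices in the usage example are placeholders, not current rates. Input and output tokens are usually priced differently, so the sketch takes a separate rate for each.

```javascript
// Estimate the cost of one call from its token usage.
// pricing.inputPerMTok / pricing.outputPerMTok: USD per million tokens (hypothetical names).
function estimateCostUsd(usage, pricing) {
  const inputCost = (usage.input_tokens * pricing.inputPerMTok) / 1_000_000;
  const outputCost = (usage.output_tokens * pricing.outputPerMTok) / 1_000_000;
  return inputCost + outputCost;
}

// Usage with made-up prices of $3 and $15 per million tokens:
// estimateCostUsd({ input_tokens: 2000, output_tokens: 500 },
//                 { inputPerMTok: 3, outputPerMTok: 15 });
// 2000 × 3 / 1e6 + 500 × 15 / 1e6, roughly 0.0135 USD
```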

Why does my SKILL.md sometimes fail validation?

The tester enforces strict-mode JSON schema validation. Common failures: missing required fields, nested objects with no properties defined, enum values outside the declared list. The validation error shows the exact field path. Real-world agents are more forgiving but also more inconsistent — strict-mode catches issues that bite later.
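
The three common failures above can be illustrated with a deliberately small checker. This is a sketch covering only `required`, `enum`, and empty object schemas, with hypothetical function names and error wording. It is not the tester's validator and not a full JSON Schema implementation; for real use, an established validator such as Ajv is the safer choice.

```javascript
// Schema-level lint: flag object schemas that declare no properties.
function lintSchema(schema, path = "$") {
  const errors = [];
  if (schema.type === "object") {
    const props = schema.properties ?? {};
    if (Object.keys(props).length === 0) {
      errors.push(`${path}: object has no properties defined`);
    }
    for (const [key, sub] of Object.entries(props)) {
      errors.push(...lintSchema(sub, `${path}.${key}`));
    }
  }
  return errors;
}

// Value-level check: missing required fields and out-of-enum values.
// Each error starts with the field path, mirroring the behaviour described above.
function validate(value, schema, path = "$") {
  const errors = [];
  if (schema.enum && !schema.enum.includes(value)) {
    errors.push(`${path}: value not in enum`);
  }
  if (schema.type === "object") {
    if (value === null || typeof value !== "object" || Array.isArray(value)) {
      return [...errors, `${path}: expected object`];
    }
    for (const key of schema.required ?? []) {
      if (!(key in value)) errors.push(`${path}.${key}: missing required field`);
    }
    for (const [key, sub] of Object.entries(schema.properties ?? {})) {
      if (key in value) errors.push(...validate(value[key], sub, `${path}.${key}`));
    }
  }
  return errors;
}
```

Running the lint on the schema before any model call separates "the SKILL.md is malformed" from "the model's output is malformed", which are otherwise easy to confuse.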

Can I test multi-turn skills?

The current version is single-turn only: input → output. Multi-turn skill testing (where the model asks clarifying questions or takes multiple steps) is on the roadmap. For now, multi-turn skills must be tested programmatically, by looping against the API directly.
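
Such a loop might look like the sketch below. Everything in it is an assumption for illustration: `runMultiTurn` and its parameters are hypothetical, `send` stands in for whatever function calls the API with the running message list and returns the assistant's text, `answer` supplies a scripted reply to a clarifying question, and `isFinal` decides when the skill has produced its result.

```javascript
// Drive a conversation until the skill returns a final answer or turns run out.
async function runMultiTurn({ send, firstInput, answer, isFinal, maxTurns = 5 }) {
  const messages = [{ role: "user", content: firstInput }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const text = await send(messages);
    messages.push({ role: "assistant", content: text });
    if (isFinal(text)) return { final: text, messages };
    // Not final: treat the reply as a question and feed back a scripted answer.
    messages.push({ role: "user", content: await answer(text) });
  }
  return { final: null, messages }; // turn budget exhausted
}

// Example stop condition: the skill is done once it emits parseable JSON.
function isJson(text) {
  try { JSON.parse(text); return true; } catch { return false; }
}
```

Injecting `send` keeps the loop testable with a stub, and the `maxTurns` cap prevents a skill that never converges from running up cost.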

Planning estimates only — not financial, tax, or investment advice.