AI Costs

Three ways to run AI. Real costs from real benchmarks.

Claude Desktop (flat subscription), Claude Code (allowance-based), or the built-in API gateway (pay-per-token). Each option connects to the same 9 MCP servers and 254 GRC tools. The options differ mainly in cost model; the gateway also adds the AI Council, scheduled workflows, and the approval queue.

Choose your model

Subscription vs pay-per-token

Claude Desktop

Subscription

Cost

$20/mo Pro · $100/mo Max

MCP servers run as local stdio processes. Claude Desktop spawns them directly. You get flat-rate usage within Anthropic's fair-use limits.

Best for

Daily GRC queries, routine risk checks, quick lookups

Trade-off

Rate-limited during peak usage. No autonomous workflows, no council, no approval UI.

Claude Code

Subscription + Allowance

Cost

$20/mo Pro (also included with Max)

Claude Code reads .mcp.json from the project root and connects to all 9 MCP servers automatically. Usage counts against your monthly allowance.
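As an illustration of what a project-level .mcp.json might contain, the sketch below builds one stdio entry per server and serializes it. The server names, launcher command, and paths are hypothetical placeholders, not RiskReady's actual configuration; only the top-level `mcpServers` layout with `command` and `args` follows the format Claude Code reads.

```python
import json

# Hypothetical server names; the real project defines 9 servers.
SERVER_NAMES = ["risks", "controls", "incidents"]


def build_mcp_config(names):
    """Return a dict in the .mcp.json layout: one stdio entry per server."""
    return {
        "mcpServers": {
            name: {
                "command": "node",  # assumed launcher
                "args": [f"./mcp-servers/{name}/index.js"],  # assumed path
            }
            for name in names
        }
    }


config = build_mcp_config(SERVER_NAMES)
config_text = json.dumps(config, indent=2)
print(config_text)  # save this text as .mcp.json in the project root
```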

Best for

Development, exploration, cross-domain queries with all 9 servers

Trade-off

Same rate limits as Claude Desktop. Usage allowance may cap heavy sessions.

Built-in Gateway (API)

Pay-per-token

Cost

No subscription — pay only for what you use

The RiskReady gateway routes queries to domain MCP servers, runs the AI Council, executes scheduled workflows, and manages the approval queue — all via the Anthropic Messages API with your own API key.

Best for

Production use, autonomous workflows, council deliberations, approval-gated mutations

Trade-off

Requires ANTHROPIC_API_KEY. Cost scales with usage and model choice.

API pricing

Per-token rates (Anthropic Messages API)

Model	Input / 1M tokens	Output / 1M tokens	Use case
Claude Haiku 4.5	$0.80	$4.00	Best value
Claude Sonnet 4.6	$3.00	$15.00	Balanced
Claude Opus 4.6	$15.00	$75.00	Maximum quality
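To turn these rates into a per-request cost, multiply each token count by its rate and divide by one million. The sketch below does this with the rates listed above; the function and dictionary names are illustrative, and the token counts are the ones reported for the single-query benchmark on this page.

```python
# Rates in USD per 1M tokens, copied from the pricing list above.
RATES = {
    "Claude Haiku 4.5": (0.80, 4.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.6": (15.00, 75.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request, ignoring caching and other discounts."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Single-query benchmark: 3,999 input tokens and 948 output tokens on Haiku.
cost = request_cost("Claude Haiku 4.5", 3_999, 948)
print(f"${cost:.4f}")
```

This comes out to about $0.007, matching the single-query figure.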

Real benchmarks

Actual costs from a live GRC database.

Tested against ClearStream Payments: 15 risks, 40 controls, 30 scenarios, 8 KRIs, 12 active incidents. These are real token counts, not estimates.

Single query

$0.007

“Show me the top risks”

Input tokens: 3,999

Output tokens: 948

Tool calls: 1

Model: Haiku 4.5

6-agent council (Haiku)

$0.19

“Comprehensive security posture review”

Total tokens: 119,970

Tool calls: 32

Tools used by: 2 of 5

Duration: ~2 minutes

Quality: Medium (some generic analysis)

6-agent council (Opus)

$10.08

“Comprehensive security posture review”

Total tokens: ~478,000

Tool calls: ~102

Tools used by: 5 of 5

Duration: ~4 minutes

Quality: High (all data-backed with record IDs)
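One way to compare the two council runs is the blended cost per 1M tokens, meaning total cost divided by total tokens. The sketch below uses only the totals reported above; the page does not give an input/output split for council runs, so nothing finer-grained is derived here.

```python
def blended_rate(total_cost_usd, total_tokens):
    """Average USD per 1M tokens across input and output combined."""
    return total_cost_usd / total_tokens * 1_000_000


haiku_rate = blended_rate(0.19, 119_970)
opus_rate = blended_rate(10.08, 478_000)  # token total is approximate

cost_ratio = 10.08 / 0.19        # how much more the Opus run cost
token_ratio = 478_000 / 119_970  # how many more tokens it consumed

print(f"Haiku: ${haiku_rate:.2f} per 1M tokens")
print(f"Opus:  ${opus_rate:.2f} per 1M tokens")
print(f"Opus cost {cost_ratio:.0f}x more and used {token_ratio:.1f}x the tokens")
```

The Opus run used roughly four times the tokens at a much higher blended rate, which is why the two totals are so far apart.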

Monthly estimates

API costs by usage pattern

Profile	Usage	Haiku	Sonnet	Opus	Desktop
Light user	10 queries/day, no council	~$2/mo	~$8/mo	~$40/mo	$20/mo (Pro)
Active user	30 queries/day, 2 councils/week	~$8/mo	~$30/mo	~$150/mo	$20/mo (Pro)
Power user	50 queries/day, daily council, scheduled workflows	~$15/mo	~$60/mo	~$300/mo	$100/mo (Max)

Recommendation: Start with Claude Desktop Pro ($20/mo) for daily queries. Switch to API with Haiku for autonomous workflows and scheduled runs. Use Opus only for board-level reports and audit preparation.
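The Haiku column can be approximated from the benchmark figures. The sketch below assumes a 30-day month, 52/12 weeks per month, $0.007 per query, and $0.19 per council run, and it does not count scheduled workflows; these are assumptions for illustration, not the page's stated method.

```python
DAYS_PER_MONTH = 30        # assumption
WEEKS_PER_MONTH = 52 / 12  # assumption

HAIKU_QUERY = 0.007   # single-query benchmark cost in USD
HAIKU_COUNCIL = 0.19  # 6-agent council benchmark cost in USD


def monthly_cost(queries_per_day, councils_per_month, cost_per_query, cost_per_council):
    """Estimated monthly API spend in USD for one usage profile."""
    query_cost = queries_per_day * DAYS_PER_MONTH * cost_per_query
    council_cost = councils_per_month * cost_per_council
    return query_cost + council_cost


light = monthly_cost(10, 0, HAIKU_QUERY, HAIKU_COUNCIL)
active = monthly_cost(30, 2 * WEEKS_PER_MONTH, HAIKU_QUERY, HAIKU_COUNCIL)
power = monthly_cost(50, DAYS_PER_MONTH, HAIKU_QUERY, HAIKU_COUNCIL)

for name, value in [("Light", light), ("Active", active), ("Power", power)]:
    print(f"{name}: ${value:.2f}/mo")
```

Under these assumptions the Haiku estimates come out to roughly $2, $8, and $16 per month, close to the table's ~$2, ~$8, and ~$15. The page does not state how its figures were rounded or how scheduled workflows were counted.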

Built-in savings

How the gateway keeps costs down

Tool search (96% savings)

defer_loading lets Claude discover tools on demand instead of loading all 254 schemas — input drops from 228K to 8K tokens per request.
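The 96% figure follows from the two token counts quoted above. The sketch below reproduces it, prices both cases at the Haiku input rate, and shows a tool definition carrying the `defer_loading` flag. The tool name and schema are hypothetical, and the tool-search entry that must accompany deferred tools in a real request is not shown.

```python
FULL_SCHEMA_TOKENS = 228_000  # all 254 tool schemas loaded up front
DEFERRED_TOKENS = 8_000       # with tools discovered on demand
HAIKU_INPUT_RATE = 0.80       # USD per 1M input tokens

savings_pct = (FULL_SCHEMA_TOKENS - DEFERRED_TOKENS) / FULL_SCHEMA_TOKENS * 100
cost_before = FULL_SCHEMA_TOKENS * HAIKU_INPUT_RATE / 1_000_000
cost_after = DEFERRED_TOKENS * HAIKU_INPUT_RATE / 1_000_000


def defer(tool):
    """Return a copy of a tool definition marked for on-demand loading."""
    return {**tool, "defer_loading": True}


# Hypothetical tool definition, for illustration only.
example_tool = defer({
    "name": "list_top_risks",
    "description": "Return the highest-scoring risks.",
    "input_schema": {"type": "object", "properties": {}},
})

print(f"Input tokens saved: {savings_pct:.1f}%")
print(f"Haiku input cost per request: ${cost_before:.4f} -> ${cost_after:.4f}")
```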

Prompt caching (90% discount)

System prompts are cached for 5 minutes. Repeated queries within that window pay 90% less for the cached portion of the input.
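A rough sketch of how the discount affects input cost: cached tokens are billed at 10% of the base input rate and the rest at the full rate. The split of 3,000 cached tokens out of 3,999 is a hypothetical example, and the sketch ignores any extra charge for writing the cache on the first request.

```python
CACHE_READ_MULTIPLIER = 0.10  # cached input billed at 10% of the base rate


def input_cost(total_input_tokens, cached_tokens, rate_per_million):
    """Input cost in USD when part of the prompt is read from the cache."""
    uncached_tokens = total_input_tokens - cached_tokens
    billable = uncached_tokens + cached_tokens * CACHE_READ_MULTIPLIER
    return billable * rate_per_million / 1_000_000


HAIKU_INPUT_RATE = 0.80  # USD per 1M input tokens

cold = input_cost(3_999, 0, HAIKU_INPUT_RATE)      # nothing cached yet
warm = input_cost(3_999, 3_000, HAIKU_INPUT_RATE)  # hypothetical 3,000 cached

print(f"cold: ${cold:.6f}  warm: ${warm:.6f}")
```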

Council batching

Council members run in batches of 2 to stay within memory limits, with per-member token tracking.
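The batching pattern can be sketched with asyncio: members are grouped into batches of 2, each batch runs concurrently, and token usage is recorded per member. This is a minimal illustration, not the gateway's implementation; the member names are hypothetical and `run_member` is a stand-in that returns fixed token counts instead of calling a model.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class MemberResult:
    name: str
    input_tokens: int
    output_tokens: int


async def run_member(name):
    """Stand-in for a real model call; returns fixed token counts."""
    await asyncio.sleep(0)
    return MemberResult(name, input_tokens=1_000, output_tokens=200)


async def run_council(members, batch_size=2):
    """Run members batch by batch so at most batch_size run at once."""
    results = []
    for start in range(0, len(members), batch_size):
        batch = members[start:start + batch_size]
        results.extend(await asyncio.gather(*(run_member(m) for m in batch)))
    return results


MEMBERS = [f"member_{i}" for i in range(1, 7)]  # hypothetical names
results = asyncio.run(run_council(MEMBERS))

# Per-member token tracking, keyed by member name.
usage = {r.name: r.input_tokens + r.output_tokens for r in results}
print(usage)
```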

Model selection per task

Use Haiku for daily queries (about $0.007 each), Sonnet for analysis, and Opus only for board reports and audit prep (about $10 per council run).
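Per-task selection can be as simple as a lookup table. The task categories below are hypothetical labels chosen to mirror the guidance above, and falling back to the cheapest model for unknown tasks is an assumption rather than documented gateway behavior.

```python
# Hypothetical task categories mapped to the models named on this page.
MODEL_BY_TASK = {
    "daily_query": "Claude Haiku 4.5",
    "analysis": "Claude Sonnet 4.6",
    "board_report": "Claude Opus 4.6",
    "audit_prep": "Claude Opus 4.6",
}

DEFAULT_MODEL = "Claude Haiku 4.5"  # assumed fallback: cheapest model


def pick_model(task_type):
    """Return the model for a task type, defaulting to the cheapest one."""
    return MODEL_BY_TASK.get(task_type, DEFAULT_MODEL)


print(pick_model("analysis"))
print(pick_model("unknown_task"))
```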

Full benchmark data

Run your own benchmarks against your data.

The gateway logs per-member token usage for every council session. Deploy the demo, connect your API key, and see exactly what your workloads cost.

Run the Demo · Full Benchmark Report