AI Costs

Three ways to run AI. Real costs from real benchmarks.

Claude Desktop (flat subscription), Claude Code (allowance-based), or the built-in API gateway (pay-per-token). Each option connects to the same 9 MCP servers and 254 GRC tools. The cost model is the only difference.

Choose your model

Subscription vs pay-per-token

Claude Desktop

Subscription

Cost

$20/mo Pro · $100/mo Max

MCP servers run as local stdio processes. Claude Desktop spawns them directly. You get flat-rate usage within Anthropic's fair-use limits.

Best for

Daily GRC queries, routine risk checks, quick lookups

Trade-off

Rate-limited during peak usage. No autonomous workflows, no council, no approval UI.

Claude Code

Subscription + Allowance

Cost

$20/mo (Max plan included)

Claude Code reads .mcp.json from the project root and connects to all 9 MCP servers automatically. Usage counted against your monthly allowance.

Best for

Development, exploration, cross-domain queries with all 9 servers

Trade-off

Same rate limits as Claude Desktop. Usage allowance may cap heavy sessions.

Built-in Gateway (API)

Pay-per-token

Cost

No subscription — pay only for what you use

The RiskReady gateway routes queries to domain MCP servers, runs the AI Council, executes scheduled workflows, and manages the approval queue — all via the Anthropic Messages API with your own API key.

Best for

Production use, autonomous workflows, council deliberations, approval-gated mutations

Trade-off

Requires ANTHROPIC_API_KEY. Cost scales with usage and model choice.

API pricing

Per-token rates (Anthropic Messages API)

Model
Input / 1M
Output / 1M
Use case
Claude Haiku 4.5
$0.80
$4.00
Best value
Claude Sonnet 4.6
$3.00
$15.00
Balanced
Claude Opus 4.6
$15.00
$75.00
Maximum quality

Real benchmarks

Actual costs from a live GRC database.

Tested against ClearStream Payments: 15 risks, 40 controls, 30 scenarios, 8 KRIs, 12 active incidents. These are real token counts, not estimates.

Single query

$0.007

Show me the top risks

Input tokens3,999
Output tokens948
Tool calls1
ModelHaiku 4.5

6-agent council (Haiku)

$0.19

“Comprehensive security posture review”

Total tokens119,970
Tool calls32
Tools used by2 of 5
Duration~2 minutes
QualityMedium — some generic analysis

6-agent council (Opus)

$10.08

“Comprehensive security posture review”

Total tokens~478,000
Tool calls~102
Tools used by5 of 5
Duration~4 minutes
QualityHigh — all data-backed with record IDs

Monthly estimates

API costs by usage pattern

ProfileUsageHaikuSonnetOpusDesktop
Light user10 queries/day, no council~$2/mo~$8/mo~$40/mo$20/mo (Pro)
Active user30 queries/day, 2 councils/week~$8/mo~$30/mo~$150/mo$20/mo (Pro)
Power user50 queries/day, daily council, scheduled workflows~$15/mo~$60/mo~$300/mo$100/mo (Max)

Recommendation: Start with Claude Desktop Pro ($20/mo) for daily queries. Switch to API with Haiku for autonomous workflows and scheduled runs. Use Opus only for board-level reports and audit preparation.

Built-in savings

How the gateway keeps costs down

Tool search (96% savings)

defer_loading lets Claude discover tools on demand instead of loading all 254 schemas — input drops from 228K to 8K tokens per request.

Prompt caching (90% discount)

System prompts are cached for 5 minutes. Repeated queries within a session get 90% off the cached portion.

Council batching

Council members run in pairs of 2 to stay within memory limits, with per-member token tracking.

Model selection per task

Use Haiku for daily queries ($0.007), Sonnet for analysis, Opus only for board reports and audit prep ($10).

Full benchmark data

Run your own benchmarks against your data.

The gateway logs per-member token usage for every council session. Deploy the demo, connect your API key, and see exactly what your workloads cost.