Cost Model: Accessibility Skill

What it costs to run one accessibility audit + ACR through the chat skill, and how to keep the numbers current as models and prices shift.

There are two distinct cost buckets that hit different bills:

  1. Platform side (your Anthropic API bill): server-side LLM calls made by api.theaccessible.org.
  2. User side (the user's Claude bill): orchestration tokens consumed by Claude Desktop / Claude.ai / Claude Code while running the skill.

Different accounts pay for each bucket, so don't conflate them when modeling unit economics.


Per-page cost: platform side

The URL scan pipeline is almost entirely deterministic. Server-side LLM cost per page is just the AI pre-grader.

| Component | LLM? | Model | Notes |
|---|---|---|---|
| URL fetch executor (workers/batch/src/url-fetch-executor.ts) | no | n/a | Puppeteer + axe-core + cms-detector |
| Auto-fix loop (workers/api/src/services/axe-fixer.ts) | no | n/a | Deterministic regex-based fixes; despite the name, no LLM call |
| AI pre-grader (workers/api/src/services/verification-prograder.ts) | yes | Haiku 4.5 | One call per not-verified criterion |
| ACR composer (workers/api/src/services/acr-composer.ts) | no | n/a | HTML template + weasyprint sidecar |

So per scan, the only LLM bill is the pre-grader.

Pre-grader cost arithmetic

| Variable | Value | Source |
|---|---|---|
| Model | claude-haiku-4-5-20251001 | verification-prograder.ts:19 |
| Input rate | $1 / MTok | Anthropic published rate (Haiku 4.5 release pricing) |
| Output rate | $5 / MTok | Anthropic published rate |
| Calls per page | 11–18 (typ. 15) | One per not-verified criterion. Range observed in real scans. |
| Input tokens / call | ~1,500 | Artifact JSON (truncated to 8KB max, typical 1–3KB) + system prompt + criterion description |
| Output tokens / call | ~150 | Structured JSON verdict + brief reasoning |

Cost per call: 1,500 × $1/MTok + 150 × $5/MTok = $0.0015 + $0.00075 ≈ $0.0023

Cost per page (15 calls): ~$0.034
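The arithmetic above can be reproduced with a short sketch so the per-page figure is easy to re-derive when rates or token counts change. The function and constant names here are illustrative and do not come from the codebase; the rates and token counts are the estimates from the table.

```typescript
// Rates in USD per million tokens (MTok), taken from the pre-grader table.
interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Hypothetical constant name; the values are the published Haiku 4.5 rates cited above.
const HAIKU_4_5: Rates = { inputPerMTok: 1, outputPerMTok: 5 };

// Cost in USD of one LLM call.
function costPerCall(inputTokens: number, outputTokens: number, rates: Rates): number {
  return (inputTokens * rates.inputPerMTok + outputTokens * rates.outputPerMTok) / 1e6;
}

// Cost in USD of one scanned page: one pre-grader call per not-verified criterion.
function costPerPage(
  callsPerPage: number,
  inputTokens: number,
  outputTokens: number,
  rates: Rates,
): number {
  return callsPerPage * costPerCall(inputTokens, outputTokens, rates);
}

const perCall = costPerCall(1500, 150, HAIKU_4_5); // about 0.00225
const perPage = costPerPage(15, 1500, 150, HAIKU_4_5); // about 0.03375
console.log({ perCall, perPage });
```

Swapping in the observed range of 11 to 18 calls per page gives roughly $0.025 to $0.041 per page under the same token assumptions.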

At scale

| Pages / month | Platform LLM cost |
|---|---|
| 100 | ~$3.40 |
| 1,000 | ~$34 |
| 10,000 | ~$340 |
| 100,000 | ~$3,400 |

These are LLM costs only. They don't include Puppeteer compute, R2 storage, weasyprint sidecar, or other infra. Those are tracked separately (see aws-deploy-next-steps.md and the cost-tracking follow-up in accessibility-skill.md).


Per-walkthrough cost: user side

A full skill walkthrough (scan → verify ~15 items conversationally → propose fixes → generate ACR) consumes a substantial number of tokens in the user's Claude account, because tool result payloads (HTML, queue items, conformance reports) quickly inflate the input side.

Standard rates (current published list)

| Model | Input ($/MTok) | Output ($/MTok) |
|---|---|---|
| Haiku 4.5 | $1 | $5 |
| Sonnet 4.6 | $3 | $15 |
| Opus 4.7 | $15 | $75 |

Walkthrough scenarios

| Pattern | Input tokens | Output tokens | Sonnet 4.6 | Opus 4.7 |
|---|---|---|---|---|
| Single page, full walkthrough (verify all + ACR) | ~200K | ~50K | ~$1.35 | ~$6.75 |
| Scan + summary only (no verification) | ~30K | ~5K | ~$0.17 | ~$0.83 |
| Scan + auto-approve high-confidence + generate | ~80K | ~15K | ~$0.47 | ~$2.33 |

Most Claude Pro/Max users won't see this as a charge; it counts against their monthly cap instead. API users see it on their bill.
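The scenario figures follow directly from the rate table. A minimal sketch of that calculation is below; the `RATES` map and `walkthroughCost` helper are hypothetical names, and the token counts are the rough estimates from the scenarios table, not measurements.

```typescript
type Model = "Haiku 4.5" | "Sonnet 4.6" | "Opus 4.7";

// USD per million tokens, copied from the standard rates table.
const RATES: Record<Model, { input: number; output: number }> = {
  "Haiku 4.5": { input: 1, output: 5 },
  "Sonnet 4.6": { input: 3, output: 15 },
  "Opus 4.7": { input: 15, output: 75 },
};

// User-side cost in USD of one walkthrough on a given model.
function walkthroughCost(model: Model, inputTokens: number, outputTokens: number): number {
  const r = RATES[model];
  return (inputTokens * r.input + outputTokens * r.output) / 1e6;
}

// Token estimates from the walkthrough scenarios table.
const scenarios = [
  { name: "full walkthrough", input: 200000, output: 50000 },
  { name: "scan + summary only", input: 30000, output: 5000 },
  { name: "scan + auto-approve + generate", input: 80000, output: 15000 },
];

for (const s of scenarios) {
  const sonnet = walkthroughCost("Sonnet 4.6", s.input, s.output);
  const opus = walkthroughCost("Opus 4.7", s.input, s.output);
  console.log(`${s.name}: Sonnet $${sonnet.toFixed(2)}, Opus $${opus.toFixed(2)}`);
}
```

Output tokens cost five times as much as input tokens at every tier, so in the full walkthrough the ~50K output tokens cost slightly more than the ~200K input tokens.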


Phase 6 (screenshots): incremental cost

Phase 6 inlines per-element JPEG screenshots into acr.queue responses for visual verification.

  • Server side: zero new LLM calls (Puppeteer + R2 only).
  • User side: 15 items × ~10KB JPEG ≈ 150KB total per response, or ~6K extra input tokens. Adds:
    • ~$0.018 / walkthrough on Sonnet 4.6
    • ~$0.090 / walkthrough on Opus 4.7

Negligible compared to the visual-context value for criteria like 1.1.1 (alt text), 1.4.1 (color use), 1.4.3 (contrast).
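As a quick check on "negligible", the sketch below prices the extra input tokens and expresses them as a share of a full walkthrough. The ~6K token figure is the estimate from the text above, not a measured value, and the helper name is hypothetical.

```typescript
// Estimate from the text: ~6K extra input tokens per walkthrough for 15 inlined JPEGs.
const EXTRA_INPUT_TOKENS = 6000;

// Incremental cost in USD of extra input tokens at a given input rate (USD per MTok).
function incrementalInputCost(extraTokens: number, inputRatePerMTok: number): number {
  return (extraTokens * inputRatePerMTok) / 1e6;
}

const sonnetExtra = incrementalInputCost(EXTRA_INPUT_TOKENS, 3); // about 0.018
const opusExtra = incrementalInputCost(EXTRA_INPUT_TOKENS, 15); // about 0.09

// Share of the ~$1.35 full-walkthrough cost on Sonnet 4.6, in percent.
const sonnetSharePct = (sonnetExtra / 1.35) * 100;
console.log({ sonnetExtra, opusExtra, sonnetSharePct });
```

Under these assumptions the screenshots add a little over 1% to the cost of a full walkthrough.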


Pricing implications

If you're modeling a per-audit charge, the platform cost floor is ~$0.034 LLM + small infra ≈ a few cents. Reasonable per-audit pricing tiers:

| Tier | Per-audit price | LLM cost as % | Notes |
|---|---|---|---|
| Self-serve | $0.50–$2 | 2–7% | Dashboard or skill |
| Pro / agency | $5–$15 | <1% | Bulk + sign-off authority |
| Enterprise | $50–$200 | <0.1% | SLA + dedicated review |

The platform LLM cost is too small to gate aggressive pricing. Compute (Puppeteer rendering on .4 / EC2) and the cost of human review time on your side are the real cost drivers, not the AI bill.
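The "LLM cost as %" column can be recomputed whenever the per-page cost or the price points move. This is a small sketch with hypothetical names, using the ~$0.034 per-audit LLM cost derived earlier and the tier prices from the table.

```typescript
// Platform LLM cost per audit in USD (the ~$0.034 pre-grader figure).
const LLM_COST_PER_AUDIT = 0.034;

interface Tier {
  name: string;
  minPrice: number; // USD per audit
  maxPrice: number; // USD per audit
}

const TIERS: Tier[] = [
  { name: "Self-serve", minPrice: 0.5, maxPrice: 2 },
  { name: "Pro / agency", minPrice: 5, maxPrice: 15 },
  { name: "Enterprise", minPrice: 50, maxPrice: 200 },
];

// LLM cost as a percentage of the price charged for one audit.
function llmCostShare(price: number, llmCost: number = LLM_COST_PER_AUDIT): number {
  return (llmCost / price) * 100;
}

for (const t of TIERS) {
  // The share is highest at the lowest price in the tier.
  const high = llmCostShare(t.minPrice);
  const low = llmCostShare(t.maxPrice);
  console.log(`${t.name}: ${low.toFixed(2)}% to ${high.toFixed(2)}%`);
}
```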


Optimization opportunities

| Opportunity | Mechanism | Estimated savings |
|---|---|---|
| Prompt caching | Anthropic's prompt cache: system prompt + criterion descriptions are identical across all pre-grader calls | ~80% of input cost on the cached prefix → drops pre-grader to ~$0.016/page at best (if the entire input were cacheable) |
| Batch API | Anthropic Batch API at 50% discount for non-urgent pre-grading | Halves pre-grader cost; trades latency (up to 24h vs seconds) |
| Conditional pre-grading | Skip the pre-grader for criteria the user has decided in a prior scan of the same URL | Saves ~30–50% of pre-grader calls on repeat scans |
| Confidence threshold tuning | Items with confidence ≥0.85 && verdict !== 'partial' are currently auto-decided. Lowering the threshold to 0.80 would auto-decide more items without re-prompting | Marginal pre-grader cost change; bigger UX win |

None of these are wired up yet. Worth revisiting once you have ≥1,000 pages/month of real traffic to optimize against.
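To compare the caching and batch options before wiring anything up, a rough estimator like the one below may help. It is a simplified sketch: `cachedFraction` (the share of input tokens that sit in the cached prefix) and `cacheSavings` (the ~80% figure from the table) are assumptions to be replaced with measured values, cache-write surcharges are ignored, and all names are hypothetical.

```typescript
interface PreGraderProfile {
  callsPerPage: number;
  inputTokens: number; // per call
  outputTokens: number; // per call
  inputRatePerMTok: number; // USD
  outputRatePerMTok: number; // USD
}

interface Options {
  cachedFraction?: number; // 0..1, share of input tokens in the cached prefix (assumed)
  cacheSavings?: number; // 0..1, discount on cached input tokens (assumed ~0.8)
  batch?: boolean; // apply the 50% Batch API discount
}

// Estimated pre-grader cost per page in USD under the given options.
function estimatedPageCost(p: PreGraderProfile, opts: Options = {}): number {
  const cachedFraction = opts.cachedFraction ?? 0;
  const cacheSavings = opts.cacheSavings ?? 0.8;
  const inputCost =
    ((p.inputTokens * p.inputRatePerMTok) / 1e6) * (1 - cachedFraction * cacheSavings);
  const outputCost = (p.outputTokens * p.outputRatePerMTok) / 1e6;
  const batchFactor = opts.batch ? 0.5 : 1;
  return p.callsPerPage * (inputCost + outputCost) * batchFactor;
}

const profile: PreGraderProfile = {
  callsPerPage: 15,
  inputTokens: 1500,
  outputTokens: 150,
  inputRatePerMTok: 1,
  outputRatePerMTok: 5,
};

const baseline = estimatedPageCost(profile); // about 0.034
const fullyCached = estimatedPageCost(profile, { cachedFraction: 1 }); // about 0.016
const batched = estimatedPageCost(profile, { batch: true }); // about 0.017
console.log({ baseline, fullyCached, batched });
```

Because output tokens are never cached, caching alone cannot push the per-page cost below the output portion (about $0.011 per page at 15 calls).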


Keeping this document current

  • When the pre-grader model changes: update verification-prograder.ts:19 AND the rates in this doc's pre-grader table.
  • When Anthropic adjusts published rates: update both rate tables. Re-derive the per-page number.
  • When real traffic data is available: replace the typical 15-calls-per-page estimate with measured medians from human_verification row counts per job_id. SQL:
    -- Median number of AI-graded verification rows (pre-grader calls) per job
    SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY n) AS median_calls
    FROM (
      SELECT job_id, COUNT(*) AS n
      FROM human_verification
      WHERE source IN ('ai-auto', 'ai-suggested-human-confirmed', 'ai-suggested-human-overrode')
      GROUP BY job_id
    ) AS t;
  • When prompt caching is enabled: revisit "Optimization opportunities"; savings should land in the main per-page table.
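Once the median query returns real numbers, the per-page figure can be re-derived from it. The sketch below mirrors what `percentile_cont(0.5)` computes (the middle value, or the average of the two middle values for an even count) and feeds the result into the per-call cost. The sample counts are made up for illustration, and the helper names are hypothetical.

```typescript
// Median with linear interpolation between the two middle values,
// matching percentile_cont(0.5) for a list of per-job call counts.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of empty list");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Per-call cost in USD from the pre-grader arithmetic (about $0.00225).
const COST_PER_CALL = (1500 * 1 + 150 * 5) / 1e6;

// Illustrative per-job call counts; replace with the query's real output.
const callsPerJob = [11, 14, 15, 16, 18, 15];

const medianCalls = median(callsPerJob);
const measuredCostPerPage = medianCalls * COST_PER_CALL;
console.log({ medianCalls, measuredCostPerPage });
```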

  • docs/admin/accessibility-skill.md: full architecture and operations reference
  • apps/web/content/guides/claude-desktop-skill.md: end-user install + usage guide