
Anthropic Batch API Integration

Why Batch Mode Exists

Large PDF conversions (textbooks, reports, legal filings) can run from 50 to over 200 pages. Each page is sent to Claude as an individual vision request. At Claude Sonnet pricing (~$3/$15 per MTok in/out), a 100-page document costs roughly $15 in single-pass real-time mode.

The Anthropic Message Batches API processes requests at 50% of the standard token cost in exchange for asynchronous delivery: each batch is processed within a 24-hour window, and results typically arrive within 1 hour. Batch requests also run in a separate queue that does not count against normal Messages API rate limits.

For a 100-page document, batch mode reduces the conversion cost from ~$15 to ~$7.50.

How It Works

Conversion Flow

User uploads PDF (45 pages)
│
▼
POST /api/convert/:fileId { realTime: false }
│
├─ pageCount > 10 && !realTime ?
│   │
│   ▼ YES
│   Split PDF into 45 single-page PDFs
│   Build 45 batch requests (custom_id = "page-1" … "page-45")
│   Submit to anthropic.messages.batches.create()
│   Store KV record: batch:{batchId} → { fileId, userId, userEmail, … }
│   Set file status → batch_pending
│   Return { batchMode: true, estimatedReadyAt }
│
▼ NO (≤10 pages or realTime: true)
Normal real-time processConversion()
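
The routing check and request construction in the flow above can be sketched as follows. This is a minimal illustration, not the code in convert.ts or batch-vision-converter.ts: the names `shouldUseBatch` and `buildBatchRequests`, the `maxTokens` default, and the choice to send each page as a base64 PDF document block are assumptions. The `BatchRequest` interface is a local, simplified copy of the request shape rather than the SDK's own type.

```typescript
// Threshold from the flow above: documents over 10 pages default to batch mode.
const BATCH_PAGE_THRESHOLD = 10;

// Simplified local shape of one entry in a Message Batches submission.
interface BatchRequest {
  custom_id: string;
  params: {
    model: string;
    max_tokens: number;
    messages: Array<{
      role: "user";
      content: Array<
        | { type: "document"; source: { type: "base64"; media_type: "application/pdf"; data: string } }
        | { type: "text"; text: string }
      >;
    }>;
  };
}

// Batch mode applies only when the document is large and the caller
// did not request real-time processing.
function shouldUseBatch(pageCount: number, realTime: boolean): boolean {
  return pageCount > BATCH_PAGE_THRESHOLD && !realTime;
}

// Build one request per single-page PDF. custom_id is 1-based ("page-1", "page-2", ...)
// so results can be put back in page order later.
function buildBatchRequests(
  pagesBase64: string[],
  prompt: string,
  model: string, // hypothetical parameter: whichever model ID the worker is configured with
  maxTokens: number = 8192, // assumed per-page output budget
): BatchRequest[] {
  return pagesBase64.map((data, index): BatchRequest => ({
    custom_id: `page-${index + 1}`,
    params: {
      model,
      max_tokens: maxTokens,
      messages: [
        {
          role: "user",
          content: [
            { type: "document", source: { type: "base64", media_type: "application/pdf", data } },
            { type: "text", text: prompt },
          ],
        },
      ],
    },
  }));
}
```

The resulting array would be passed as the `requests` field of `anthropic.messages.batches.create()`. If submission throws, the caller can fall back to the real-time path.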

Cron Poller (every 5 minutes)

Cloudflare Cron Trigger ── */5 * * * *
│
▼
List all KV keys with prefix "batch:"
│
For each batch record:
│
▼
Retrieve batch status via anthropic.messages.batches.retrieve(batchId)
│
├─ processing_status !== 'ended' → skip, check again in 5 min
│
▼ 'ended'
Stream JSONL results via anthropic.messages.batches.results(batchId)
Sort by custom_id (page-1, page-2, …)
Assemble per-page HTML into full document
│
▼
Run post-processing pipeline:
  1. structurePages(): headers, footers, page sections
  2. optimizeDeterministic(): table headers, br→semantic, SVG aria
  3. enhanceAccessibility(): title, lang, ARIA
  4. validateAndFix(): WCAG AA auto-fix
│
▼
Save HTML to R2
Update file metadata → status: completed
Send email notification (Resend)
Send Web Push notification (if subscribed)
Delete batch KV record
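
The sort-and-assemble step deserves care: a plain string sort would place "page-10" before "page-2". The sketch below sorts on the numeric suffix and substitutes a placeholder for any page whose request did not succeed. It is an illustration only, not the contents of batch-result-assembler.ts; `PageResult` is a simplified local type for one JSONL result line, and `pageNumber` and `assemblePages` are hypothetical names.

```typescript
// Simplified local shape of one line from the batch results stream.
interface PageResult {
  custom_id: string; // "page-1", "page-2", ...
  result:
    | { type: "succeeded"; message: { content: Array<{ type: string; text?: string }> } }
    | { type: "errored" | "canceled" | "expired" };
}

// Extract the numeric page index so that "page-10" sorts after "page-2".
// Unrecognized IDs sort last.
function pageNumber(customId: string): number {
  const match = /^page-(\d+)$/.exec(customId);
  return match ? Number(match[1]) : Number.MAX_SAFE_INTEGER;
}

// Sort results into page order and join the per-page HTML.
// Pages that did not succeed become a visible placeholder paragraph.
function assemblePages(results: PageResult[]): string {
  const sorted = [...results].sort(
    (a, b) => pageNumber(a.custom_id) - pageNumber(b.custom_id),
  );
  return sorted
    .map((entry) => {
      if (entry.result.type !== "succeeded") {
        return `<p>Page ${pageNumber(entry.custom_id)} could not be converted</p>`;
      }
      // Concatenate only the text blocks of the model response.
      return entry.result.message.content
        .filter((block) => block.type === "text")
        .map((block) => block.text ?? "")
        .join("");
    })
    .join("\n");
}
```

The assembled string would then be handed to the post-processing pipeline listed above.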

Single-Pass Constraint

Batch mode is single-pass only. The normal agentic-vision pipeline uses an iterative screenshot-comparison loop: render the HTML, screenshot it, compare to the original PDF, and refine. This loop requires browser rendering between Claude calls, which is impossible in batch mode since all requests run independently.

Batch requests are submitted with maxIterations = 0 (no refinement). The initial conversion prompt is the same high-quality prompt used in the iterative pipeline, so single-pass quality is still good for most documents. Users who need maximum fidelity should use real-time mode.

Key Files

| File | Purpose |
| --- | --- |
| workers/api/src/services/batch-vision-converter.ts | Splits PDF, builds batch requests, submits to Anthropic |
| workers/api/src/services/batch-result-assembler.ts | Streams JSONL results, sorts by page, assembles HTML |
| workers/api/src/routes/batch-cron.ts | Cron handler: polls batches, runs post-processing, notifies user |
| workers/api/src/routes/convert.ts | Routing logic: batch vs real-time based on page count and realTime flag |
| workers/api/src/services/email.ts | sendBatchCompletionEmail(): notifies user when batch finishes |
| workers/api/src/services/web-push.ts | VAPID-based push notifications for instant browser alerts |
| workers/api/src/routes/push.ts | Push subscription endpoints |
| apps/web/src/app/dashboard/page.tsx | Batch toggle UI, batch_pending status badge, adaptive polling |

KV Data Schema

Batch Job Record

Key: batch:{anthropicBatchId}

{
"fileId": "abc123",
"userId": "user456",
"userEmail": "user@example.com",
"filename": "annual-report.pdf",
"batchId": "msgbatch_01JxYz...",
"pageCount": 45,
"options": { "highFidelity": false },
"createdAt": "2026-02-27T12:00:00Z"
}

This record is created when the batch is submitted and deleted after the cron successfully processes the results.
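
For reference, the record above maps to a TypeScript type along these lines. This is a sketch derived from the JSON example only: the interface name, the `batchKey` helper, and the `parseBatchRecord` validator are hypothetical, and the real `options` object may carry more fields than `highFidelity`.

```typescript
// Shape of the KV value stored under "batch:{anthropicBatchId}".
interface BatchJobRecord {
  fileId: string;
  userId: string;
  userEmail: string;
  filename: string;
  batchId: string;
  pageCount: number;
  options: { highFidelity: boolean };
  createdAt: string; // ISO 8601 timestamp
}

const BATCH_KEY_PREFIX = "batch:";

// Build the KV key for a given Anthropic batch ID.
function batchKey(batchId: string): string {
  return `${BATCH_KEY_PREFIX}${batchId}`;
}

// Parse a stored value defensively: return null instead of throwing
// so one malformed record cannot abort a whole cron run.
function parseBatchRecord(raw: string): BatchJobRecord | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const r = value as Record<string, unknown>;
  const stringFields = ["fileId", "userId", "userEmail", "filename", "batchId", "createdAt"];
  const valid =
    stringFields.every((field) => typeof r[field] === "string") &&
    typeof r["pageCount"] === "number" &&
    typeof r["options"] === "object" &&
    r["options"] !== null;
  return valid ? (value as BatchJobRecord) : null;
}
```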

User Experience

Dashboard Behavior

| Condition | What the User Sees |
| --- | --- |
| File ≤ 10 pages | Normal real-time conversion (no batch option applies) |
| File > 10 pages, batch mode on (default) | Convert → “Batch Queued” badge with clock icon, “we’ll email you when it’s ready” |
| File > 10 pages, batch mode off | Convert → normal real-time processing with progress bar |
| Batch completes | Status changes to “Completed” on next poll (30s interval), email arrives, push notification if subscribed |

Polling Intervals

  • Real-time processing files: poll every 3 seconds (fast feedback)
  • Batch-pending files only: poll every 30 seconds (no point polling faster, since the server checks every 5 min)
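
A minimal sketch of the adaptive interval selection described above. Only `batch_pending` and `completed` are status names taken from this document; `processing` and `failed`, and the function name `nextPollIntervalMs`, are assumptions made for illustration.

```typescript
// "processing" and "failed" are assumed status names; "batch_pending" and
// "completed" appear elsewhere in this document.
type FileStatus = "processing" | "batch_pending" | "completed" | "failed";

const REALTIME_POLL_MS = 3_000; // fast feedback while a real-time conversion runs
const BATCH_POLL_MS = 30_000; // the server-side cron only checks every 5 minutes

// Pick the polling interval for the dashboard, or null when nothing is in flight.
function nextPollIntervalMs(statuses: FileStatus[]): number | null {
  if (statuses.some((status) => status === "processing")) return REALTIME_POLL_MS;
  if (statuses.some((status) => status === "batch_pending")) return BATCH_POLL_MS;
  return null;
}
```

Any real-time file takes priority, so the slower 30-second interval applies only when every in-flight file is batch-pending.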

Cost Comparison

| Mode | Per-Page Cost (Sonnet) | 100-Page Document | Delivery |
| --- | --- | --- | --- |
| Real-time (iterative, 4 passes) | ~$0.60 | ~$60 | Minutes |
| Real-time (single-pass) | ~$0.15 | ~$15 | Minutes |
| Batch (single-pass, 50% off) | ~$0.075 | ~$7.50 | ~1 hour |

Batch mode is the default for documents over 10 pages because the cost savings are significant and most users do not need instant results for large documents.
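
The figures in the table follow from one per-page estimate. The sketch below reproduces that arithmetic. The ~$0.15 single-pass figure, the 4-pass multiplier, and the 50% discount come from the table; the function and mode names are hypothetical. Amounts are held in thousandths of a dollar so the arithmetic stays exact.

```typescript
type ConversionMode = "realtime-iterative" | "realtime-single" | "batch";

const SINGLE_PASS_MILLIDOLLARS_PER_PAGE = 150; // ~$0.15 per page
const ITERATIVE_PASSES = 4; // passes assumed in the table's iterative row
const BATCH_DISCOUNT_PERCENT = 50; // Message Batches API discount

// Rough cost estimate in US dollars for converting a document.
function estimateCostUsd(pageCount: number, mode: ConversionMode): number {
  let perPage = SINGLE_PASS_MILLIDOLLARS_PER_PAGE;
  if (mode === "realtime-iterative") {
    perPage *= ITERATIVE_PASSES;
  } else if (mode === "batch") {
    perPage = (perPage * (100 - BATCH_DISCOUNT_PERCENT)) / 100;
  }
  return (perPage * pageCount) / 1000;
}
```

For example, `estimateCostUsd(100, "batch")` gives 7.5 and `estimateCostUsd(100, "realtime-iterative")` gives 60, matching the table rows.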

Deployment

VAPID Keys (for Web Push)

Generate a key pair:

npx web-push generate-vapid-keys

Set as Cloudflare Worker secrets:

echo "YOUR_PUBLIC_KEY" | npx wrangler secret put VAPID_PUBLIC_KEY --env production
echo "YOUR_PRIVATE_KEY" | npx wrangler secret put VAPID_PRIVATE_KEY --env production

Cron Trigger

The cron trigger is defined in wrangler.toml and auto-registers on deploy:

[triggers]
crons = ["*/5 * * * *"]

To test locally:

# Start the dev server, then request /__scheduled on port 8787 to run the cron handler
npx wrangler dev --test-scheduled

Failure Handling

  • Batch submission fails: Falls through to normal real-time processing (graceful degradation).
  • Individual page fails in batch: A placeholder is inserted (“Page N could not be converted”) and the rest of the document is assembled normally.
  • Cron processing fails: The batch KV record is not deleted, so it will be retried on the next cron run (5 minutes later).
  • Batch never completes: Anthropic bounds batch processing at 24 hours, after which unfinished requests expire. If a batch record persists beyond 24 hours, it should be investigated manually.
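
The retry and staleness rules above reduce to a small per-record decision on each cron run. The sketch below is one way to express it, not the logic in batch-cron.ts: `decideCronAction` and the `CronAction` values are hypothetical names, and "investigate" only flags a record for the manual follow-up described above. The `ProcessingStatus` values mirror the batch `processing_status` field.

```typescript
type ProcessingStatus = "in_progress" | "canceling" | "ended";
type CronAction = "skip" | "process" | "investigate";

const STALE_AFTER_MS = 24 * 60 * 60 * 1000; // 24-hour processing window

// Decide what the cron should do with one batch record.
// - "process": the batch has ended, so fetch results and assemble the document.
// - "skip": still running, so leave the KV record and check again next run.
// - "investigate": still not ended after 24 hours, so flag it for manual review.
function decideCronAction(
  status: ProcessingStatus,
  createdAt: string, // ISO 8601 timestamp from the KV record
  nowMs: number,
): CronAction {
  if (status === "ended") return "process";
  const ageMs = nowMs - Date.parse(createdAt);
  return ageMs > STALE_AFTER_MS ? "investigate" : "skip";
}
```

Because the KV record is deleted only after a "process" run succeeds, a failure during processing leaves the record in place and the same decision is made again five minutes later.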

Limitations

  1. No iterative refinement: batch mode is single-pass. Complex layouts (multi-column, heavy math) may have lower fidelity than iterative real-time mode.
  2. No real-time progress: the user sees “Batch Queued” with no percentage updates until the batch completes.
  3. Minimum latency ~5 minutes: even if Anthropic returns results instantly, the cron only runs every 5 minutes.
  4. No webhook: the Anthropic Batch API does not support webhooks, so polling is the only option.