Node.js Docker Server for Heavy Processing

The Node.js server offloads CPU/memory-intensive PDF operations (conversion, export, remediation) from the Cloudflare Worker to a Docker container running native Puppeteer. This eliminates CF Browser Rendering rate limits and provides more headroom for long-running jobs.

Architecture

```
Browser → CF Worker → nodeProxyMiddleware
                          ↓ (heavy POST + healthy)
                      CF Tunnel → 10.1.1.3:8790
                          ↓
                      Node server (same Hono routes)
                        • storage.objects → R2 via S3 API
                        • storage.kv      → CF KV via REST API
                        • storage.browser → Native Puppeteer
                          ↓
                      Response streams back through CF Worker
```

Fallback: If the Node server is unhealthy or unreachable, the CF Worker runs the route itself using CF Browser Rendering. The circuit breaker in nodeProxyMiddleware opens after 2 consecutive failures, cooling down for 30 seconds before retrying.
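The breaker behavior described above (open after 2 consecutive failures, 30-second cooldown) can be sketched as a small state machine. This is a minimal illustration, not the actual `nodeProxyMiddleware` implementation; only the threshold and cooldown values come from this document:

```typescript
// Minimal circuit-breaker sketch: opens after `threshold` consecutive
// failures, then refuses proxying until `cooldownMs` has elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private threshold = 2,       // consecutive failures before opening
    private cooldownMs = 30_000, // cooldown before retrying
  ) {}

  /** True when requests may be proxied to the Node server. */
  allowRequest(now = Date.now()): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: close the breaker and allow a retry.
      this.openedAt = null;
      this.failures = 0;
      return true;
    }
    return false;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(now = Date.now()): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }
}
```

When `allowRequest` returns false, the Worker skips the proxy and runs the route itself via CF Browser Rendering.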

Proxied Routes

Only heavy POST requests are forwarded. Everything else stays on the CF Worker.

| Pattern | Description |
| --- | --- |
| `POST /api/convert/:fileId` | PDF-to-HTML conversion |
| `POST /api/export/:fileId/pdf` | HTML-to-PDF export |
| `POST /api/remediate/html` | Single HTML remediation |
| `POST /api/remediate/batch` | Batch remediation |
| `POST /api/remediate/url` | URL scan + remediation |
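A predicate deciding whether a request is forwarded might look like the sketch below. The patterns mirror the table above; the actual matching logic in `node-proxy.ts` may use Hono's router instead:

```typescript
// Hypothetical predicate: only the heavy POST routes go to the Node server.
const HEAVY_PATTERNS: RegExp[] = [
  /^\/api\/convert\/[^/]+$/,
  /^\/api\/export\/[^/]+\/pdf$/,
  /^\/api\/remediate\/(html|batch|url)$/,
];

function shouldProxyToNode(method: string, path: string): boolean {
  if (method !== "POST") return false;
  return HEAVY_PATTERNS.some((re) => re.test(path));
}
```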

Storage Strategy

The Node server accesses the same data as the CF Worker — same R2 bucket, same KV namespace — through different APIs.

| Need | CF Worker | Node Server |
| --- | --- | --- |
| Object storage (files, ZIPs) | R2 binding | R2 via S3-compatible API |
| Key-value (metadata, status) | KV binding | CF KV via REST API |
| Browser (PDF gen, screenshots) | CF Browser Rendering | Native Puppeteer |

Why this works

  • R2 natively supports the S3 protocol. The R2ObjectStorage class uses @aws-sdk/client-s3 pointed at https://{accountId}.r2.cloudflarestorage.com.
  • KV is accessible via Cloudflare’s REST API. The CloudflareKvRestStorage class calls https://api.cloudflare.com/client/v4/accounts/{id}/storage/kv/namespaces/{ns}/values/{key}.
  • Puppeteer runs natively in the Docker container (the base image ships Chromium). Reuses the existing AwsBrowserProvider from providers/aws.ts.
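The two endpoints named above can be expressed as small URL builders. The helper names here are illustrative, not the project's actual API; the URL shapes come from the bullets above:

```typescript
// S3-compatible endpoint used by R2ObjectStorage via @aws-sdk/client-s3.
function r2S3Endpoint(accountId: string): string {
  return `https://${accountId}.r2.cloudflarestorage.com`;
}

// KV value endpoint used by CloudflareKvRestStorage (GET to read,
// PUT to write, DELETE to remove a key).
function kvValueUrl(accountId: string, namespaceId: string, key: string): string {
  const base = "https://api.cloudflare.com/client/v4";
  return `${base}/accounts/${accountId}/storage/kv/namespaces/${namespaceId}/values/${encodeURIComponent(key)}`;
}
```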

Key Files

| File | Purpose |
| --- | --- |
| `workers/api/src/server.ts` | Node.js HTTP entry point (Hono + `@hono/node-server`) |
| `workers/api/src/providers/node-server.ts` | `R2ObjectStorage`, `CloudflareKvRestStorage`, KV namespace shim, factory |
| `workers/api/src/middleware/node-proxy.ts` | CF Worker middleware that forwards requests |
| `workers/api/src/services/node-proxy.ts` | Proxy logic, health checker, circuit breaker |
| `workers/api/Dockerfile` | Docker image based on `ghcr.io/puppeteer/puppeteer:latest` |
| `docker-compose.yml` | Single `api-node` service on port 8790 |
| `.env.node-server.example` | Template for all required environment variables |
| `workers/api/tsconfig.node.json` | Node-specific TypeScript config (IDE only; tsx ignores it) |

How Auth Works

The proxy forwards the Authorization header as-is. The Node server’s requireAuth middleware validates the JWT using SUPABASE_JWT_SECRET. Both CF and Node independently verify the token; double validation is safe.
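The core of that validation can be sketched with Node's built-in crypto. This is a minimal HS256 signature check only, assuming Supabase signs with HS256; the real `requireAuth` middleware presumably uses a JWT library and also checks expiry and claims:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch: verify an HS256 JWT signature against a shared secret.
// A production check would also validate `exp`, `aud`, etc.
function verifyHs256Signature(token: string, secret: string): boolean {
  const parts = token.split(".");
  if (parts.length !== 3) return false;
  const [header, payload, signature] = parts;
  const expected = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest("base64url");
  const a = Buffer.from(signature);
  const b = Buffer.from(expected);
  // Constant-time comparison to avoid timing leaks.
  return a.length === b.length && timingSafeEqual(a, b);
}
```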

KV Namespace Shim

Some routes pass c.env directly and access env.KV_SESSIONS (e.g., extractTablesWithVision in convert.ts). The createKvNamespaceShim() function returns an object matching the CF KVNamespace get/put/delete shape, backed by the REST API. The server’s storage middleware injects this shim into c.env.KV_SESSIONS.
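The shim's shape can be sketched as below, with the REST transport made injectable for clarity. Names other than the get/put/delete surface are illustrative; the real `createKvNamespaceShim()` backs these calls with the Cloudflare KV REST API:

```typescript
// Transport abstraction: in production this would issue REST calls to
// the Cloudflare KV API; in tests it can be an in-memory map.
type KvTransport = {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
};

// Returns an object matching the CF KVNamespace get/put/delete shape,
// so code written against c.env.KV_SESSIONS works unchanged.
function createKvShim(transport: KvTransport) {
  return {
    get: (key: string) => transport.get(key),
    put: (key: string, value: string) => transport.put(key, value),
    delete: (key: string) => transport.delete(key),
  };
}
```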

Local Development

Run the Node server locally (requires env vars set):

```sh
cd workers/api
PORT=8790 tsx src/server.ts
```

Or with file watching:

```sh
cd workers/api
PORT=8790 npm run server:dev
```

To test the full proxy flow locally, set NODE_API_URL=http://localhost:8790 in wrangler.toml [vars] and run wrangler dev alongside the Node server.


Deployment

Prerequisites

  • Docker and Docker Compose on 10.1.1.3
  • SSH access: ssh -i ~/.ssh/nightly-audit larry@10.1.1.3
  • Cloudflare Tunnel configured on the server

Step 1: Create R2 S3 API Token

  1. Go to Cloudflare Dashboard > R2 > Manage R2 API Tokens
  2. Click Create API token
  3. Permissions: Object Read & Write
  4. Specify bucket: accessible-pdf-files
  5. Save the Access Key ID and Secret Access Key

Step 2: Create CF API Token for KV

  1. Go to Cloudflare Dashboard > My Profile > API Tokens
  2. Click Create Token
  3. Use Custom token template
  4. Permissions: Account > Workers KV Storage > Edit
  5. Save the token

Step 3: Configure Environment

On 10.1.1.3, in the project directory:

```sh
cp .env.node-server.example .env.node-server
```

Fill in all values:

| Variable | Source |
| --- | --- |
| `R2_ACCOUNT_ID` | Cloudflare dashboard > Account ID (sidebar) |
| `R2_ACCESS_KEY_ID` | From Step 1 |
| `R2_SECRET_ACCESS_KEY` | From Step 1 |
| `R2_BUCKET_NAME` | `accessible-pdf-files` |
| `CF_ACCOUNT_ID` | Same as `R2_ACCOUNT_ID` |
| `CF_API_TOKEN` | From Step 2 |
| `KV_SESSIONS_NAMESPACE_ID` | `9d39d6e609b945848f5082cea23306b0` (from `wrangler.toml`) |
| `SUPABASE_URL` | Supabase project settings |
| `SUPABASE_SERVICE_ROLE_KEY` | Supabase project settings > API |
| `SUPABASE_JWT_SECRET` | Supabase project settings > API > JWT Secret |
| `ANTHROPIC_API_KEY` | Anthropic console |
| Other API keys | As needed for enabled parsers |

Step 4: Build and Start

```sh
docker compose up -d --build
```

Verify it’s running:

```sh
docker compose ps
curl http://localhost:8790/health
```

Expected response:

```json
{
  "success": true,
  "data": { "status": "healthy", "platform": "node-server", "uptime": 5.123 }
}
```

Step 5: Configure Cloudflare Tunnel

Add a public hostname rule in the Cloudflare Zero Trust dashboard:

| Field | Value |
| --- | --- |
| Subdomain | `node-pdf` (or similar) |
| Domain | `anglin.com` |
| Service | `http://localhost:8790` |

This gives the Node server a public URL like https://node-pdf.anglin.com.

Step 6: Set NODE_API_URL on CF Worker

```sh
wrangler secret put NODE_API_URL --env production
# Enter: https://node-pdf.anglin.com
```

The CF Worker will now proxy heavy requests to the Node server.

Step 7: Verify End-to-End

  1. Proxy active: Trigger a conversion on pdf.anglin.com. Check CF Worker logs for [node-proxy] Proxying POST /api/convert/... and Node server logs for the incoming request.
  2. Fallback works: docker compose stop, trigger another conversion. CF Worker logs should show [node-proxy] Node server unhealthy, falling back to CF.
  3. Recovery: docker compose start, wait ~30s for circuit breaker cooldown, trigger again. Should route through Node server.

Updating

To deploy a new version after code changes:

```sh
ssh -i ~/.ssh/nightly-audit larry@10.1.1.3
cd /path/to/accessible-pdf-converter
git pull
docker compose up -d --build
```

Monitoring

  • Health check: The Dockerfile includes a HEALTHCHECK that hits /health every 30s. Docker will mark the container as unhealthy if it fails 3 times.
  • Logs: docker compose logs -f api-node
  • Uptime Kuma: Add https://node-pdf.anglin.com/health as a monitored endpoint.

Troubleshooting

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| Container starts then exits | Missing required env var | Check `docker compose logs api-node` for `Missing required env var: ...` |
| Health check passes but proxy fails | CF Tunnel not routing | Verify tunnel config in Zero Trust dashboard |
| KV operations fail with 403 | API token lacks permissions | Recreate token with Workers KV Storage: Edit |
| R2 operations fail with 403 | S3 token wrong bucket scope | Recreate R2 API token scoped to `accessible-pdf-files` |
| Puppeteer crashes | Out of memory | Increase container memory limit in `docker-compose.yml` |
| Circuit breaker stuck open | Node server was down > 30s | It auto-recovers after the 30s cooldown. Force reset by redeploying the CF Worker. |