AWS Setup Guide

This guide covers deploying the Accessible PDF Converter on AWS with separate staging and production environments.

Architecture

Frontend (Next.js on CF Pages)
    |
    v
Hono API (Lambda + API Gateway HTTP API)
    |
    +--> DynamoDB (sessions, progress, results)
    +--> S3 (PDF files, converted HTML)
    +--> SQS (pipeline queue)
              |
              v
         EC2 Spot Fleet (ASG, 0-4 instances)
         [Docker: Node.js + Puppeteer + Chrome]
              |            |            |
         S3 (output)    DynamoDB    SES (notifications)

SES (email intake) --> Lambda --> SQS --> EC2 workers

Multi-Environment Design

Both staging and production deploy to the same AWS account, isolated by resource name prefixes:

Resource | Staging | Production
Stack prefix | AccessiblePdfStaging-* | AccessiblePdfProd-*
S3 bucket | accessible-pdf-staging-files-{acct} | accessible-pdf-prod-files-{acct}
DynamoDB | accessible-pdf-staging-data | accessible-pdf-prod-data
SQS queue | accessible-pdf-staging-pipeline | accessible-pdf-prod-pipeline
ECR repo | accessible-pdf-staging-worker | accessible-pdf-prod-worker
Lambda | accessible-pdf-staging-api | accessible-pdf-prod-api
Dashboard | accessible-pdf-staging-dashboard | accessible-pdf-prod-dashboard

Environment-specific settings are in infra/cdk/lib/env-config.ts.
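
The naming scheme in the table can be sketched as a small shell helper. This is only an illustration: the function name resource_name and the mapping from environment name to short prefix are assumptions read off the table, not code taken from env-config.ts.

```shell
# Hypothetical helper: derive a resource name from the environment and a suffix.
# Assumes the short prefixes shown in the table ("staging" and "prod").
resource_name() {
  env_name="$1"   # "staging" or "production" (the DEPLOY_ENV values used in this guide)
  suffix="$2"     # e.g. "pipeline", "data", "worker"
  case "$env_name" in
    staging)    short="staging" ;;
    production) short="prod" ;;
    *) echo "unknown environment: $env_name" >&2; return 1 ;;
  esac
  echo "accessible-pdf-${short}-${suffix}"
}

resource_name staging pipeline    # prints accessible-pdf-staging-pipeline
resource_name production data     # prints accessible-pdf-prod-data
```

Note that the production resources use the short form "prod" while DEPLOY_ENV is set to "production", so scripts that build names by hand need the same mapping.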

Prerequisites

  • AWS CLI configured with credentials (aws configure)
  • Node.js 20+
  • Docker installed
  • CDK CLI (npm install -g aws-cdk)

Step 1: AWS Account Setup

Create Sub-Account (from root account)

  1. Go to AWS Organizations > Create organization (if not already enabled)
  2. Add account > Create an AWS account:
    • Account name: AnglinAI-Accessible
    • Email: larry+aws-accessible@anglin.com
    • IAM role: OrganizationAccountAccessRole (default)
  3. Consolidated billing is automatic; costs appear on the root account bill

Create IAM User

  1. Sign into the sub-account via "Switch Role" from root
  2. Create IAM user cdk-deployer with AdministratorAccess
  3. Create access key for CLI use

Configure AWS CLI

aws configure --profile accessible
# When prompted, enter the access key and secret key, then region us-east-1 and output format json

Set Up Cost Tracking

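  1. Create a billing alarm at $100/month in CloudWatch (billing metrics are published only in us-east-1, and billing alerts must first be enabled in the Billing preferences)
  2. CDK automatically applies the tag project:accessible-pdf to all resources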
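
The billing alarm can also be created from the CLI. The following is a minimal sketch, assuming an SNS topic for notifications already exists; the alarm name and the topic ARN are placeholders, not values defined elsewhere in this guide.

```shell
# Sketch: create a $100 estimated-charges alarm.
# The alarm name is a placeholder; pass the ARN of an existing SNS topic.
create_billing_alarm() {
  sns_topic_arn="$1"
  aws cloudwatch put-metric-alarm \
    --profile accessible \
    --region us-east-1 \
    --alarm-name "accessible-pdf-billing-100" \
    --namespace "AWS/Billing" \
    --metric-name "EstimatedCharges" \
    --dimensions Name=Currency,Value=USD \
    --statistic Maximum \
    --period 21600 \
    --evaluation-periods 1 \
    --threshold 100 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$sns_topic_arn"
}

# Usage (placeholder ARN):
# create_billing_alarm "arn:aws:sns:us-east-1:ACCOUNT_ID:billing-alerts"
```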

Step 2: Bootstrap CDK

One-time setup per AWS account/region:

AWS_PROFILE=accessible cdk bootstrap aws://ACCOUNT_ID/us-east-1
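
To avoid typing ACCOUNT_ID by hand, the bootstrap target can be built from the active credentials. This is a sketch; the helper name bootstrap_target is hypothetical.

```shell
# Sketch: look up the account ID for the "accessible" profile and
# print the aws://ACCOUNT_ID/REGION target that cdk bootstrap expects.
bootstrap_target() {
  account_id="$(aws sts get-caller-identity \
    --profile accessible --query Account --output text)" || return 1
  echo "aws://${account_id}/us-east-1"
}

# Usage:
# AWS_PROFILE=accessible cdk bootstrap "$(bootstrap_target)"
```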

Step 3: Store API Keys in SSM Parameter Store

API keys and S3-compatible storage credentials are stored in SSM. Per-environment keys go under /accessible-pdf/{environment}/. Shared keys go under /accessible-pdf/shared/.

S3-Compatible Storage Credentials

The API uses any S3-compatible storage provider (R2, AWS S3, MinIO, etc.) configured via SSM:

# For Cloudflare R2:
aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/S3_ENDPOINT" \
  --value "https://YOUR_CF_ACCOUNT_ID.r2.cloudflarestorage.com" \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/S3_ACCESS_KEY_ID" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/S3_SECRET_ACCESS_KEY" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

# For AWS S3: omit S3_ENDPOINT (uses IAM credentials automatically)
# For MinIO: set S3_ENDPOINT=http://minio:9000

API Keys

# Shared keys (used by both staging and production)
aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/ANTHROPIC_API_KEY" \
  --value "sk-ant-..." \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/GEMINI_API_KEY" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/MISTRAL_API_KEY" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/MARKER_API_KEY" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/MATHPIX_APP_ID" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/shared/MATHPIX_APP_KEY" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

# Environment-specific keys (if needed)
aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/production/SUPABASE_JWT_SECRET" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1

aws ssm put-parameter --profile accessible \
  --name "/accessible-pdf/staging/SUPABASE_JWT_SECRET" \
  --value "..." \
  --type "SecureString" \
  --region us-east-1
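
After storing the keys, it can help to confirm what is actually in Parameter Store. The sketch below lists parameter names (not values) under the /accessible-pdf prefix; both helper names are hypothetical.

```shell
# Hypothetical helper: build a parameter name from a scope and a key.
ssm_param_name() {
  scope="$1"   # "shared", "staging", or "production"
  key="$2"     # e.g. "ANTHROPIC_API_KEY"
  echo "/accessible-pdf/${scope}/${key}"
}

# List the names of all parameters stored under the prefix.
# Values are not requested, so no secrets are printed.
list_stored_params() {
  aws ssm get-parameters-by-path \
    --profile accessible \
    --region us-east-1 \
    --path "/accessible-pdf" \
    --recursive \
    --query "Parameters[].Name" \
    --output text
}

# Usage:
# list_stored_params
# ssm_param_name staging SUPABASE_JWT_SECRET
```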

Step 4: Deploy CDK Stacks

Deploy Staging

cd infra/cdk
npm install

DEPLOY_ENV=staging \
  CDK_DEFAULT_ACCOUNT=YOUR_ACCOUNT_ID \
  CDK_DEFAULT_REGION=us-east-1 \
  AWS_PROFILE=accessible \
  cdk deploy --all

Deploy Production

DEPLOY_ENV=production \
  CDK_DEFAULT_ACCOUNT=YOUR_ACCOUNT_ID \
  CDK_DEFAULT_REGION=us-east-1 \
  AWS_PROFILE=accessible \
  cdk deploy --all

Preview Changes Before Deploying

DEPLOY_ENV=staging cdk diff --all

Each deployment creates 7 stacks per environment:

Stack | Resources
*-Network | VPC, 2 public subnets, security group
*-Storage | DynamoDB table (file storage is external: R2/S3/MinIO via SSM config)
*-Queue | SQS queue, DLQ
*-Compute | ECR repo, launch template, Auto Scaling Group
*-Api | Lambda function, HTTP API Gateway
*-Email | SES receipt rule, email S3 bucket, email Lambda
*-Monitoring | CloudWatch dashboard, alarms, SNS topic
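
To confirm that all seven stacks deployed, their status can be read back from CloudFormation. This is a sketch; the helper names are hypothetical, and the stack names are assembled from the prefixes and suffixes listed in this guide.

```shell
# Hypothetical helper: print the seven stack names for one environment.
stack_names() {
  case "$1" in
    staging)    prefix="AccessiblePdfStaging" ;;
    production) prefix="AccessiblePdfProd" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
  for suffix in Network Storage Queue Compute Api Email Monitoring; do
    echo "${prefix}-${suffix}"
  done
}

# Print "<stack name> <status>" for each stack in the environment.
check_stacks() {
  for stack in $(stack_names "$1"); do
    status="$(aws cloudformation describe-stacks \
      --profile accessible --region us-east-1 \
      --stack-name "$stack" \
      --query "Stacks[0].StackStatus" --output text)"
    echo "$stack $status"
  done
}

# Usage:
# check_stacks staging
```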

Step 5: Build and Push Worker Docker Image

# From project root
npm install
npm run build --workspace=packages/shared
npm run build --workspace=workers/batch

# Get ECR URI from stack output (use the Staging or Prod stack prefix)
ECR_URI=$(aws cloudformation describe-stacks \
  --stack-name AccessiblePdfStaging-Compute \
  --query 'Stacks[0].Outputs[?OutputKey==`EcrRepoUri`].OutputValue' \
  --output text --region us-east-1 --profile accessible)

# Log in to the ECR registry (the host part of the URI, without the repository path)
aws ecr get-login-password --region us-east-1 --profile accessible | \
  docker login --username AWS --password-stdin "${ECR_URI%%/*}"

# Build and push
docker build -f infra/cdk/docker/worker/Dockerfile -t "$ECR_URI:latest" .
docker push "$ECR_URI:latest"
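
The Rollback section relies on a previous Docker image tag being available, which a lone "latest" tag does not provide. One possible approach, sketched below, is to push each image under the current commit hash as well. The function name is hypothetical, and the sketch assumes it runs from the project root inside the git checkout, after the ECR login above.

```shell
# Sketch: build once, then push the image as both "latest" and the short commit hash.
push_worker_image() {
  ecr_uri="$1"   # repository URI, e.g. the ECR_URI value from the stack output
  git_sha="$(git rev-parse --short HEAD)" || return 1
  docker build -f infra/cdk/docker/worker/Dockerfile \
    -t "${ecr_uri}:latest" -t "${ecr_uri}:${git_sha}" . || return 1
  docker push "${ecr_uri}:latest" || return 1
  docker push "${ecr_uri}:${git_sha}"
}

# Usage:
# push_worker_image "$ECR_URI"
```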

Step 6: Build and Deploy Lambda Functions

API Lambda

cd workers/api
npm run build # outputs to dist-lambda/
cd ../../infra/cdk
DEPLOY_ENV=staging cdk deploy AccessiblePdfStaging-Api

Email Intake Lambda

cd workers/email-intake
npm run build # outputs to dist/
cd ../../infra/cdk
DEPLOY_ENV=staging cdk deploy AccessiblePdfStaging-Email

Step 7: Configure SES for Email Intake

Verify Domain

aws ses verify-domain-identity --domain pdf.theaccessible.org --region us-east-1 --profile accessible

Add DNS Records

MX record (for receiving):

pdf.theaccessible.org. MX 10 inbound-smtp.us-east-1.amazonaws.com.

SPF record (for sending):

pdf.theaccessible.org. TXT "v=spf1 include:amazonses.com ~all"

DKIM records: 3 CNAME records generated by SES after domain verification.
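
The DKIM tokens can be requested from the CLI and turned into record lines in the same style as the MX and SPF entries above. This is a sketch with hypothetical helper names; it assumes the standard SES Easy DKIM record layout (token._domainkey.domain pointing at token.dkim.amazonses.com).

```shell
# Print the CNAME record for one DKIM token.
dkim_cname() {
  token="$1"
  domain="$2"
  echo "${token}._domainkey.${domain}. CNAME ${token}.dkim.amazonses.com."
}

# Request DKIM tokens from SES and print one CNAME record per token.
print_dkim_records() {
  domain="$1"
  for token in $(aws ses verify-domain-dkim \
      --domain "$domain" --region us-east-1 --profile accessible \
      --query "DkimTokens" --output text); do
    dkim_cname "$token" "$domain"
  done
}

# Usage:
# print_dkim_records pdf.theaccessible.org
```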

Request Production Access

SES starts in sandbox mode. Request production access in the SES console.

Step 8: Custom Domain for API Gateway

  1. Request an ACM certificate for api-pdf.theaccessible.org in us-east-1
  2. Add a custom domain mapping in the API Gateway console
  3. Create a CNAME in Cloudflare DNS pointing to the API Gateway domain
  4. Keep the Cloudflare proxy enabled on that record, so traffic gets Cloudflare DDoS protection in front of AWS compute
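
The certificate request in step 1 can also be made from the CLI instead of the console. A minimal sketch, with a hypothetical function name; DNS validation is an assumption here, and the validation record it produces still has to be added in Cloudflare before the certificate is issued.

```shell
# Sketch: request a DNS-validated certificate and print its ARN.
request_api_certificate() {
  aws acm request-certificate \
    --profile accessible \
    --region us-east-1 \
    --domain-name "api-pdf.theaccessible.org" \
    --validation-method DNS \
    --query CertificateArn \
    --output text
}

# Usage:
# CERT_ARN="$(request_api_certificate)"
```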

CI/CD Pipeline

Automatic Flow

Push to main → Tests → Auto-deploy to staging → Smoke test
Manual trigger → Tests → Deploy to production → Smoke test
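
This guide does not specify what the smoke test checks. As an illustration only, a smoke test step could be as small as the sketch below, which fails unless a URL answers with HTTP 200. The function name and the /health path in the usage line are hypothetical.

```shell
# Hypothetical smoke test: succeed only if the URL answers with HTTP 200.
smoke_test() {
  url="$1"
  status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")"
  if [ "$status" = "200" ]; then
    echo "ok: $url"
  else
    echo "failed: $url returned $status" >&2
    return 1
  fi
}

# Usage (hypothetical endpoint):
# smoke_test "https://api-pdf.theaccessible.org/health"
```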

GitHub Secrets Required

Secret | Description
AWS_ACCESS_KEY_ID | IAM user access key
AWS_SECRET_ACCESS_KEY | IAM user secret key
AWS_ACCOUNT_ID | AWS account ID
PACKAGES_TOKEN | GitHub Packages token (for @anglinai/ui)

GitHub Environments

Create two environments in GitHub repo settings:

  • staging: No protection rules (auto-deploys)
  • production: Add required reviewers or manual approval

Manual Production Deploy

Go to Actions > Deploy AWS > Run workflow > Select "production"

Day-to-Day Workflow

1. Create feature branch from main
2. Make changes
3. Push branch, create PR
4. PR merge → auto-deploys to staging
5. Test on staging
6. Manual promote to production (GitHub Actions)

Rollback

  • Lambda: Redeploy previous commit via GitHub Actions
  • EC2 Workers: Push previous Docker image tag, trigger ASG refresh
  • Infrastructure: cdk deploy with reverted code
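
The EC2 worker rollback can be sketched as follows. It assumes the workers pull the "latest" tag at boot, that the previous tag still exists in the repository, and that you are already logged in to ECR as in Step 5. The function name is hypothetical, and the Auto Scaling Group name has to be looked up in the Compute stack.

```shell
# Sketch: re-point "latest" at a previously pushed tag, then replace running instances.
rollback_worker() {
  ecr_uri="$1"    # repository URI, as in Step 5
  prev_tag="$2"   # a tag that already exists in the repository
  asg_name="$3"   # Auto Scaling Group name from the Compute stack
  docker pull "${ecr_uri}:${prev_tag}" || return 1
  docker tag "${ecr_uri}:${prev_tag}" "${ecr_uri}:latest" || return 1
  docker push "${ecr_uri}:latest" || return 1
  aws autoscaling start-instance-refresh \
    --profile accessible --region us-east-1 \
    --auto-scaling-group-name "$asg_name"
}

# Usage (placeholder values):
# rollback_worker "$ECR_URI" PREVIOUS_TAG ASG_NAME
```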

Auto Scaling Behavior

Queue Depth | Staging | Production
0 messages | 0 instances | 0 instances
1+ messages | 1 instance | +1 instance
10+ messages | 1 (max) | +2 instances
50+ messages | 1 (max) | 4 (max)

Cooldown: 5 minutes. Cold start: ~2-3 minutes (instance launch + Docker pull).
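
One way to read the table is as a step-scaling rule: "+1" and "+2" add instances to the current count, capped at the environment maximum. The sketch below encodes that reading as a pure function. It is an interpretation of the table, not the scaling policy defined in CDK, and the function name is hypothetical.

```shell
# Sketch: next desired capacity for an environment, given the current
# instance count and the queue depth. Assumes "+N" means "add N, up to the max".
next_capacity() {
  env_name="$1"
  current="$2"
  depth="$3"
  if [ "$depth" -le 0 ]; then
    echo 0          # empty queue: scale to zero in both environments
    return
  fi
  if [ "$env_name" = "staging" ]; then
    echo 1          # staging never runs more than one instance
    return
  fi
  if [ "$depth" -ge 50 ]; then
    target=4
  elif [ "$depth" -ge 10 ]; then
    target=$((current + 2))
  else
    target=$((current + 1))
  fi
  if [ "$target" -gt 4 ]; then
    target=4        # production maximum
  fi
  echo "$target"
}

next_capacity production 0 12    # prints 2
next_capacity production 3 12    # prints 4
next_capacity staging 0 100      # prints 1
```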

Monitoring

CloudWatch Dashboard

Each environment gets its own dashboard: accessible-pdf-staging-dashboard / accessible-pdf-prod-dashboard

Widgets: queue depth, DLQ messages, instance count, Lambda metrics.

Alarms (Production Only)

Alarm | Condition | Action
DLQ messages | >= 1 message | Email larry@anglin.com
Queue backlog | > 50 for 15 min | Email larry@anglin.com
API 5xx | > 5 errors in 5 min | Email larry@anglin.com

Staging defines the same alarms but has no email subscription, so only production sends notifications.

Cost Estimate

Resource | Staging | Production | Notes
Lambda | Free tier | ~$5/mo | 1M free requests/month
API Gateway | Free tier | ~$3/mo | 1M free requests/month
EC2 Spot | ~$5/mo | ~$20-50/mo | Scales to 0 when idle
S3 | <$1 | ~$5/mo | Depends on volume
DynamoDB | Free tier | ~$5/mo | On-demand, 25 GB free
SQS | Free tier | <$1 | 1M free requests/month
CloudWatch | Free tier | ~$3/mo | Dashboard + alarms
Total | ~$5-10/mo | ~$35-70/mo | At 5-10K docs/month

Environment Variables Reference

Batch Worker (set by EC2 launch template)

Variable | Description
AWS_REGION | AWS region
ENVIRONMENT | staging or production
S3_BUCKET | S3 bucket name
DYNAMODB_TABLE | DynamoDB table name
SQS_QUEUE_URL | SQS queue URL
SQS_DLQ_URL | SQS DLQ URL
SSM_PREFIX | SSM parameter prefix for API keys

API Lambda (set by CDK)

Variable | Description
NODE_ENV | staging or production
ENVIRONMENT | staging or production
S3_BUCKET | S3 bucket name
DYNAMODB_TABLE | DynamoDB table name
SQS_QUEUE_URL | SQS queue URL
FRONTEND_URL | Frontend URL for CORS
SSM_PREFIX | SSM parameter prefix

Email Intake Lambda (set by CDK)

Variable | Description
NODE_ENV | staging or production
S3_BUCKET | S3 bucket name
DYNAMODB_TABLE | DynamoDB table name
SQS_QUEUE_URL | SQS queue URL
FROM_EMAIL | Reply-from address
FRONTEND_URL | Frontend URL for result links
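
When debugging a worker or running one locally, a quick check that the expected variables are set can save time. The helper below is a sketch with a hypothetical name; it is not part of the project.

```shell
# Sketch: report every listed variable that is unset or empty,
# and return non-zero if any are missing.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, with the batch worker variables listed in this guide:
# require_env AWS_REGION ENVIRONMENT S3_BUCKET DYNAMODB_TABLE SQS_QUEUE_URL SQS_DLQ_URL SSM_PREFIX
```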