AWS Setup Guide
This guide covers deploying the Accessible PDF Converter on AWS with separate staging and production environments.
Architecture

    Frontend (Next.js on CF Pages)
        |
        v
    Hono API (Lambda + API Gateway HTTP API)
        |
        +--> DynamoDB (sessions, progress, results)
        +--> S3 (PDF files, converted HTML)
        +--> SQS (pipeline queue)
                |
                v
        EC2 Spot Fleet (ASG, 0-4 instances)
        [Docker: Node.js + Puppeteer + Chrome]
            |            |            |
        S3 (output)   DynamoDB   SES (notifications)

    SES (email intake) --> Lambda --> SQS --> EC2 workers

Multi-Environment Design
Both staging and production deploy to the same AWS account, isolated by resource name prefixes:
| Resource | Staging | Production |
|---|---|---|
| Stack prefix | AccessiblePdfStaging-* | AccessiblePdfProd-* |
| S3 bucket | accessible-pdf-staging-files-{acct} | accessible-pdf-prod-files-{acct} |
| DynamoDB | accessible-pdf-staging-data | accessible-pdf-prod-data |
| SQS queue | accessible-pdf-staging-pipeline | accessible-pdf-prod-pipeline |
| ECR repo | accessible-pdf-staging-worker | accessible-pdf-prod-worker |
| Lambda | accessible-pdf-staging-api | accessible-pdf-prod-api |
| Dashboard | accessible-pdf-staging-dashboard | accessible-pdf-prod-dashboard |
Environment-specific settings are in infra/cdk/lib/env-config.ts.
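As a rough illustration of how the names in the table above could be derived from a single environment name, the sketch below builds them in TypeScript. The `DeployEnv` type, the `resourceNames` function, and the sample account ID are hypothetical; the real contents of infra/cdk/lib/env-config.ts are not shown in this guide and may be organized differently.

```typescript
// Hypothetical sketch: NOT the actual contents of infra/cdk/lib/env-config.ts.
type DeployEnv = "staging" | "production";

interface ResourceNames {
  stackPrefix: string;
  bucket: string;
  table: string;
  queue: string;
  ecrRepo: string;
  lambda: string;
  dashboard: string;
}

// The table uses "staging" as-is and shortens "production" to "prod".
function shortName(env: DeployEnv): string {
  return env === "production" ? "prod" : "staging";
}

function resourceNames(env: DeployEnv, accountId: string): ResourceNames {
  const base = `accessible-pdf-${shortName(env)}`;
  return {
    stackPrefix: env === "production" ? "AccessiblePdfProd-" : "AccessiblePdfStaging-",
    bucket: `${base}-files-${accountId}`, // {acct} in the table is the AWS account ID
    table: `${base}-data`,
    queue: `${base}-pipeline`,
    ecrRepo: `${base}-worker`,
    lambda: `${base}-api`,
    dashboard: `${base}-dashboard`,
  };
}
```

Deriving every name from one input keeps the two environments from colliding inside the shared account, which is the isolation mechanism described above.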
Prerequisites
- AWS CLI configured with credentials (aws configure)
- Node.js 20+
- Docker installed
- CDK CLI (npm install -g aws-cdk)
Step 1: AWS Account Setup
Create Sub-Account (from root account)
- Go to AWS Organizations > Create organization (if not already enabled)
- Add account > Create an AWS account:
  - Account name: AnglinAI-Accessible
  - Email: larry+aws-accessible@anglin.com
  - IAM role: OrganizationAccountAccessRole (default)
- Consolidated billing is automatic; costs appear on the root account bill
Create IAM User
- Sign into the sub-account via "Switch Role" from root
- Create IAM user cdk-deployer with AdministratorAccess
- Create access key for CLI use
Configure AWS CLI

    aws configure --profile accessible
    # Access key, secret, region: us-east-1, output: json

Set Up Cost Tracking
- Create a billing alarm at $100/month in CloudWatch
- Tag project:accessible-pdf is applied automatically by CDK to all resources
Step 2: Bootstrap CDK
One-time setup per AWS account/region:

    AWS_PROFILE=accessible cdk bootstrap aws://ACCOUNT_ID/us-east-1

Step 3: Store API Keys in SSM Parameter Store
API keys and S3-compatible storage credentials are stored in SSM. Per-environment keys go under /accessible-pdf/{environment}/. Shared keys go under /accessible-pdf/shared/.
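The path layout described above can be sketched as a small helper. The function names are hypothetical, and the lookup order (environment-specific path first, then shared) is an assumption for illustration; this guide does not state how the application resolves a key that exists in both places.

```typescript
// Hypothetical helper: builds SSM parameter names following the
// /accessible-pdf/{environment}/ and /accessible-pdf/shared/ layout.
type DeployEnv = "staging" | "production";
type Scope = DeployEnv | "shared";

function ssmParameterName(scope: Scope, key: string): string {
  // SSM parameter name segments may contain letters, digits, and . - _
  if (!/^[A-Za-z0-9_.-]+$/.test(key)) {
    throw new Error(`Invalid parameter key: ${key}`);
  }
  return `/accessible-pdf/${scope}/${key}`;
}

// ASSUMPTION: environment-specific values take precedence over shared ones.
function candidatePaths(env: DeployEnv, key: string): string[] {
  return [ssmParameterName(env, key), ssmParameterName("shared", key)];
}
```

For example, `candidatePaths("staging", "SUPABASE_JWT_SECRET")` yields the staging path followed by the shared path.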
S3-Compatible Storage Credentials
The API uses any S3-compatible storage provider (R2, AWS S3, MinIO, etc.) configured via SSM:

    # For Cloudflare R2:
    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/S3_ENDPOINT" \
      --value "https://YOUR_CF_ACCOUNT_ID.r2.cloudflarestorage.com" \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/S3_ACCESS_KEY_ID" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/S3_SECRET_ACCESS_KEY" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    # For AWS S3: omit S3_ENDPOINT (uses IAM credentials automatically)
    # For MinIO: set S3_ENDPOINT=http://minio:9000

API Keys

    # Shared keys (used by both staging and production)
    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/ANTHROPIC_API_KEY" \
      --value "sk-ant-..." \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/GEMINI_API_KEY" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/MISTRAL_API_KEY" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/MARKER_API_KEY" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/MATHPIX_APP_ID" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/shared/MATHPIX_APP_KEY" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    # Environment-specific keys (if needed)
    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/production/SUPABASE_JWT_SECRET" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

    aws ssm put-parameter --profile accessible \
      --name "/accessible-pdf/staging/SUPABASE_JWT_SECRET" \
      --value "..." \
      --type "SecureString" \
      --region us-east-1

Step 4: Deploy CDK Stacks
Deploy Staging

    cd infra/cdk
    npm install

    DEPLOY_ENV=staging \
    CDK_DEFAULT_ACCOUNT=YOUR_ACCOUNT_ID \
    CDK_DEFAULT_REGION=us-east-1 \
    AWS_PROFILE=accessible \
    cdk deploy --all

Deploy Production

    DEPLOY_ENV=production \
    CDK_DEFAULT_ACCOUNT=YOUR_ACCOUNT_ID \
    CDK_DEFAULT_REGION=us-east-1 \
    AWS_PROFILE=accessible \
    cdk deploy --all

Preview Changes Before Deploying

    DEPLOY_ENV=staging cdk diff --all

A full deploy (cdk deploy --all) creates 7 stacks per environment:
| Stack | Resources |
|---|---|
| *-Network | VPC, 2 public subnets, security group |
| *-Storage | DynamoDB table (file storage is external: R2/S3/MinIO via SSM config) |
| *-Queue | SQS queue, DLQ |
| *-Compute | ECR repo, launch template, Auto Scaling Group |
| *-Api | Lambda function, HTTP API Gateway |
| *-Email | SES receipt rule, email S3 bucket, email Lambda |
| *-Monitoring | CloudWatch dashboard, alarms, SNS topic |
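To make the `*-` wildcard in the table concrete, the following sketch expands it into full stack names for each environment. The `stackNames` function is hypothetical; the prefixes and suffixes are taken from the tables in this guide.

```typescript
// Hypothetical helper: expands the "*-Suffix" pattern into full stack names.
type DeployEnv = "staging" | "production";

const STACK_SUFFIXES: readonly string[] = [
  "Network",
  "Storage",
  "Queue",
  "Compute",
  "Api",
  "Email",
  "Monitoring",
];

function stackNames(env: DeployEnv): string[] {
  const prefix = env === "production" ? "AccessiblePdfProd" : "AccessiblePdfStaging";
  return STACK_SUFFIXES.map((suffix) => `${prefix}-${suffix}`);
}
```

A single name from this list can be passed to `cdk deploy` in place of `--all` to deploy one stack, as the Lambda steps later in this guide do with AccessiblePdfStaging-Api.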
Step 5: Build and Push Worker Docker Image

    # From project root
    npm install
    npm run build --workspace=packages/shared
    npm run build --workspace=workers/batch

    # Get ECR URI from stack output (use Staging or Prod prefix)
    ECR_URI=$(aws cloudformation describe-stacks \
      --stack-name AccessiblePdfStaging-Compute \
      --query 'Stacks[0].Outputs[?OutputKey==`EcrRepoUri`].OutputValue' \
      --output text --profile accessible)

    # Login to ECR
    aws ecr get-login-password --region us-east-1 --profile accessible | \
      docker login --username AWS --password-stdin $ECR_URI

    # Build and push
    docker build -f infra/cdk/docker/worker/Dockerfile -t $ECR_URI:latest .
    docker push $ECR_URI:latest

Step 6: Build and Deploy Lambda Functions
API Lambda

    cd workers/api
    npm run build   # outputs to dist-lambda/
    cd ../../infra/cdk
    DEPLOY_ENV=staging cdk deploy AccessiblePdfStaging-Api

Email Intake Lambda

    cd workers/email-intake
    npm run build   # outputs to dist/
    cd ../../infra/cdk
    DEPLOY_ENV=staging cdk deploy AccessiblePdfStaging-Email

Step 7: Configure SES for Email Intake
Verify Domain

    aws ses verify-domain-identity --domain pdf.theaccessible.org --region us-east-1 --profile accessible

Add DNS Records
MX record (for receiving):

    pdf.theaccessible.org. MX 10 inbound-smtp.us-east-1.amazonaws.com.

SPF record (for sending):

    pdf.theaccessible.org. TXT "v=spf1 include:amazonses.com ~all"

DKIM records: 3 CNAME records generated by SES after domain verification.
Request Production Access
SES starts in sandbox mode. Request production access in the SES console.
Step 8: Custom Domain for API Gateway
- Request ACM certificate for api-pdf.theaccessible.org in us-east-1
- Add custom domain mapping in API Gateway console
- Create CNAME in Cloudflare DNS pointing to the API Gateway domain
- Cloudflare continues to proxy (DDoS protection + AWS compute)
CI/CD Pipeline
Automatic Flow

    Push to main --> Tests --> Auto-deploy to staging --> Smoke test
    Manual trigger --> Tests --> Deploy to production --> Smoke test

GitHub Secrets Required
| Secret | Description |
|---|---|
| AWS_ACCESS_KEY_ID | IAM user access key |
| AWS_SECRET_ACCESS_KEY | IAM user secret key |
| AWS_ACCOUNT_ID | AWS account ID |
| PACKAGES_TOKEN | GitHub Packages token (for @anglinai/ui) |
GitHub Environments
Create two environments in GitHub repo settings:
- staging: No protection rules (auto-deploys)
- production: Add required reviewers or manual approval
Manual Production Deploy
Go to Actions > Deploy AWS > Run workflow > Select "production"
Day-to-Day Workflow
1. Create feature branch from main
2. Make changes
3. Push branch, create PR
4. PR merge --> auto-deploys to staging
5. Test on staging
6. Manual promote to production (GitHub Actions)

Rollback
- Lambda: Redeploy previous commit via GitHub Actions
- EC2 Workers: Push previous Docker image tag, trigger ASG refresh
- Infrastructure: cdk deploy with reverted code
Auto Scaling Behavior
| Queue Depth | Staging | Production |
|---|---|---|
| 0 messages | 0 instances | 0 instances |
| 1+ messages | 1 instance | +1 instance |
| 10+ messages | 1 (max) | +2 instances |
| 50+ messages | 1 (max) | 4 (max) |
Cooldown: 5 minutes. Cold start: ~2-3 minutes (instance launch + Docker pull).
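One way to read the table is as a step-scaling policy. The sketch below models it under stated assumptions: the "+1" and "+2" entries are treated as adjustments to the current instance count, capped at the environment maximum, and the 50+ tier jumps straight to the maximum. The function and constant names are hypothetical, and the actual CDK scaling policy may be defined differently.

```typescript
// Hypothetical model of the scaling table; not the actual CDK policy.
type DeployEnv = "staging" | "production";

const MAX_INSTANCES: Record<DeployEnv, number> = { staging: 1, production: 4 };

function nextCapacity(env: DeployEnv, queueDepth: number, current: number): number {
  const max = MAX_INSTANCES[env];
  if (queueDepth <= 0) return 0; // empty queue: scale to zero
  if (env === "staging") return max; // any backlog: the single staging instance
  if (queueDepth >= 50) return max; // large backlog: go straight to the maximum
  const step = queueDepth >= 10 ? 2 : 1; // ASSUMPTION: "+N" means add N to current
  return Math.min(current + step, max);
}
```

Under this reading, a production queue holding 12 messages with one running instance would move to three instances, and the 5-minute cooldown would delay any further adjustment.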
Monitoring
CloudWatch Dashboard
Each environment gets its own dashboard: accessible-pdf-staging-dashboard / accessible-pdf-prod-dashboard
Widgets: queue depth, DLQ messages, instance count, Lambda metrics.
Alarms (Production Only)
| Alarm | Condition | Action |
|---|---|---|
| DLQ messages | >= 1 message | Email larry@anglin.com |
| Queue backlog | > 50 for 15 min | Email larry@anglin.com |
| API 5xx | > 5 errors in 5 min | Email larry@anglin.com |
Staging defines the same alarms but has no email subscription, so only production alarms send notifications.
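The alarm conditions in the table can be expressed as simple predicates. This is a sketch only: the `MetricsSnapshot` shape and `firingAlarms` function are hypothetical, and the 5-minute sampling period (so that 15 minutes equals three consecutive samples) is an assumption rather than a documented CloudWatch setting for this project.

```typescript
// Hypothetical model of the alarm conditions; not the CloudWatch configuration itself.
interface MetricsSnapshot {
  dlqMessages: number;
  // ASSUMPTION: one queue-depth sample per 5 minutes, oldest first.
  queueDepthSamples: number[];
  api5xxLast5Min: number;
}

function firingAlarms(m: MetricsSnapshot): string[] {
  const firing: string[] = [];
  if (m.dlqMessages >= 1) firing.push("DLQ messages");
  // "> 50 for 15 min" is read as the last three 5-minute samples all exceeding 50.
  const lastThree = m.queueDepthSamples.slice(-3);
  if (lastThree.length === 3 && lastThree.every((depth) => depth > 50)) {
    firing.push("Queue backlog");
  }
  if (m.api5xxLast5Min > 5) firing.push("API 5xx");
  return firing;
}
```

A brief spike above 50 messages does not trigger the backlog alarm under this reading; the queue has to stay above the threshold for the whole window.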
Cost Estimate
| Resource | Staging | Production | Notes |
|---|---|---|---|
| Lambda | Free tier | ~$5/mo | 1M free requests/month |
| API Gateway | Free tier | ~$3/mo | 1M free requests/month |
| EC2 Spot | ~$5/mo | ~$20-50/mo | Scales to 0 when idle |
| S3 | <$1 | ~$5/mo | Depends on volume |
| DynamoDB | Free tier | ~$5/mo | On-demand, 25 GB free |
| SQS | Free tier | <$1 | 1M free requests/month |
| CloudWatch | Free tier | ~$3/mo | Dashboard + alarms |
| Total | ~$5-10/mo | ~$35-70/mo | At 5-10K docs/month |
Environment Variables Reference
Batch Worker (set by EC2 launch template)
| Variable | Description |
|---|---|
| AWS_REGION | AWS region |
| ENVIRONMENT | staging or production |
| S3_BUCKET | S3 bucket name |
| DYNAMODB_TABLE | DynamoDB table name |
| SQS_QUEUE_URL | SQS queue URL |
| SQS_DLQ_URL | SQS DLQ URL |
| SSM_PREFIX | SSM parameter prefix for API keys |
API Lambda (set by CDK)
| Variable | Description |
|---|---|
| NODE_ENV | staging or production |
| ENVIRONMENT | staging or production |
| S3_BUCKET | S3 bucket name |
| DYNAMODB_TABLE | DynamoDB table name |
| SQS_QUEUE_URL | SQS queue URL |
| FRONTEND_URL | Frontend URL for CORS |
| SSM_PREFIX | SSM parameter prefix |
Email Intake Lambda (set by CDK)
| Variable | Description |
|---|---|
| NODE_ENV | staging or production |
| S3_BUCKET | S3 bucket name |
| DYNAMODB_TABLE | DynamoDB table name |
| SQS_QUEUE_URL | SQS queue URL |
| FROM_EMAIL | Reply-from address |
| FRONTEND_URL | Frontend URL for result links |
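A startup check against these tables can catch a misconfigured deployment early. The sketch below is a hypothetical helper, not code from the project: the `Component` names and `missingVars` function are invented for illustration, while the variable lists are copied from the three tables above.

```typescript
// Hypothetical startup check; variable lists mirror the tables in this section.
type Component = "batchWorker" | "apiLambda" | "emailIntakeLambda";

const REQUIRED_VARS: Record<Component, readonly string[]> = {
  batchWorker: [
    "AWS_REGION", "ENVIRONMENT", "S3_BUCKET", "DYNAMODB_TABLE",
    "SQS_QUEUE_URL", "SQS_DLQ_URL", "SSM_PREFIX",
  ],
  apiLambda: [
    "NODE_ENV", "ENVIRONMENT", "S3_BUCKET", "DYNAMODB_TABLE",
    "SQS_QUEUE_URL", "FRONTEND_URL", "SSM_PREFIX",
  ],
  emailIntakeLambda: [
    "NODE_ENV", "S3_BUCKET", "DYNAMODB_TABLE",
    "SQS_QUEUE_URL", "FROM_EMAIL", "FRONTEND_URL",
  ],
};

// Returns the names of required variables that are unset or empty.
function missingVars(
  component: Component,
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_VARS[component].filter((name) => !env[name]);
}
```

In a Node.js process this could be called as `missingVars("apiLambda", process.env)`, failing fast if the returned list is not empty.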