Customer-Dedicated AWS Deployment
Overview
This document assesses the feasibility of deploying a dedicated instance of The Accessible on a customer's own AWS account, including AWS GovCloud. The goal is to understand what work is required, what's already portable, and what the major obstacles are.
Current Architecture
The application runs across two platforms:
- Cloudflare: static frontend (Pages), API (Workers), object storage (R2), KV (sessions/rate limiting), browser rendering
- AWS: Lambda API (failover), EC2 worker fleet (ASG), SQS queues, DynamoDB, SES email intake, CloudWatch monitoring
- Supabase: PostgreSQL database, authentication (magic link + Google OAuth)
- Third-party SaaS: AI/OCR services (Claude, Gemini, OpenAI, Mistral, Mathpix, Marker), Stripe payments, Resend email
Portability by Layer
Already Portable (Low Effort)
| Layer | Current | Customer AWS | Notes |
|---|---|---|---|
| Frontend | Cloudflare Pages | S3 + CloudFront | Static Next.js export, deploy anywhere |
| API | CF Workers + Lambda | Lambda + API Gateway | CDK stacks already exist in infra/cdk/ |
| Object Storage | Cloudflare R2 | AWS S3 | Uses S3Client with configurable endpoint (workers/api/src/providers/node-server.ts) |
| KV Store | Cloudflare KV | DynamoDB | Already built in CDK, KV abstraction exists in providers |
| Queue | CF cron + SQS | SQS | Already in CDK |
| PDF Processing | WeasyPrint + Audiveris (Docker) | Same containers on EC2/ECS | Dockerfiles exist, ECR push in CI |
| Browser Rendering | CF Browser Rendering | Puppeteer on EC2 | Node.js server mode already uses native Puppeteer |
| Monitoring | Grafana + Loki | Same stack or CloudWatch | Docker Compose available |
| Email | Resend + SES | AWS SES | SES email intake already built in CDK email stack |
| Payments | Stripe | Customer's Stripe account | Just different API keys |
| AI Services | Claude, Gemini, OpenAI, etc. | Same APIs | All configured via env vars/API keys |
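The object-storage row above is portable because the S3 client endpoint is configurable. A minimal sketch of that pattern, with illustrative function and env var names (not the actual code in workers/api/src/providers/node-server.ts):

```typescript
// Resolve S3-compatible client options from the environment: a custom
// endpoint for Cloudflare R2, or the native AWS S3 endpoint otherwise.
interface StorageConfig {
  endpoint?: string; // set for R2; undefined lets the SDK derive it from region
  region: string;
}

function storageConfig(env: Record<string, string | undefined>): StorageConfig {
  if (env.R2_ACCOUNT_ID) {
    // Cloudflare R2 exposes an S3-compatible endpoint per account
    return {
      endpoint: `https://${env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
      region: "auto",
    };
  }
  // Plain AWS S3: the SDK derives the endpoint from the region
  return { region: env.AWS_REGION ?? "us-east-1" };
}
```

The resulting object can be spread into the `S3Client` constructor from `@aws-sdk/client-s3`, which accepts both `endpoint` and `region`, so the same upload/download code runs against either backend.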
Requires Significant Work
Database: Supabase → RDS PostgreSQL (Medium)
The 54 SQL migrations in supabase/migrations/ are standard PostgreSQL and will run on RDS. However:
- Some migrations reference Supabase-specific schemas (`auth.uid()`, `auth.jwt()` in RLS policies, `auth.users` table joins)
- The API uses `@supabase/supabase-js` for all database queries across ~30+ route files
- Query patterns are mostly `.from('table').select().eq()`, which translate straightforwardly to any query builder
Work needed:
- Strip or replace `auth.*` references in RLS policies
- Move authorization to the API layer (simpler than reimplementing RLS with Cognito)
- Replace `@supabase/supabase-js` DB calls with a direct Postgres client (e.g., `pg`, `drizzle-orm`, or `kysely`)
- Replace `auth.users` foreign keys with a standalone `users` table
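To illustrate the query translation, a sketch of how the `.from().select().eq()` pattern could map to a parameterized statement for a direct Postgres client such as `pg`. The table and column names are examples, not taken from the codebase:

```typescript
// A typical supabase-js call:
//   supabase.from('documents').select('*').eq('user_id', userId)
// becomes a parameterized SQL statement for a direct Postgres client.

interface SqlQuery {
  text: string;
  values: unknown[];
}

// Build a single-table SELECT with equality filters, mirroring the
// .from().select().eq() pattern. Identifiers are double-quoted; values
// are bound as $1, $2, ... parameters to avoid SQL injection.
function selectEq(
  table: string,
  columns: string[],
  filters: Record<string, unknown>
): SqlQuery {
  const cols = columns.length ? columns.map((c) => `"${c}"`).join(", ") : "*";
  const entries = Object.entries(filters);
  const where = entries
    .map(([col], i) => `"${col}" = $${i + 1}`)
    .join(" AND ");
  const text =
    `SELECT ${cols} FROM "${table}"` + (where ? ` WHERE ${where}` : "");
  return { text, values: entries.map(([, v]) => v) };
}
```

With `pg`, the result feeds straight into `pool.query(q.text, q.values)`; with `kysely` or `drizzle-orm` the same call sites would use the builder's own fluent API instead.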
Authentication: Supabase Auth → AWS Cognito (High)
This is the single largest piece of work. Supabase Auth is deeply integrated:
- Frontend: `supabase.auth.signInWithOtp()`, `signInWithOAuth()`, `getSession()`, `onAuthStateChange()` in auth-context for web, music, and forms apps
- API middleware (`workers/api/src/middleware/auth.ts`): JWT verification via Supabase JWKS endpoint
- Per-user DB clients (`workers/api/src/utils/supabase.ts`): creates Supabase clients with user JWT for RLS
Work needed:
- Replace Supabase auth calls with Cognito SDK or Amplify Auth in 3+ frontend apps
- Replace magic link flow with Cognito custom auth challenge or hosted UI
- Replace Google OAuth with Cognito identity provider federation
- Rewrite JWT verification middleware for Cognito tokens
- Map Cognito `sub` to the app's user ID system
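The rewritten middleware boils down to a few claim checks once the token signature is verified (a library such as `aws-jwt-verify` handles the JWKS lookup and signature). A sketch of that claim-validation step as a pure function; the claim set and parameter names are standard Cognito, the function itself is hypothetical:

```typescript
// Decoded Cognito access-token claims relevant to API authorization.
interface CognitoClaims {
  sub: string;       // Cognito user ID, to be mapped to the app's user ID
  iss: string;       // https://cognito-idp.<region>.amazonaws.com/<userPoolId>
  token_use: string; // "access" or "id"
  exp: number;       // expiry, seconds since epoch
}

// Validate claims after signature verification. Returns the Cognito sub
// (for the sub -> app user ID mapping) on success, or null on any failure.
function validateClaims(
  claims: CognitoClaims,
  region: string,
  userPoolId: string,
  nowSeconds: number
): string | null {
  const expectedIss = `https://cognito-idp.${region}.amazonaws.com/${userPoolId}`;
  if (claims.iss !== expectedIss) return null; // token from the wrong pool
  if (claims.token_use !== "access") return null; // reject id tokens here
  if (claims.exp <= nowSeconds) return null; // expired
  return claims.sub;
}
```

The same issuer format applies in GovCloud (e.g. `cognito-idp.us-gov-west-1.amazonaws.com`), so only the region and pool ID change per environment.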
Third-Party AI/OCR Service Concerns
Security-conscious or government customers may object to sending document data to third-party SaaS. This is relevant for:
Mathpix & Marker: Most Likely Objections
- Mathpix sends documents to `api.mathpix.com` for math/equation OCR. No self-hosted option. Best-in-class for LaTeX extraction from scanned PDFs.
- Marker API sends documents to Datalab's servers for document parsing.
- Both are already optional. The converter cascade skips them if no API key is configured. The system degrades gracefully: equation-heavy documents won't convert as well.
- Fallback: Google Cloud Vision OCR (via Vertex AI) or self-hosted Tesseract. Lower quality for math content.
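The graceful-degradation behaviour described above can be sketched as a cascade that simply drops any converter whose API key is absent. The converter names and env var names below are illustrative, not the real cascade configuration:

```typescript
// A converter is usable only if its required API key (if any) is configured.
interface Converter {
  name: string;
  requiredKey?: string; // env var holding the API key; undefined = self-hosted
}

// Ordered by output quality; the pipeline tries each in turn.
const cascade: Converter[] = [
  { name: "mathpix", requiredKey: "MATHPIX_API_KEY" },
  { name: "marker", requiredKey: "DATALAB_API_KEY" },
  { name: "tesseract" }, // self-hosted fallback, always available
];

// Filter the cascade down to converters the current environment can run,
// so a restricted deployment with no SaaS keys still converts documents.
function availableConverters(
  cascade: Converter[],
  env: Record<string, string | undefined>
): string[] {
  return cascade
    .filter((c) => !c.requiredKey || !!env[c.requiredKey])
    .map((c) => c.name);
}
```

In a locked-down GovCloud deployment with no third-party keys, only the self-hosted fallback remains, which is exactly the quality trade-off described above.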
AI LLM APIs: Possible Objections
The core conversion pipeline requires at least one vision LLM. Options by data sensitivity:
- Gemini via Vertex AI (most government-friendly): runs in the customer's own GCP project, data stays within their boundary
- Claude / OpenAI / Mistral: commercial SaaS, documents leave the customer's infrastructure
- Self-hosted open-source models (e.g., LLaVA, Qwen-VL on EC2 GPU): data stays in the VPC but quality drops significantly and cost increases
Minimum viable for restricted environments: Gemini on Vertex AI only.
GovCloud-Specific Considerations
- AI API access: GovCloud VPCs may restrict outbound traffic to commercial AI APIs. This could be a blocker if even Google Vertex AI is disallowed. FedRAMP-authorized alternatives may be required.
- AWS partition: GovCloud uses the `aws-us-gov` partition. CDK handles this via `Stack.of(this).partition`.
- Cognito: Available in GovCloud (us-gov-west-1).
- SES: Available in GovCloud with restrictions.
- Stripe: May need a government-approved payment processor instead.
- Docker images: ECR works in GovCloud, but pulling base images from Docker Hub may be restricted. Pre-build and push to the customer's ECR.
- Data residency: Self-hosted services (WeasyPrint, Audiveris) run in the customer's VPC, so no concern there.
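Partition handling matters anywhere an ARN is built by hand. Inside CDK, `Stack.of(this).partition` resolves it automatically; any application code that constructs ARNs directly needs the same treatment. A sketch with a hypothetical helper (not taken from the CDK stacks), covering only the commercial and GovCloud partitions this document discusses:

```typescript
// Build an ARN using the correct partition: "aws" for commercial regions,
// "aws-us-gov" for GovCloud. Hardcoding "arn:aws:..." breaks in GovCloud.
function buildArn(
  region: string,
  service: string,
  account: string,
  resource: string
): string {
  const partition = region.startsWith("us-gov-") ? "aws-us-gov" : "aws";
  return `arn:${partition}:${service}:${region}:${account}:${resource}`;
}
```

Any hardcoded `arn:aws:` prefixes found during the URL-cleanup pass (see the effort table) should go through a helper like this instead.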
Effort Estimate
| Work Item | Effort | Notes |
|---|---|---|
| CDK env config for customer | 1-2 days | New entry in infra/cdk/lib/env-config.ts |
| Deploy CDK stacks | 1 day | cdk deploy --all with customer credentials |
| RDS setup + migrations | 1-2 days | Strip auth.* references, test schema |
| Auth: Cognito (API side) | 3-5 days | New middleware, JWT verification, user management |
| Auth: Cognito (Frontend) | 3-5 days | Replace Supabase auth in 3+ apps |
| Replace Supabase DB client | 5-8 days | ~30+ route files, query translation, connection pooling |
| Frontend deploy (S3+CloudFront) | 1 day | Build, upload, configure distribution |
| SES domain setup | 0.5 days | DKIM/SPF verification |
| Hardcoded URL cleanup | 1-2 days | Parameterize theaccessible.org, CORS config |
| AI key configuration (SSM) | 0.5 days | Provision parameters in customer account |
| Testing & validation | 3-5 days | End-to-end testing of all flows |
| Total | ~20-30 days | One developer familiar with the codebase |
Recommended Approach
Option A: Abstraction Layer (Recommended if >1 customer)
Build provider interfaces that let the same codebase run on Supabase or pure AWS:
- Database provider: `DatabaseClient` interface with `SupabaseClient` and `PostgresClient` implementations, selected by `DB_PROVIDER` env var (~1 week)
- Auth provider: `AuthProvider` interface with `SupabaseAuth` and `CognitoAuth` implementations, selected by `AUTH_PROVIDER` env var (~1 week)
- Environment template: customer-aws config in CDK, deployment runbook, SSM setup script (~2-3 days)
Total: ~3-4 weeks, then ~2-3 days per new customer instance.
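The env-var selection in Option A might look like the following sketch. The interface and env var names follow the bullet list above; the implementations are stubs standing in for wrappers around supabase-js and a Postgres driver:

```typescript
// Minimal shape of the proposed DatabaseClient abstraction. Real
// implementations would expose query methods; elided here.
interface DatabaseClient {
  kind: "supabase" | "postgres";
}

class SupabaseClientImpl implements DatabaseClient {
  kind = "supabase" as const;
}

class PostgresClientImpl implements DatabaseClient {
  kind = "postgres" as const;
}

// Select the implementation once at startup from DB_PROVIDER, defaulting
// to Supabase so existing deployments keep working unchanged.
function createDatabaseClient(
  env: Record<string, string | undefined>
): DatabaseClient {
  switch (env.DB_PROVIDER ?? "supabase") {
    case "postgres":
      return new PostgresClientImpl();
    case "supabase":
      return new SupabaseClientImpl();
    default:
      throw new Error(`Unknown DB_PROVIDER: ${env.DB_PROVIDER}`);
  }
}
```

The `AuthProvider` selection via `AUTH_PROVIDER` would follow the same factory shape, keeping both branches compiling in one codebase rather than diverging as in Option B.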
Option B: One-Off Fork (If only 1 customer)
Fork the codebase and replace Supabase directly with Cognito + RDS.
Total: ~2-3 weeks, but each customer gets a divergent codebase.
Key Files
| Purpose | Path |
|---|---|
| CDK stack orchestration | infra/cdk/bin/app.ts |
| Environment config | infra/cdk/lib/env-config.ts |
| CDK stacks | infra/cdk/lib/stacks/*.ts |
| AWS CI/CD | .github/workflows/deploy-aws.yml |
| S3/storage abstraction | workers/api/src/providers/node-server.ts |
| Auth middleware | workers/api/src/middleware/auth.ts |
| Supabase client | workers/api/src/utils/supabase.ts |
| Frontend auth | apps/web/src/lib/auth-context.tsx |
| Frontend API layer | apps/web/src/lib/api.ts |
| Database migrations | supabase/migrations/ |
| Docker Compose | docker-compose.yml |
| Env var templates | .env.local.example, .env.node-server.example |