Data Privacy Features - Implementation Plan
This document is the implementation index for the privacy features currently described as planned or coming soon on the marketing site. It gives product, engineering, and customer-facing teams one place to track what must be built before those claims describe shipped, supported features.
Features Covered
- Customer-managed storage via S3-compatible credentials entered in Settings
- Customer-managed LLM keys for supported providers, including at least one vision-capable model
- Customer-managed AWS deployment using our CDK stacks and a coordination layer
- Customer-managed Docker deployment packaging the stack as an installable self-hosted product
Deliverable Documents
- docs/admin/s3-customer-storage-implementation-plan.md
- docs/admin/llm-bring-your-own-keys-plan.md
- docs/admin/customer-aws-stack-implementation-plan.md
- docs/admin/docker-self-hosted-implementation-plan.md
Cross-Cutting Principles
All four features should follow the same operating principles:
- Privacy-sensitive configuration must be tenant-scoped and encrypted at rest.
- The UI must clearly distinguish hosted default behavior from customer-managed behavior.
- Failures must be explicit. If a customer-managed dependency is misconfigured, we should fail closed with actionable errors instead of silently falling back to hosted infrastructure.
- Auditability matters. Every sensitive configuration change should be attributable to a user, timestamp, and tenant.
- The product must continue to support the current hosted default path for customers who do not opt into these features.
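The fail-closed principle can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `StorageSettings` fields, the `HOSTED_BUCKET` name, and the `CustomerDependencyError` type are hypothetical and do not describe the existing codebase. The point is only that a tenant who has opted into customer-managed storage gets an actionable error rather than a silent fallback to hosted infrastructure.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical name for the hosted default bucket.
HOSTED_BUCKET = "hosted-default-bucket"


class CustomerDependencyError(Exception):
    """Raised when a customer-managed dependency is missing or unvalidated."""


@dataclass(frozen=True)
class StorageSettings:
    # Illustrative per-tenant settings; these field names are assumptions.
    tenant_id: str
    customer_managed: bool
    bucket: Optional[str] = None
    last_validation_ok: bool = False


def resolve_storage_bucket(settings: StorageSettings) -> str:
    """Pick the bucket for a tenant without ever falling back silently."""
    if not settings.customer_managed:
        # Tenant has not opted in, so the hosted default path still applies.
        return HOSTED_BUCKET
    if not settings.bucket:
        raise CustomerDependencyError(
            f"Tenant {settings.tenant_id}: customer storage is enabled but no "
            "bucket is configured. Add one in Settings."
        )
    if not settings.last_validation_ok:
        raise CustomerDependencyError(
            f"Tenant {settings.tenant_id}: bucket '{settings.bucket}' failed its "
            "last validation. Re-check the credentials in Settings."
        )
    return settings.bucket
```

Raising a dedicated exception type lets the UI layer map the failure to a specific, user-facing message instead of a generic error.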
Shared Platform Work
These items are needed across multiple plans:
1. Tenant capability flags
We need feature flags or entitlements at the tenant level so we can selectively enable:
- customer storage
- bring-your-own AI keys
- dedicated AWS deployment
- self-hosted / Docker distribution
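One possible shape for tenant-level entitlements is sketched below. The `Capability` values mirror the four items above, but the names and the in-memory `TenantEntitlements` class are hypothetical; a real implementation would presumably persist this in the database or an entitlement service.

```python
from enum import Enum
from typing import Dict, Set


class Capability(Enum):
    # One member per capability listed above; identifiers are illustrative.
    CUSTOMER_STORAGE = "customer_storage"
    BYO_AI_KEYS = "byo_ai_keys"
    DEDICATED_AWS = "dedicated_aws"
    SELF_HOSTED = "self_hosted"


class TenantEntitlements:
    """In-memory stand-in for a tenant entitlement store (assumption)."""

    def __init__(self) -> None:
        self._enabled: Dict[str, Set[Capability]] = {}

    def enable(self, tenant_id: str, capability: Capability) -> None:
        self._enabled.setdefault(tenant_id, set()).add(capability)

    def disable(self, tenant_id: str, capability: Capability) -> None:
        self._enabled.get(tenant_id, set()).discard(capability)

    def is_enabled(self, tenant_id: str, capability: Capability) -> bool:
        # Unknown tenants have nothing enabled, so the hosted default applies.
        return capability in self._enabled.get(tenant_id, set())
```

Defaulting to "not enabled" for unknown tenants keeps the current hosted path unchanged for customers who never opt in.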
2. Secrets management model
We need one consistent pattern for secrets across hosted and customer-dedicated deployments:
- hosted SaaS: encrypted secret records in our database or external secret manager
- customer AWS: AWS Secrets Manager or SSM Parameter Store
- Docker/on-prem: local encrypted secret file or mounted secret store
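A consistent pattern could take the form of a single interface with one backend per deployment mode. The sketch below is an assumption, not the planned design: `SecretStore` and `InMemorySecretStore` are hypothetical names, and the in-memory class is a test stand-in that does no encryption. Real backends (encrypted database records, AWS Secrets Manager or SSM Parameter Store, a local encrypted file) would implement the same two methods.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class SecretNotFound(KeyError):
    """Raised when a tenant has no secret stored under the given name."""


class SecretStore(ABC):
    """Common interface every deployment-specific backend would implement."""

    @abstractmethod
    def put(self, tenant_id: str, name: str, value: str) -> None:
        ...

    @abstractmethod
    def get(self, tenant_id: str, name: str) -> str:
        ...


class InMemorySecretStore(SecretStore):
    """Test stand-in only: keeps values in plain memory, with NO encryption."""

    def __init__(self) -> None:
        self._values: Dict[Tuple[str, str], str] = {}

    def put(self, tenant_id: str, name: str, value: str) -> None:
        # Keys include the tenant ID so secrets stay tenant-scoped.
        self._values[(tenant_id, name)] = value

    def get(self, tenant_id: str, name: str) -> str:
        try:
            return self._values[(tenant_id, name)]
        except KeyError:
            raise SecretNotFound(f"{tenant_id}/{name}") from None
```

Callers depend only on `SecretStore`, so the same application code can run in hosted, customer AWS, and Docker deployments.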
3. Provider health-check framework
We need reusable health checks for external dependencies:
- storage bucket write/read/delete validation
- AI provider auth test
- coordination-layer heartbeat
- worker health and queue connectivity
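A reusable check could be as small as a runner that wraps any probe and returns a uniform result. The sketch below assumes hypothetical names (`HealthResult`, `run_check`, `storage_round_trip`) and uses a plain dictionary as a stand-in for a bucket, so it exercises the write/read/delete sequence without calling any real storage API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, MutableMapping, Optional


@dataclass(frozen=True)
class HealthResult:
    name: str
    ok: bool
    checked_at: datetime
    error: Optional[str] = None


def run_check(name: str, probe: Callable[[], None]) -> HealthResult:
    """Run a probe and convert any exception into a failed result."""
    now = datetime.now(timezone.utc)
    try:
        probe()
    except Exception as exc:
        return HealthResult(name, False, now, f"{type(exc).__name__}: {exc}")
    return HealthResult(name, True, now)


def storage_round_trip(bucket: MutableMapping[str, bytes]) -> Callable[[], None]:
    """Build a write/read/delete probe against a dict-like bucket stand-in."""

    def probe() -> None:
        key, payload = "healthcheck/probe", b"ping"
        bucket[key] = payload          # write
        if bucket[key] != payload:     # read back and compare
            raise RuntimeError("read-back mismatch")
        del bucket[key]                # delete, leaving no residue

    return probe
```

An AI provider auth test or a queue connectivity check would plug into `run_check` the same way, as a zero-argument callable that raises on failure.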
4. Admin and audit surfaces
We need a common admin pattern for:
- current status
- last validated time
- last error
- who changed configuration
- whether hosted fallback is disabled
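The fields above could be carried by one status record per tenant and integration, with configuration changes appended to an audit log. This is a sketch under assumptions: `IntegrationStatus`, `AuditEvent`, and their field names are hypothetical and chosen only to mirror the list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditEvent:
    # Each change is attributable to a tenant, a user, and a timestamp.
    tenant_id: str
    user_id: str
    action: str
    at: datetime


@dataclass
class IntegrationStatus:
    tenant_id: str
    status: str = "unconfigured"                  # current status
    last_validated_at: Optional[datetime] = None  # last validated time
    last_error: Optional[str] = None              # last error
    changed_by: Optional[str] = None              # who changed configuration
    hosted_fallback_disabled: bool = False        # whether fallback is disabled
    audit_log: List[AuditEvent] = field(default_factory=list)

    def record_change(self, user_id: str, action: str) -> AuditEvent:
        """Append an audit event and remember who made the latest change."""
        event = AuditEvent(
            self.tenant_id, user_id, action, datetime.now(timezone.utc)
        )
        self.changed_by = user_id
        self.audit_log.append(event)
        return event
```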
Suggested Delivery Order
1. S3 customer storage: This is the smallest privacy-control step and unblocks stronger "we do not retain documents after processing" stories.
2. Bring your own LLM keys: This is the next smallest hosted-SaaS control and can share the same settings, validation, and secret-management patterns.
3. Customer AWS stack: This is the main enterprise privacy offering. It requires the most platform rigor and should be built after the hosted control surfaces exist.
4. Docker self-hosted product: This is effectively productizing the self-hosted path. It should reuse as much AWS and on-prem work as possible.
Exit Criteria For Marketing Claims
We should not move a feature from "planned" to "available" until all of the following are true:
- the feature is usable by at least one real tenant without engineering intervention
- there is admin documentation and a support runbook
- error handling is understandable from the product UI
- the feature is covered by automated tests plus one end-to-end validation path
- support and sales know the limitations and prerequisites