Data Privacy Features - Implementation Plan

This document is the implementation index for the privacy features currently described as planned or coming soon on the marketing site. It exists so product, engineering, and customer-facing teams have one place to track what must be built before those claims can be presented as shipped product.

Features Covered

  1. Customer-managed storage via S3-compatible credentials entered in Settings
  2. Customer-managed LLM keys for supported providers, including at least one vision-capable model
  3. Customer-managed AWS deployment using our CDK stacks and a coordination layer
  4. Customer-managed Docker deployment packaging the stack as an installable self-hosted product

Deliverable Documents

  • docs/admin/s3-customer-storage-implementation-plan.md
  • docs/admin/llm-bring-your-own-keys-plan.md
  • docs/admin/customer-aws-stack-implementation-plan.md
  • docs/admin/docker-self-hosted-implementation-plan.md

Cross-Cutting Principles

All four features should follow the same operating principles:

  • Privacy-sensitive configuration must be tenant-scoped and encrypted at rest.
  • The UI must clearly distinguish hosted default behavior from customer-managed behavior.
  • Failures must be explicit. If a customer-managed dependency is misconfigured, we should fail closed with actionable errors instead of silently falling back to hosted infrastructure.
  • Auditability matters. Every sensitive configuration change should be attributable to a user, timestamp, and tenant.
  • The product must continue to support the current hosted default path for customers who do not opt into these features.
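
The fail-closed principle above can be illustrated with a minimal Python sketch. Everything here is hypothetical: `StorageConfig`, `CustomerDependencyError`, `resolve_storage_target`, and the `"hosted-default"` label are invented for illustration and do not describe existing code.

```python
from dataclasses import dataclass
from typing import Optional


class CustomerDependencyError(Exception):
    """Raised when a customer-managed dependency cannot be used."""


@dataclass(frozen=True)
class StorageConfig:
    # Hypothetical shape; field names are assumptions for this sketch.
    tenant_id: str
    customer_managed: bool
    bucket: Optional[str] = None
    validated: bool = False


def resolve_storage_target(config: StorageConfig) -> str:
    """Pick the storage target for a tenant without silent fallback."""
    if not config.customer_managed:
        # Tenants who never opted in keep the hosted default path.
        return "hosted-default"
    if not config.bucket:
        raise CustomerDependencyError(
            f"Tenant {config.tenant_id}: customer storage is enabled but no "
            "bucket is configured. Add a bucket in Settings."
        )
    if not config.validated:
        raise CustomerDependencyError(
            f"Tenant {config.tenant_id}: bucket {config.bucket!r} has not "
            "passed validation. Re-run the storage check in Settings."
        )
    return config.bucket
```

The point of the sketch is that an opted-in tenant never reaches the `"hosted-default"` branch: a broken configuration raises an error the UI can surface instead of routing data to hosted infrastructure.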

Shared Platform Work

These items are needed across multiple plans:

1. Tenant capability flags

We need feature flags or entitlements at the tenant level so we can selectively enable:

  • customer storage
  • bring-your-own AI keys
  • dedicated AWS deployment
  • self-hosted / Docker distribution
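
One possible shape for these tenant-level flags is sketched below in Python. The `Capability` values mirror the four items above; the class names and the `require` helper are assumptions made for illustration, not an agreed design.

```python
from dataclasses import dataclass, field
from enum import Enum


class Capability(Enum):
    CUSTOMER_STORAGE = "customer_storage"
    BYO_AI_KEYS = "byo_ai_keys"
    DEDICATED_AWS = "dedicated_aws"
    SELF_HOSTED = "self_hosted"


@dataclass
class TenantEntitlements:
    # Hypothetical record; a real system would load this per tenant.
    tenant_id: str
    enabled: set[Capability] = field(default_factory=set)

    def allows(self, capability: Capability) -> bool:
        return capability in self.enabled

    def require(self, capability: Capability) -> None:
        """Raise if the tenant is not entitled to the capability."""
        if not self.allows(capability):
            raise PermissionError(
                f"Tenant {self.tenant_id} is not entitled to {capability.value}"
            )
```

A settings endpoint could call `require(Capability.CUSTOMER_STORAGE)` before accepting bucket credentials, so the flag is enforced on the server as well as in the UI.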

2. Secrets management model

We need one consistent pattern for secrets across hosted and customer-dedicated deployments:

  • hosted SaaS: encrypted secret records in our database or an external secret manager
  • customer AWS: AWS Secrets Manager or SSM Parameter Store
  • Docker/on-prem: local encrypted secret file or mounted secret store
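
A single pattern across the three environments could take the form of one small interface with a backend per deployment type. The Python sketch below assumes a hypothetical `SecretStore` protocol and the secret name `"llm_api_key"`; the in-memory class is a test double only and performs no encryption.

```python
from typing import Protocol


class SecretStore(Protocol):
    """Hypothetical interface each deployment type would implement."""

    def get(self, tenant_id: str, name: str) -> str: ...

    def put(self, tenant_id: str, name: str, value: str) -> None: ...


class InMemorySecretStore:
    """Test double only. Real backends would wrap encrypted database
    records, AWS Secrets Manager or SSM, or a mounted secret store."""

    def __init__(self) -> None:
        self._data: dict[tuple[str, str], str] = {}

    def get(self, tenant_id: str, name: str) -> str:
        try:
            return self._data[(tenant_id, name)]
        except KeyError:
            raise KeyError(f"no secret {name!r} for tenant {tenant_id!r}") from None

    def put(self, tenant_id: str, name: str, value: str) -> None:
        # Keys include the tenant id so secrets stay tenant-scoped.
        self._data[(tenant_id, name)] = value


def load_llm_key(store: SecretStore, tenant_id: str) -> str:
    # Calling code depends only on the interface, not on the backend.
    return store.get(tenant_id, "llm_api_key")
```

Because callers depend only on `SecretStore`, the same application code could run in hosted, customer AWS, and Docker deployments with a different backend selected at startup.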

3. Provider health-check framework

We need reusable health checks for external dependencies:

  • storage bucket write/read/delete validation
  • AI provider auth test
  • coordination-layer heartbeat
  • worker health and queue connectivity
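
As a rough illustration of what "reusable" might mean, the Python sketch below implements the storage write/read/delete check against a hypothetical `ObjectStore` interface and returns a uniform `HealthResult`. All names are assumptions; a real check would wrap an S3-compatible client behind the same three methods.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal storage interface used by the check."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...

    def delete(self, key: str) -> None: ...


@dataclass(frozen=True)
class HealthResult:
    name: str
    ok: bool
    error: Optional[str] = None


def check_storage_round_trip(store: ObjectStore) -> HealthResult:
    """Write, read back, and delete a probe object."""
    key = f"healthcheck/{uuid.uuid4().hex}"
    payload = b"ping"
    try:
        store.put(key, payload)
        echoed = store.get(key)
        store.delete(key)
    except Exception as exc:  # report any provider failure as a result
        return HealthResult("storage", False, f"{type(exc).__name__}: {exc}")
    if echoed != payload:
        return HealthResult("storage", False, "read returned different bytes than written")
    return HealthResult("storage", True)


def run_checks(checks: list[Callable[[], HealthResult]]) -> list[HealthResult]:
    # Every check shares one signature, so new dependencies plug in easily.
    return [check() for check in checks]
```

An AI provider auth test or a coordination-layer heartbeat could follow the same pattern: a zero-argument callable that returns a `HealthResult`, which the admin surface can then display without knowing which dependency produced it.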

4. Admin and audit surfaces

We need a common admin pattern for:

  • current status
  • last validated time
  • last error
  • who changed configuration
  • whether hosted fallback is disabled
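
The five fields above map naturally onto one status record per tenant and integration. The Python sketch below is one hypothetical shape; the field names, the status strings, and `record_validation` are assumptions for illustration.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class IntegrationStatus:
    tenant_id: str
    integration: str                       # e.g. "customer_storage"
    status: str                            # "healthy", "failing", or "unvalidated"
    last_validated_at: Optional[datetime]
    last_error: Optional[str]
    changed_by: str                        # user who last changed the configuration
    changed_at: datetime
    hosted_fallback_disabled: bool


def record_validation(
    current: IntegrationStatus,
    ok: bool,
    error: Optional[str] = None,
    now: Optional[datetime] = None,
) -> IntegrationStatus:
    """Return a copy updated with the outcome of a validation run."""
    checked_at = now or datetime.now(timezone.utc)
    return replace(
        current,
        status="healthy" if ok else "failing",
        last_validated_at=checked_at,
        last_error=None if ok else (error or "unknown error"),
    )
```

Validation updates leave `changed_by` and `changed_at` untouched, which keeps "who changed configuration" separate from "when was it last checked".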

Suggested Delivery Order

  1. S3 customer storage. This is the smallest privacy-control step and unblocks stronger “we do not retain documents after processing” stories.

  2. Bring your own LLM keys. This is the next smallest hosted-SaaS control and can share the same settings, validation, and secret-management patterns.

  3. Customer AWS stack. This is the main enterprise privacy offering. It requires the most platform rigor and should be built after the hosted control surfaces exist.

  4. Docker self-hosted product. This is effectively productizing the self-hosted path. It should reuse as much AWS and on-prem work as possible.

Exit Criteria For Marketing Claims

We should not move a feature from “planned” to “available” until all of the following are true:

  • the feature is usable by at least one real tenant without engineering intervention
  • there is admin documentation and a support runbook
  • error handling is understandable from the product UI
  • the feature is covered by automated tests plus one end-to-end validation path
  • support and sales know the limitations and prerequisites