Bring Your Own LLM Keys - Implementation Plan
Goal
Allow a tenant to provide its own AI provider keys so document and image analysis runs against customer-controlled model accounts instead of our shared hosted accounts.
Minimum Product Promise
If a tenant provides at least one supported vision-capable model provider, the core conversion path remains usable without using our hosted AI credentials.
Current State
The stack already supports multiple AI providers by environment variable. The missing work is per-tenant configuration, routing, validation, fallback policy, and observability.
Supported Provider Model
Phase 1 should focus on providers that already exist in the codebase and have stable API usage patterns:
- Google Gemini / Vertex AI
- Anthropic
- OpenAI, if currently supported in active paths
Later phases can add:
- Mistral
- Azure OpenAI
- self-hosted OpenAI-compatible endpoints
Product Design
Settings Surface
Add a new Settings section: AI Providers.
For each provider:
- enabled toggle
- provider type
- API key or service-account credentials
- endpoint override if applicable
- model selection
- vision capability indicator
- optional rate-limit cap
Tenant-level controls:
- default provider order
- hosted fallback: allowed or disabled
Status panel:
- last validation result
- tested models
- vision-capable providers available
- current effective routing order
Functional Requirements
- A tenant can store multiple providers.
- Routing chooses the first healthy provider, in the tenant's configured priority order, that satisfies the task.
- Tasks requiring vision must only use providers marked vision-capable.
- If hosted fallback is disabled and no healthy provider exists, the job fails immediately with a clear tenant-facing error.
Technical Plan
Phase 1 - Tenant Provider Schema
Add tenant-scoped AI provider records with:
- tenant_id
- provider
- label
- credentials_encrypted
- endpoint
- default_model
- supports_vision
- is_enabled
- priority
- validation_status
- validation_error
- validated_at
- created_by
- updated_by
Also add tenant-level policy:
- allow_hosted_ai_fallback
- require_customer_managed_ai
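A minimal sketch of how these records might look as Python dataclasses. The field names come from the plan; the types, the defaults, the `ValidationStatus` values, and the precedence rule in `hosted_fallback_permitted` are assumptions rather than a settled schema.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class ValidationStatus(Enum):
    # Assumed status values; the plan does not enumerate them.
    UNTESTED = "untested"
    PASSED = "passed"
    FAILED = "failed"


@dataclass
class TenantAIProvider:
    tenant_id: str
    provider: str                 # e.g. "gemini", "anthropic", "openai"
    label: str
    credentials_encrypted: bytes  # ciphertext only, never the raw key
    default_model: str
    created_by: str
    updated_by: str
    endpoint: Optional[str] = None
    supports_vision: bool = False
    is_enabled: bool = True
    priority: int = 100           # assumed: lower number is tried first
    validation_status: ValidationStatus = ValidationStatus.UNTESTED
    validation_error: Optional[str] = None
    validated_at: Optional[datetime] = None


@dataclass
class TenantAIPolicy:
    tenant_id: str
    allow_hosted_ai_fallback: bool = True
    require_customer_managed_ai: bool = False

    def hosted_fallback_permitted(self) -> bool:
        # Assumed precedence: requiring customer-managed AI overrides the allow flag.
        return self.allow_hosted_ai_fallback and not self.require_customer_managed_ai
```

Collapsing the two policy flags into one derived check keeps the runtime from having to reason about contradictory settings.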
Phase 2 - Secret Handling
- Encrypt provider credentials at rest.
- Support one-time input and redaction after save.
- Support key rotation without breaking unrelated providers.
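One way the one-time input and redaction requirement could work is sketched below. The `encrypt` callable is a placeholder for whatever encryption service the platform ends up using, and `credential_hint` is a hypothetical extra field that is not part of the schema above.

```python
from typing import Callable


def redact_secret(secret: str, visible: int = 4) -> str:
    """Return a display-safe form of a credential, keeping only the last few characters."""
    if len(secret) <= visible * 2:
        # Too short to reveal any part without exposing most of the key.
        return "*" * 8
    return "*" * 8 + secret[-visible:]


def prepare_credentials(raw_key: str, encrypt: Callable[[bytes], bytes]) -> dict:
    """Build the stored fields for a newly entered key.

    The raw key is used once here and is not returned, so the UI can only
    ever show the hint after save.
    """
    return {
        "credentials_encrypted": encrypt(raw_key.encode("utf-8")),
        "credential_hint": redact_secret(raw_key),
    }
```

Because each provider record carries its own ciphertext, rotating one key means calling `prepare_credentials` again for that record only, which leaves unrelated providers untouched.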
Phase 3 - Validation Service
Each configured provider needs a health test that verifies:
- credential validity
- requested model availability
- vision support if the tenant expects it
- quota / permission sanity
The validation path should not send customer documents. It should use a synthetic health-check prompt or provider metadata endpoint.
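The checks above could be sequenced roughly as follows. This is a sketch under assumptions: `ProbeResult`, `ValidationOutcome`, and the `probe` callable are hypothetical names, and the quota / permission check is left out because it is likely to differ per provider.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProbeResult:
    # What a provider-specific probe reports back; all fields are assumed.
    credentials_ok: bool
    available_models: List[str]
    vision_models: List[str]
    error: Optional[str] = None


@dataclass
class ValidationOutcome:
    status: str                 # "passed" or "failed"
    error: Optional[str] = None


def validate_provider(
    model: str,
    expects_vision: bool,
    probe: Callable[[], ProbeResult],
) -> ValidationOutcome:
    """Run the checks in order and stop at the first failure.

    `probe` stands in for a provider-specific call that uses a synthetic prompt
    or a metadata endpoint; it must never receive customer documents.
    """
    try:
        result = probe()
    except Exception as exc:  # network errors, timeouts, SDK errors
        return ValidationOutcome("failed", f"probe error: {exc}")

    if not result.credentials_ok:
        return ValidationOutcome("failed", result.error or "credentials rejected")
    if model not in result.available_models:
        return ValidationOutcome("failed", f"model not available: {model}")
    if expects_vision and model not in result.vision_models:
        return ValidationOutcome("failed", f"model lacks vision support: {model}")
    return ValidationOutcome("passed")
```

The outcome maps directly onto the `validation_status` and `validation_error` fields of the provider record.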
Phase 4 - Runtime Provider Resolution
Build a tenant-aware provider selection layer:
- load tenant provider policy
- determine task requirements:
  - text only
  - vision required
  - high-context / long-output path
- choose the highest-priority healthy provider
- record which provider actually handled the request
Persist provider choice on the job record for supportability.
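The selection step might be sketched like this. The `Provider` shape, the `healthy` flag, the "lower priority value wins" convention, and the `"hosted"` sentinel are all assumptions; the high-context / long-output requirement is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    label: str
    priority: int          # assumed: lower value is tried first
    supports_vision: bool
    is_enabled: bool = True
    healthy: bool = True   # assumed to come from the latest validation result


def resolve_provider(
    providers: List[Provider],
    needs_vision: bool,
    hosted_fallback_allowed: bool,
) -> Optional[str]:
    """Return the label of the provider to use, "hosted" for fallback, or None."""
    candidates = [
        p for p in providers
        if p.is_enabled and p.healthy and (p.supports_vision or not needs_vision)
    ]
    if candidates:
        return min(candidates, key=lambda p: p.priority).label
    return "hosted" if hosted_fallback_allowed else None
```

A caller would write the returned label onto the job record (for example a hypothetical `provider_used` column) and treat `None` as the "no compliant provider" failure.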
Phase 5 - Failure and Fallback Behavior
Explicitly define:
- whether retries stay on the same provider or fail over
- when hosted fallback is allowed
- what user-facing error appears when no compliant provider is available
Recommended default:
- retry once on same provider for transient failures
- fail over to next healthy tenant provider
- never use hosted fallback unless tenant policy permits it
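The recommended default could look roughly like the sketch below. `TransientError`, `NoCompliantProviderError`, and the `call` callable are hypothetical; how transient failures are actually detected will depend on each provider's error responses.

```python
from typing import Callable, List, Tuple


class TransientError(Exception):
    """Assumed marker for retryable failures such as timeouts or rate limits."""


class NoCompliantProviderError(Exception):
    """Raised when no permitted provider could complete the task."""


def run_with_failover(
    providers: List[str],
    call: Callable[[str], str],
    hosted_fallback_allowed: bool,
    hosted_name: str = "hosted",
) -> Tuple[str, str]:
    """Try each provider in order; return (provider_used, result)."""
    chain = list(providers)
    if hosted_fallback_allowed:
        # Hosted credentials are appended only when tenant policy permits it.
        chain.append(hosted_name)

    for name in chain:
        for _attempt in range(2):  # first try plus one retry
            try:
                return name, call(name)
            except TransientError:
                continue           # retry once on the same provider
            except Exception:
                break              # non-transient: move to the next provider
    raise NoCompliantProviderError("No configured AI provider could process this job.")
```

Returning the provider name alongside the result makes job-level attribution a side effect of the call rather than a separate lookup.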
Phase 6 - Billing and Analytics
If customers bring their own keys, we need a billing policy decision:
- do we still charge for conversion only?
- do we discount hosted AI costs?
- do we expose provider usage counters in the UI?
Even if pricing does not change immediately, we need internal analytics for:
- jobs by provider
- failures by provider
- average latency by provider
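As an illustration, the three internal metrics could be derived from job records along these lines. The `provider`, `ok`, and `latency_ms` keys are assumed field names, not existing columns.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable


def summarize_by_provider(jobs: Iterable[dict]) -> Dict[str, dict]:
    """Aggregate job count, failure count, and mean latency per provider.

    Each job dict is assumed to carry `provider`, `ok`, and `latency_ms` keys.
    """
    grouped = defaultdict(list)
    for job in jobs:
        grouped[job["provider"]].append(job)

    return {
        provider: {
            "jobs": len(items),
            "failures": sum(1 for j in items if not j["ok"]),
            "avg_latency_ms": mean(j["latency_ms"] for j in items),
        }
        for provider, items in grouped.items()
    }
```

In production this would more likely be a query against the job table, but the grouping logic is the same.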
Backend Work Items
- Add tenant AI provider schema and migrations
- Add encrypted credential storage
- Add provider validation service
- Add provider selection abstraction
- Refactor current env-based provider resolution to allow tenant override
- Add job-level provider attribution
- Add audit events for provider config changes
Frontend Work Items
- Add AI Providers settings UI
- Add priority ordering UI
- Add validation, enable/disable, and rotation flows
- Add provider-health status
Documentation Work Items
- Customer guide for supported providers and required permissions
- Support runbook for common validation errors
- Security note describing how provider data is used and not retained
Dependencies
- secret encryption/storage
- tenant settings framework
- provider abstraction cleanup in the conversion pipeline
Risks
- Per-tenant provider routing increases operational complexity.
- Provider-specific edge cases can leak into product behavior.
- Some providers may have inconsistent model naming or permissions APIs.
Acceptance Criteria
- A tenant can add and validate at least one vision-capable provider from Settings.
- A conversion job for that tenant uses the tenant's provider instead of hosted credentials.
- Hosted fallback behavior is enforced exactly as configured.
- Logs and admin tools show which provider handled each job.
Estimated Effort
- Backend and provider abstraction: 5-7 days
- Frontend settings UI: 2-3 days
- Validation, testing, runbooks: 2-3 days
- Total: 9-13 days