Tax Exemption AI Review Pipeline

Automated extraction + decision engine that runs on every uploaded tax exemption certificate, producing a green/yellow/red recommendation that admins can accept, override, or ignore. Shadow mode by default — humans remain the source of truth until EXEMPTION_AUTO_APPROVE=true is flipped.

Source of truth for the pipeline lives in packages/exemption-review/ (framework-agnostic) with Supabase/R2/Anthropic adapters in workers/api/src/services/exemption-review/adapters.ts.

Architecture

upload → extract (Claude vision) → verify (state DB) → decide (rules)
              │                           │                  │
              ▼                           ▼                  ▼
    tax_certificate_       tax_certificate_permit_   tax_certificates
    extractions            verifications              (decision_tier,
    (raw_response,         (state DB match,            decision_reason,
     tokens, facts)         lookup_source)              ai_confidence)
              │
              ▼
    tax_certificate_review_log (append-only event trail)
              │
              ▼
    cost_ledger (estimated USD per Anthropic call)

Decision tiers

Green — all rules pass (confidence ≥ threshold, name match, state permit active, no red flags). Eligible for auto-approval.
Yellow — ambiguous. Always routed to human review regardless of EXEMPTION_AUTO_APPROVE.
Red — hard fail (forgery detected, expired cert, name mismatch, invalid document type). Eligible for auto-rejection.

Configuration (env vars)

Variable	Default	Effect
`EXEMPTION_AUTO_APPROVE`	`false`	When `true`, green → auto-approved, red → auto-rejected. Yellow still routes to humans. Shadow mode otherwise.
`EXEMPTION_NEXUS_STATES`	`TX`	Comma-separated 2-letter state codes where we have sales-tax nexus. Unverifiable permits in nexus states are riskier.
`EXEMPTION_MODEL`	`claude-sonnet-4-6`	Anthropic model id. Must exist in `workers/api/src/utils/llm-cost.ts` pricing table or cost tracking falls back to Sonnet pricing.
`ANTHROPIC_API_KEY`	required	Pipeline is skipped entirely (with warning) if unset.

Data model

Tables

tax_certificates — canonical record. Gained these columns in migration 067: decision_tier, decision_reason, ai_reviewed_at, ai_confidence, reviewed_by_kind. Not subject to the 90-day purge.
tax_certificate_extractions — append-only. One row per extraction attempt. Stores the full raw_response JSON plus parsed fields. Contains PII: buyer name, tax ID/EIN, address.
tax_certificate_permit_verifications — append-only. Result of state-database lookups (e.g. Texas Comptroller search).
tax_certificate_review_log — append-only event stream (extraction_started, decided_green, auto_approved, etc.).

Row-level security

Migration 20260409_069_tax_exemption_hardening.sql enables RLS on the three sidecar tables with admin-only SELECT:

EXISTS (
  SELECT 1 FROM public.profiles
  WHERE profiles.id = auth.uid() AND profiles.role = 'admin'
)

Service role (pipeline, adapters, purge job) bypasses RLS automatically — no explicit policy needed for writes.
Admin dashboard uses an authenticated session with profiles.role = 'admin' and can SELECT via the policy.
Regular users cannot read any row from the sidecar tables, even their own certificate’s extraction. The admin dashboard is the only user-facing surface.

Before adding any new endpoint that reads these tables from a non-admin context, verify the RLS policy still matches — or the table will refuse the query silently.

Cost tracking

Every Anthropic call in the pipeline records to cost_ledger with operation_type = 'tax-exemption-extraction'. The pricing table in workers/api/src/utils/llm-cost.ts converts tokens → USD; the package itself is infrastructure-agnostic and emits CostEvents through a callback wired by adapters.ts.

Parse failures still incur cost and are still recorded — the metadata column includes parseFailed: true so you can distinguish successful vs failed extractions in reports.

Reporting queries

Total spend last 30 days:

SELECT
  SUM(estimated_cost_usd) AS total_usd,
  COUNT(*) AS extractions,
  SUM(CASE WHEN (metadata->>'parseFailed')::boolean THEN 1 ELSE 0 END) AS parse_failures
FROM public.cost_ledger
WHERE operation_type = 'tax-exemption-extraction'
  AND created_at >= NOW() - INTERVAL '30 days';

Per-certificate cost (joins against metadata):

SELECT
  metadata->>'certificateId' AS certificate_id,
  COUNT(*) AS api_calls,
  SUM(estimated_cost_usd) AS cost_usd
FROM public.cost_ledger
WHERE operation_type = 'tax-exemption-extraction'
GROUP BY metadata->>'certificateId'
ORDER BY cost_usd DESC
LIMIT 50;

Reuse the global get_cost_summary RPC for admin dashboard integration:

SELECT * FROM get_cost_summary(
  p_start_date => NOW() - INTERVAL '30 days',
  p_group_by => 'operation'
);

The tax-exemption-extraction row is now included alongside other operation types.

90-day retention purge

Policy

Extraction rows (containing PII) are retained for 90 days from extracted_at. Permit verifications and review-log entries are purged on the same schedule so sidecar data stays consistent with the extractions window. The canonical tax_certificates row is NOT touched — it has its own retention driven by commerce/tax compliance rules.

Purge function

SELECT * FROM public.purge_tax_exemption_data();            -- 90 days (default)
SELECT * FROM public.purge_tax_exemption_data(30);          -- override window

Returns per-table deletion counts:

 extractions_deleted | permit_verifications_deleted | review_log_deleted
---------------------+------------------------------+--------------------
                 142 |                           87 |               503

The function is SECURITY DEFINER, GRANT EXECUTE to service_role only. Admin dashboard users cannot run it directly — they invoke it via a service-role RPC from the API layer.

Scheduling

Pick one:

pg_cron (preferred if the extension is installed on the Supabase instance):

SELECT cron.schedule(
  'purge-tax-exemption-data-nightly',
  '15 2 * * *',  -- 02:15 UTC nightly
  $$ SELECT public.purge_tax_exemption_data(); $$
);

Ops cron via the nightly audit script (~/.claude/scripts/nightly-audit.sh): add a step that calls the Supabase REST endpoint for the function using the service-role key. Log the returned counts to the audit email so drift is visible.
Manual — run the SQL above weekly during rollout, switch to #1 or #2 before auto-approve is enabled at scale.

Verification

Confirm the purge is running:

-- Oldest row should be ≤ 90 days
SELECT MIN(extracted_at) AS oldest,
       NOW() - MIN(extracted_at) AS age
FROM public.tax_certificate_extractions;

If age exceeds 91 days, the purge is not running — investigate before PII accumulates.

Operational runbook

Turning auto-approve on/off

# Shadow mode (default):
EXEMPTION_AUTO_APPROVE=false

# Full auto:
EXEMPTION_AUTO_APPROVE=true

Before flipping to true:

Review at least 100 shadow-mode decisions and confirm the green/red tier distribution matches human decisions ≥ 95% of the time.
Confirm the 90-day purge is scheduled (see above).
Confirm cost tracking is visible in the admin cost dashboard.
Review the most recent prompt-injection red flags in the review_log (event = 'extraction_succeeded' with red_flags containing prompt_injection_attempt).

Re-running extraction for a certificate

Extractions are append-only — the pipeline always reads the most recent row per certificate_id. To force a re-run (e.g. after a prompt change), invoke the admin endpoint or call runReview(certificateId, deps) directly. Do NOT manipulate tax_certificate_extractions rows by hand; add a new row via the pipeline so the audit trail stays clean.

Debugging a specific certificate

-- Latest extraction + facts
SELECT extracted_at, model, confidence, parse_error,
       document_type, buyer_entity_name, buyer_tax_id, red_flags
FROM public.tax_certificate_extractions
WHERE certificate_id = 'CERT_ID_HERE'
ORDER BY extracted_at DESC
LIMIT 1;

-- Full event timeline
SELECT created_at, event, actor, details
FROM public.tax_certificate_review_log
WHERE certificate_id = 'CERT_ID_HERE'
ORDER BY created_at DESC;

-- Costs incurred
SELECT created_at, model, input_tokens, output_tokens,
       estimated_cost_usd, metadata
FROM public.cost_ledger
WHERE operation_type = 'tax-exemption-extraction'
  AND metadata->>'certificateId' = 'CERT_ID_HERE'
ORDER BY created_at DESC;

Known limitations

Texas-only state verification. verifiers.ts currently scrapes only comptroller.texas.gov. Other states fall through to lookup_source = 'unavailable' and the decision engine treats that as yellow-tier in nexus states.
Prompt injection mitigations are defense in depth, not absolute. The SYSTEM_PROMPT instructs the model to treat document contents as untrusted data and the USER_PROMPT block is placed before documents, but auto-approve should always be gated on the state DB match, not solely on model-reported confidence. See packages/exemption-review/src/extract.ts for the current prompt.
Shadow-mode backfill. Flipping EXEMPTION_AUTO_APPROVE=true does NOT retroactively apply decisions to certificates reviewed in shadow mode — it only affects new uploads. Run a manual re-review if you need to apply the new policy to existing needs_review certs.