Skip to content

Tax Exemption AI Review Pipeline

Automated extraction + decision engine that runs on every uploaded tax exemption certificate, producing a green/yellow/red recommendation that admins can accept, override, or ignore. Shadow mode by default β€” humans remain the source of truth until EXEMPTION_AUTO_APPROVE=true is flipped.

Source of truth for the pipeline lives in packages/exemption-review/ (framework-agnostic) with Supabase/R2/Anthropic adapters in workers/api/src/services/exemption-review/adapters.ts.

Architecture

upload β†’ extract (Claude vision) β†’ verify (state DB) β†’ decide (rules)
β”‚ β”‚ β”‚
β–Ό β–Ό β–Ό
tax_certificate_ tax_certificate_permit_ tax_certificates
extractions verifications (decision_tier,
(raw_response, (state DB match, decision_reason,
tokens, facts) lookup_source) ai_confidence)
β”‚
β–Ό
tax_certificate_review_log (append-only event trail)
β”‚
β–Ό
cost_ledger (estimated USD per Anthropic call)

Decision tiers

  • Green β€” all rules pass (confidence β‰₯ threshold, name match, state permit active, no red flags). Eligible for auto-approval.
  • Yellow β€” ambiguous. Always routed to human review regardless of EXEMPTION_AUTO_APPROVE.
  • Red β€” hard fail (forgery detected, expired cert, name mismatch, invalid document type). Eligible for auto-rejection.

Configuration (env vars)

VariableDefaultEffect
EXEMPTION_AUTO_APPROVEfalseWhen true, green β†’ auto-approved, red β†’ auto-rejected. Yellow still routes to humans. Shadow mode otherwise.
EXEMPTION_NEXUS_STATESTXComma-separated 2-letter state codes where we have sales-tax nexus. Unverifiable permits in nexus states are riskier.
EXEMPTION_MODELclaude-sonnet-4-6Anthropic model id. Must exist in workers/api/src/utils/llm-cost.ts pricing table or cost tracking falls back to Sonnet pricing.
ANTHROPIC_API_KEYrequiredPipeline is skipped entirely (with warning) if unset.

Data model

Tables

  • tax_certificates β€” canonical record. Gained these columns in migration 067: decision_tier, decision_reason, ai_reviewed_at, ai_confidence, reviewed_by_kind. Not subject to the 90-day purge.
  • tax_certificate_extractions β€” append-only. One row per extraction attempt. Stores the full raw_response JSON plus parsed fields. Contains PII: buyer name, tax ID/EIN, address.
  • tax_certificate_permit_verifications β€” append-only. Result of state-database lookups (e.g. Texas Comptroller search).
  • tax_certificate_review_log β€” append-only event stream (extraction_started, decided_green, auto_approved, etc.).

Row-level security

Migration 20260409_069_tax_exemption_hardening.sql enables RLS on the three sidecar tables with admin-only SELECT:

EXISTS (
SELECT 1 FROM public.profiles
WHERE profiles.id = auth.uid() AND profiles.role = 'admin'
)
  • Service role (pipeline, adapters, purge job) bypasses RLS automatically β€” no explicit policy needed for writes.
  • Admin dashboard uses an authenticated session with profiles.role = 'admin' and can SELECT via the policy.
  • Regular users cannot read any row from the sidecar tables, even their own certificate’s extraction. The admin dashboard is the only user-facing surface.

Before adding any new endpoint that reads these tables from a non-admin context, verify the RLS policy still matches β€” or the table will refuse the query silently.

Cost tracking

Every Anthropic call in the pipeline records to cost_ledger with operation_type = 'tax-exemption-extraction'. The pricing table in workers/api/src/utils/llm-cost.ts converts tokens β†’ USD; the package itself is infrastructure-agnostic and emits CostEvents through a callback wired by adapters.ts.

Parse failures still incur cost and are still recorded β€” the metadata column includes parseFailed: true so you can distinguish successful vs failed extractions in reports.

Reporting queries

Total spend last 30 days:

SELECT
SUM(estimated_cost_usd) AS total_usd,
COUNT(*) AS extractions,
SUM(CASE WHEN (metadata->>'parseFailed')::boolean THEN 1 ELSE 0 END) AS parse_failures
FROM public.cost_ledger
WHERE operation_type = 'tax-exemption-extraction'
AND created_at >= NOW() - INTERVAL '30 days';

Per-certificate cost (joins against metadata):

SELECT
metadata->>'certificateId' AS certificate_id,
COUNT(*) AS api_calls,
SUM(estimated_cost_usd) AS cost_usd
FROM public.cost_ledger
WHERE operation_type = 'tax-exemption-extraction'
GROUP BY metadata->>'certificateId'
ORDER BY cost_usd DESC
LIMIT 50;

Reuse the global get_cost_summary RPC for admin dashboard integration:

SELECT * FROM get_cost_summary(
p_start_date => NOW() - INTERVAL '30 days',
p_group_by => 'operation'
);

The tax-exemption-extraction row is now included alongside other operation types.

90-day retention purge

Policy

Extraction rows (containing PII) are retained for 90 days from extracted_at. Permit verifications and review-log entries are purged on the same schedule so sidecar data stays consistent with the extractions window. The canonical tax_certificates row is NOT touched β€” it has its own retention driven by commerce/tax compliance rules.

Purge function

SELECT * FROM public.purge_tax_exemption_data(); -- 90 days (default)
SELECT * FROM public.purge_tax_exemption_data(30); -- override window

Returns per-table deletion counts:

extractions_deleted | permit_verifications_deleted | review_log_deleted
---------------------+------------------------------+--------------------
142 | 87 | 503

The function is SECURITY DEFINER, GRANT EXECUTE to service_role only. Admin dashboard users cannot run it directly β€” they invoke it via a service-role RPC from the API layer.

Scheduling

Pick one:

  1. pg_cron (preferred if the extension is installed on the Supabase instance):

    SELECT cron.schedule(
    'purge-tax-exemption-data-nightly',
    '15 2 * * *', -- 02:15 UTC nightly
    $$ SELECT public.purge_tax_exemption_data(); $$
    );
  2. Ops cron via the nightly audit script (~/.claude/scripts/nightly-audit.sh): add a step that calls the Supabase REST endpoint for the function using the service-role key. Log the returned counts to the audit email so drift is visible.

  3. Manual β€” run the SQL above weekly during rollout, switch to #1 or #2 before auto-approve is enabled at scale.

Verification

Confirm the purge is running:

-- Oldest row should be ≀ 90 days
SELECT MIN(extracted_at) AS oldest,
NOW() - MIN(extracted_at) AS age
FROM public.tax_certificate_extractions;

If age exceeds 91 days, the purge is not running β€” investigate before PII accumulates.

Operational runbook

Turning auto-approve on/off

Terminal window
# Shadow mode (default):
EXEMPTION_AUTO_APPROVE=false
# Full auto:
EXEMPTION_AUTO_APPROVE=true

Before flipping to true:

  1. Review at least 100 shadow-mode decisions and confirm the green/red tier distribution matches human decisions β‰₯ 95% of the time.
  2. Confirm the 90-day purge is scheduled (see above).
  3. Confirm cost tracking is visible in the admin cost dashboard.
  4. Review the most recent prompt-injection red flags in the review_log (event = 'extraction_succeeded' with red_flags containing prompt_injection_attempt).

Re-running extraction for a certificate

Extractions are append-only β€” the pipeline always reads the most recent row per certificate_id. To force a re-run (e.g. after a prompt change), invoke the admin endpoint or call runReview(certificateId, deps) directly. Do NOT manipulate tax_certificate_extractions rows by hand; add a new row via the pipeline so the audit trail stays clean.

Debugging a specific certificate

-- Latest extraction + facts
SELECT extracted_at, model, confidence, parse_error,
document_type, buyer_entity_name, buyer_tax_id, red_flags
FROM public.tax_certificate_extractions
WHERE certificate_id = 'CERT_ID_HERE'
ORDER BY extracted_at DESC
LIMIT 1;
-- Full event timeline
SELECT created_at, event, actor, details
FROM public.tax_certificate_review_log
WHERE certificate_id = 'CERT_ID_HERE'
ORDER BY created_at DESC;
-- Costs incurred
SELECT created_at, model, input_tokens, output_tokens,
estimated_cost_usd, metadata
FROM public.cost_ledger
WHERE operation_type = 'tax-exemption-extraction'
AND metadata->>'certificateId' = 'CERT_ID_HERE'
ORDER BY created_at DESC;

Known limitations

  • Texas-only state verification. verifiers.ts currently scrapes only comptroller.texas.gov. Other states fall through to lookup_source = 'unavailable' and the decision engine treats that as yellow-tier in nexus states.
  • Prompt injection mitigations are defense in depth, not absolute. The SYSTEM_PROMPT instructs the model to treat document contents as untrusted data and the USER_PROMPT block is placed before documents, but auto-approve should always be gated on the state DB match, not solely on model-reported confidence. See packages/exemption-review/src/extract.ts for the current prompt.
  • Shadow-mode backfill. Flipping EXEMPTION_AUTO_APPROVE=true does NOT retroactively apply decisions to certificates reviewed in shadow mode β€” it only affects new uploads. Run a manual re-review if you need to apply the new policy to existing needs_review certs.