Tax Exemption AI Review Pipeline
Automated extraction + decision engine that runs on every uploaded tax
exemption certificate, producing a green/yellow/red recommendation that
admins can accept, override, or ignore. Shadow mode by default β humans
remain the source of truth until EXEMPTION_AUTO_APPROVE=true is flipped.
Source of truth for the pipeline lives in packages/exemption-review/
(framework-agnostic) with Supabase/R2/Anthropic adapters in
workers/api/src/services/exemption-review/adapters.ts.
Architecture
upload β extract (Claude vision) β verify (state DB) β decide (rules) β β β βΌ βΌ βΌ tax_certificate_ tax_certificate_permit_ tax_certificates extractions verifications (decision_tier, (raw_response, (state DB match, decision_reason, tokens, facts) lookup_source) ai_confidence) β βΌ tax_certificate_review_log (append-only event trail) β βΌ cost_ledger (estimated USD per Anthropic call)Decision tiers
- Green β all rules pass (confidence β₯ threshold, name match, state permit active, no red flags). Eligible for auto-approval.
- Yellow β ambiguous. Always routed to human review regardless of
EXEMPTION_AUTO_APPROVE. - Red β hard fail (forgery detected, expired cert, name mismatch, invalid document type). Eligible for auto-rejection.
Configuration (env vars)
| Variable | Default | Effect |
|---|---|---|
EXEMPTION_AUTO_APPROVE | false | When true, green β auto-approved, red β auto-rejected. Yellow still routes to humans. Shadow mode otherwise. |
EXEMPTION_NEXUS_STATES | TX | Comma-separated 2-letter state codes where we have sales-tax nexus. Unverifiable permits in nexus states are riskier. |
EXEMPTION_MODEL | claude-sonnet-4-6 | Anthropic model id. Must exist in workers/api/src/utils/llm-cost.ts pricing table or cost tracking falls back to Sonnet pricing. |
ANTHROPIC_API_KEY | required | Pipeline is skipped entirely (with warning) if unset. |
Data model
Tables
tax_certificatesβ canonical record. Gained these columns in migration 067:decision_tier,decision_reason,ai_reviewed_at,ai_confidence,reviewed_by_kind. Not subject to the 90-day purge.tax_certificate_extractionsβ append-only. One row per extraction attempt. Stores the fullraw_responseJSON plus parsed fields. Contains PII: buyer name, tax ID/EIN, address.tax_certificate_permit_verificationsβ append-only. Result of state-database lookups (e.g. Texas Comptroller search).tax_certificate_review_logβ append-only event stream (extraction_started,decided_green,auto_approved, etc.).
Row-level security
Migration 20260409_069_tax_exemption_hardening.sql enables RLS on the
three sidecar tables with admin-only SELECT:
EXISTS ( SELECT 1 FROM public.profiles WHERE profiles.id = auth.uid() AND profiles.role = 'admin')- Service role (pipeline, adapters, purge job) bypasses RLS automatically β no explicit policy needed for writes.
- Admin dashboard uses an authenticated session with
profiles.role = 'admin'and can SELECT via the policy. - Regular users cannot read any row from the sidecar tables, even their own certificateβs extraction. The admin dashboard is the only user-facing surface.
Before adding any new endpoint that reads these tables from a non-admin context, verify the RLS policy still matches β or the table will refuse the query silently.
Cost tracking
Every Anthropic call in the pipeline records to cost_ledger with
operation_type = 'tax-exemption-extraction'. The pricing table in
workers/api/src/utils/llm-cost.ts converts tokens β USD; the package
itself is infrastructure-agnostic and emits CostEvents through a
callback wired by adapters.ts.
Parse failures still incur cost and are still recorded β the metadata
column includes parseFailed: true so you can distinguish successful vs
failed extractions in reports.
Reporting queries
Total spend last 30 days:
SELECT SUM(estimated_cost_usd) AS total_usd, COUNT(*) AS extractions, SUM(CASE WHEN (metadata->>'parseFailed')::boolean THEN 1 ELSE 0 END) AS parse_failuresFROM public.cost_ledgerWHERE operation_type = 'tax-exemption-extraction' AND created_at >= NOW() - INTERVAL '30 days';Per-certificate cost (joins against metadata):
SELECT metadata->>'certificateId' AS certificate_id, COUNT(*) AS api_calls, SUM(estimated_cost_usd) AS cost_usdFROM public.cost_ledgerWHERE operation_type = 'tax-exemption-extraction'GROUP BY metadata->>'certificateId'ORDER BY cost_usd DESCLIMIT 50;Reuse the global get_cost_summary RPC for admin dashboard
integration:
SELECT * FROM get_cost_summary( p_start_date => NOW() - INTERVAL '30 days', p_group_by => 'operation');The tax-exemption-extraction row is now included alongside other
operation types.
90-day retention purge
Policy
Extraction rows (containing PII) are retained for 90 days from
extracted_at. Permit verifications and review-log entries are purged
on the same schedule so sidecar data stays consistent with the
extractions window. The canonical tax_certificates row is NOT touched
β it has its own retention driven by commerce/tax compliance rules.
Purge function
SELECT * FROM public.purge_tax_exemption_data(); -- 90 days (default)SELECT * FROM public.purge_tax_exemption_data(30); -- override windowReturns per-table deletion counts:
extractions_deleted | permit_verifications_deleted | review_log_deleted---------------------+------------------------------+-------------------- 142 | 87 | 503The function is SECURITY DEFINER, GRANT EXECUTE to service_role
only. Admin dashboard users cannot run it directly β they invoke it via
a service-role RPC from the API layer.
Scheduling
Pick one:
-
pg_cron (preferred if the extension is installed on the Supabase instance):
SELECT cron.schedule('purge-tax-exemption-data-nightly','15 2 * * *', -- 02:15 UTC nightly$$ SELECT public.purge_tax_exemption_data(); $$); -
Ops cron via the nightly audit script (
~/.claude/scripts/nightly-audit.sh): add a step that calls the Supabase REST endpoint for the function using the service-role key. Log the returned counts to the audit email so drift is visible. -
Manual β run the SQL above weekly during rollout, switch to #1 or #2 before auto-approve is enabled at scale.
Verification
Confirm the purge is running:
-- Oldest row should be β€ 90 daysSELECT MIN(extracted_at) AS oldest, NOW() - MIN(extracted_at) AS ageFROM public.tax_certificate_extractions;If age exceeds 91 days, the purge is not running β investigate before
PII accumulates.
Operational runbook
Turning auto-approve on/off
# Shadow mode (default):EXEMPTION_AUTO_APPROVE=false
# Full auto:EXEMPTION_AUTO_APPROVE=trueBefore flipping to true:
- Review at least 100 shadow-mode decisions and confirm the green/red tier distribution matches human decisions β₯ 95% of the time.
- Confirm the 90-day purge is scheduled (see above).
- Confirm cost tracking is visible in the admin cost dashboard.
- Review the most recent prompt-injection red flags in the review_log
(
event = 'extraction_succeeded'withred_flagscontainingprompt_injection_attempt).
Re-running extraction for a certificate
Extractions are append-only β the pipeline always reads the most recent
row per certificate_id. To force a re-run (e.g. after a prompt
change), invoke the admin endpoint or call runReview(certificateId, deps) directly. Do NOT manipulate tax_certificate_extractions rows by
hand; add a new row via the pipeline so the audit trail stays clean.
Debugging a specific certificate
-- Latest extraction + factsSELECT extracted_at, model, confidence, parse_error, document_type, buyer_entity_name, buyer_tax_id, red_flagsFROM public.tax_certificate_extractionsWHERE certificate_id = 'CERT_ID_HERE'ORDER BY extracted_at DESCLIMIT 1;
-- Full event timelineSELECT created_at, event, actor, detailsFROM public.tax_certificate_review_logWHERE certificate_id = 'CERT_ID_HERE'ORDER BY created_at DESC;
-- Costs incurredSELECT created_at, model, input_tokens, output_tokens, estimated_cost_usd, metadataFROM public.cost_ledgerWHERE operation_type = 'tax-exemption-extraction' AND metadata->>'certificateId' = 'CERT_ID_HERE'ORDER BY created_at DESC;Known limitations
- Texas-only state verification.
verifiers.tscurrently scrapes onlycomptroller.texas.gov. Other states fall through tolookup_source = 'unavailable'and the decision engine treats that as yellow-tier in nexus states. - Prompt injection mitigations are defense in depth, not absolute.
The SYSTEM_PROMPT instructs the model to treat document contents as
untrusted data and the USER_PROMPT block is placed before documents,
but auto-approve should always be gated on the state DB match, not
solely on model-reported confidence. See
packages/exemption-review/src/extract.tsfor the current prompt. - Shadow-mode backfill. Flipping
EXEMPTION_AUTO_APPROVE=truedoes NOT retroactively apply decisions to certificates reviewed in shadow mode β it only affects new uploads. Run a manual re-review if you need to apply the new policy to existingneeds_reviewcerts.