Editor Validation, Revisions, and Manual Overrides
Reference for the post-conversion HTML editor in apps/web (preview page): how edits are saved and versioned, how WCAG re-validation runs, when the PDF is regenerated, and how evaluators record manual overrides for findings that automated tooling gets wrong or that don’t apply.
This document covers the architecture and operational behavior. The end-user walkthrough lives in docs/user/editing-pdfs-and-overrides.md.
Edit → save → re-validate → refresh PDF: the whole loop

    [user edits in iframe]
            │ (3s debounce)
            ▼
    POST /api/files/:id/validate
            ──► validateWCAG() (pure DOM/regex linter)
            ──► fetch active file_violation_overrides
            ──► compute score AND scoreWithOverrides
            ▼
    panel re-renders (active vs. reviewed partition, score delta visible)

    [user clicks Save]
            ▼
    PUT /api/files/:id/html
            ──► snapshotHtmlVersion() to R2
            ──► INSERT html_edits row (version index)
            ──► overwrite live R2 object
            ──► prune to most-recent 20 versions

    [user clicks Refresh PDF]
            ▼
    POST /api/files/:id/refresh-pdf
            ──► forceAccessiblePdfExport()
            ──► WeasyPrint generate → VeraPDF validate
            ──► scorePdfAccessibility (PDF/UA structural)
            ──► persist accessiblePdfScore on file row

Critical timing notes:
- Debounced validation is free. `useWcagValidation` fires `validateHtml` 3s after the last edit (`packages/editor/src/hooks/useWcagValidation.ts:6`). The validate route runs `validateWCAG()`, a local DOM/regex linter (`workers/api/src/services/wcag-validator.ts:1`). No AI calls. A user editing 50 times in a session costs ~$0 in AI.
- AI is reserved for initial audit. Today AI deepening (`workers/batch/src/audit-ai-deepen.ts`) only runs in the URL audit pipeline (`workers/batch/src/url-fetch-executor.ts:864`). PDF audits are compute-only. The override architecture below assumes AI findings can exist (so the design generalizes when AI deepening lands for PDFs).
- Persisted score only moves on Refresh PDF. `files.accessiblePdfScore` is updated by `scorePdfAccessibility()` in `accessible-pdf-inline.ts:134`, not by edit-time validation. The live panel shows the just-computed score; the file row lags until refresh.
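The debounce step in the loop above can be sketched as follows. This is a minimal illustration of trailing-edge debouncing, not the actual `useWcagValidation` hook; `createDebouncer` and the injectable `schedule` parameter are hypothetical names, introduced so the behavior can be exercised without real timers.

```typescript
// Hypothetical sketch of trailing-edge debouncing; not the real useWcagValidation hook.
type Cancel = () => void;
type Schedule = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler backed by setTimeout/clearTimeout.
const timerSchedule: Schedule = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

function createDebouncer<T>(
  run: (value: T) => void,
  delayMs = 3000, // the 3s debounce described above
  schedule: Schedule = timerSchedule,
): (value: T) => void {
  let cancelPending: Cancel | null = null;
  return (value: T) => {
    // Each new edit cancels the pending run, so only the last edit is validated.
    if (cancelPending) cancelPending();
    cancelPending = schedule(() => {
      cancelPending = null;
      run(value);
    }, delayMs);
  };
}
```

Under this sketch, a burst of edits inside the 3s window produces a single validate call carrying the latest HTML.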
Storage: revisions
Versioned in html_edits (migration supabase/migrations/20260517_109_html_edits.sql).
| Column | Notes |
|---|---|
| `file_id` | FK to files. ON DELETE CASCADE. |
| `user_id` | Service-role only access; not exposed to authenticated clients. |
| `version_r2_key` | R2 path: `versions/{timestamp}-{source}.html` |
| `byte_size` | Size of the snapshot. |
| `source` | `'edit'` \| … |
| `created_at` | Snapshot timestamp. |
Behavior in workers/api/src/routes/files.ts:1075 (PUT /:fileId/html):
- The prior R2 object is snapshotted to `versions/…` before being overwritten, so the version table indexes prior states, not the new one.
- After every save, the route prunes to the most recent 20 versions per file (`files.ts:1176`).
- Restore (`files.ts:1274`) calls the same snapshot helper first, so even a restore is itself reversible.
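The retention rule above can be sketched as a pure function. This is an illustration under assumptions, not the code in `files.ts`: `HtmlEditRow` lists only the fields the sketch needs, and `partitionForPrune` is a hypothetical helper name.

```typescript
// Hypothetical row shape; only the fields this sketch needs.
interface HtmlEditRow {
  id: string;
  version_r2_key: string;
  created_at: string; // ISO timestamp
}

const MAX_VERSIONS = 20; // retention limit described above

// Splits rows into those to keep and those whose R2 objects and index rows should be removed.
function partitionForPrune(rows: HtmlEditRow[], max: number = MAX_VERSIONS) {
  // Copy before sorting so the caller's array is not mutated.
  const newestFirst = [...rows].sort(
    (a, b) => Date.parse(b.created_at) - Date.parse(a.created_at),
  );
  return { keep: newestFirst.slice(0, max), prune: newestFirst.slice(max) };
}
```

A caller would delete the R2 objects named by `prune[].version_r2_key` and then the matching `html_edits` rows; the order of those two deletes is a design choice this sketch does not settle.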
Listing/fetching versions:
- `GET /api/files/:fileId/html/versions`: list (max 100 returned)
- `GET /api/files/:fileId/html/versions/:versionId`: fetch raw HTML blob
- `POST /api/files/:fileId/html/versions/:versionId/restore`: restore + snapshot current first
Storage: manual overrides
Two tables, both in migration supabase/migrations/20260524_123_file_violation_overrides.sql.
file_violation_overrides — active state
| Column | Notes |
|---|---|
| `fingerprint` | Stable identifier (see below). Computed client-side, stored verbatim. |
| `rule_id` | WCAG/axe rule the override applies to. |
| `status` | `resolved` \| `not_applicable` \| `false_positive` \| `wont_fix` |
| `justification` | NOT NULL, non-empty (CHECK on `length(btrim(...)) > 0`). |
| `wcag_technique` | Optional reference like H67, ARIA14. |
| `canned_override_id` | If the user picked a canned override, its catalog id. |
| `selector_snapshot` | Selector at time of override (forensic / report-friendly). |
| `element_html_snapshot` | First-node HTML at time of override (capped at 4 KB). |
| `overridden_by` | User id (currently the auth subject; no admin-vs-editor split yet). |
| `revoked_at` / `_by` / `_reason` | Soft revoke. Active rows have `revoked_at IS NULL`. |
Partial unique index (file_id, fingerprint) WHERE revoked_at IS NULL enforces one active override per finding while still allowing a revoked override to be replaced later. RLS is on, service role only.
file_violation_override_log — append-only audit log
Every create / update / revoke writes one row keyed to the override id. This is what the Manually Reviewed Findings section of the ACR pulls reviewer attribution from. Never truncate or DELETE FROM this table — it’s the legal trail for the conformance claim.
Fingerprint algorithm (the load-bearing detail)

    fingerprint = `${ruleId}::${fnv1a(selector.toLowerCase())}::${fnv1a(collapseWs(elementHtml).slice(0,512))}`

- FNV-1a, 32-bit, hex-padded to 8 chars.
- Selector is the first node’s first target string, lowercased.
- Element HTML is the first node’s HTML, whitespace-collapsed, truncated to 512 chars.
Two implementations must stay byte-identical:
- `packages/shared/src/violation-fingerprint.ts`: server-side (`fingerprintWCAGViolation(WCAGViolation)`)
- `packages/editor/src/data/common-overrides.ts`: client-side (`fingerprintViolation(EditorViolation)`)
If they ever drift, overrides written from the editor will fail to attach to server-computed violations and re-surface as “active” — see Troubleshooting below.
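For orientation, the fingerprint formula can be sketched in a few lines. This is not a copy of either implementation: hashing the string as UTF-8 bytes and trimming after whitespace collapse are assumptions the source does not confirm, and `fingerprint` is a hypothetical name standing in for `fingerprintWCAGViolation` / `fingerprintViolation`.

```typescript
// FNV-1a, 32-bit, returned as 8 lowercase hex chars.
// Assumption: the input is hashed as UTF-8 bytes; the source does not say.
function fnv1a(input: string): string {
  const bytes = new TextEncoder().encode(input);
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, keeping 32 bits
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Assumption: runs of whitespace become one space and the ends are trimmed.
function collapseWs(html: string): string {
  return html.replace(/\s+/g, " ").trim();
}

// Hypothetical helper mirroring the formula above.
function fingerprint(ruleId: string, selector: string, elementHtml: string): string {
  const selectorHash = fnv1a(selector.toLowerCase());
  const htmlHash = fnv1a(collapseWs(elementHtml).slice(0, 512));
  return `${ruleId}::${selectorHash}::${htmlHash}`;
}
```

A shared test fixture of (input, expected fingerprint) pairs run against both packages is one way to catch drift before it reaches users; whether such a fixture exists today is not stated in this document.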
Why fingerprint, not violation-id
A WCAG run produces a new in-memory violation list on every call; the violation objects have no persistent id. Fingerprinting is what lets an override created on Tuesday still attach to “the same” finding on Friday after the user edited other parts of the file. If the user fixes the underlying HTML, the element snippet changes, the fingerprint misses, and the violation correctly re-appears as active — exactly the behavior we want.
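The attach-by-fingerprint behavior can be sketched as a partition over a fresh validation result. The `Violation` and `ActiveOverride` shapes and the `partitionByOverride` name are hypothetical; the real panel types may differ.

```typescript
// Hypothetical minimal shapes for illustration.
interface Violation {
  ruleId: string;
  fingerprint: string;
}
interface ActiveOverride {
  fingerprint: string;
  status: string;
}

// Splits a fresh violation list into "active" (no matching override) and
// "reviewed" (an active override exists for the same fingerprint).
function partitionByOverride(violations: Violation[], overrides: ActiveOverride[]) {
  const byFingerprint = new Map<string, ActiveOverride>();
  for (const o of overrides) byFingerprint.set(o.fingerprint, o);

  const active: Violation[] = [];
  const reviewed: Array<{ violation: Violation; override: ActiveOverride }> = [];
  for (const violation of violations) {
    const override = byFingerprint.get(violation.fingerprint);
    if (override) reviewed.push({ violation, override });
    else active.push(violation);
  }
  return { active, reviewed };
}
```

Because the lookup key is the fingerprint, a violation whose element HTML was edited arrives with a new fingerprint and lands in `active` again, which is the re-surfacing behavior described above.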
Score recomputation
Two scorers live in packages/shared/src/scoring.ts:
- `computeAccessibilityScore(wcagStatus)`: base rule-pass-rate over all evaluated rules.
- `computeAccessibilityScoreWithOverrides(wcagStatus, overrides)`: for each failing rule, if every violation under it has an active override whose status lifts the score, treat the rule as effectively passing.
Status → score impact:
| Status | Lifts score? | Use case |
|---|---|---|
| `resolved` | yes | “I fixed it; the checker is wrong about it still failing.” |
| `not_applicable` | yes | Decorative content, logo exempt from contrast, etc. |
| `false_positive` | yes | Checker tripped on a valid pattern. |
| `wont_fix` | no | Acknowledged tech debt. Stays in the deduction so it’s visible. |
The wont_fix semantic is deliberate. Accepted debt should drag the grade so it shows up as a number an evaluator can defend, not as a silent free pass.
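The two scorers can be sketched as below. This is not the code in `packages/shared/src/scoring.ts`: the `RuleResult` shape, the 0-100 scale, the rounding, and the score for an empty rule list are all assumptions made for the illustration.

```typescript
type OverrideStatus = "resolved" | "not_applicable" | "false_positive" | "wont_fix";

// Hypothetical shape: one entry per evaluated rule; an empty list means the rule passed.
interface RuleResult {
  ruleId: string;
  violationFingerprints: string[];
}

// Statuses that lift the score, per the table above. wont_fix is deliberately absent.
const LIFTING = new Set<OverrideStatus>(["resolved", "not_applicable", "false_positive"]);

// Pass rate as a 0-100 integer (scale, rounding, and the empty case are assumptions).
function passRate(rules: RuleResult[], passes: (r: RuleResult) => boolean): number {
  if (rules.length === 0) return 100;
  return Math.round((rules.filter(passes).length / rules.length) * 100);
}

function baseScore(rules: RuleResult[]): number {
  return passRate(rules, (r) => r.violationFingerprints.length === 0);
}

// A failing rule counts as passing only if every one of its violations
// has an active override with a lifting status.
function scoreWithOverrides(rules: RuleResult[], active: Map<string, OverrideStatus>): number {
  return passRate(rules, (r) =>
    r.violationFingerprints.every((fp) => {
      const status = active.get(fp);
      return status !== undefined && LIFTING.has(status);
    }),
  );
}
```

Note the all-or-nothing rule: a rule with two violations and one override still counts as failing, so partial review never inflates the number.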
The validate endpoint (workers/api/src/routes/files.ts:1463) returns both score and scoreWithOverrides so the UI can show the lift. The persisted files.accessiblePdfScore is a different scorer (PDF/UA structural conformance from scorePdfAccessibility) and is not affected by HTML-edit overrides — only by Refresh PDF.
The 17 canned “instant overrides”
Defined in packages/editor/src/data/common-overrides.ts. The editor surfaces up to three relevant chips per row (matched against ruleId), plus a “Custom override…” button that opens the justification form.
| Canned id | Status | One-line summary |
|---|---|---|
| `decorative-image` | `not_applicable` | Empty alt is correct (WCAG H67). Prefers fix-in-place. |
| `alt-text-reviewed` | `resolved` | Human-reviewed; alt accurately describes the image (G94). |
| `redundant-alt-intentional` | `false_positive` | Adjacent text duplicates alt by design (caption + figure). |
| `link-purpose-clear-in-context` | `resolved` | Purpose determinable from surrounding context (H77). |
| `icon-button-labeled` | `resolved` | Name via `aria-label` / `aria-labelledby` (ARIA14). |
| `contrast-logo-or-brand` | `not_applicable` | Logotype; exempt from 1.4.3. |
| `contrast-incidental-text` | `not_applicable` | Disabled / decorative / invisible text; exempt from 1.4.3. |
| `contrast-verified-manually` | `resolved` | Measured against true rendered background; false negative. |
| `heading-skip-intentional` | `false_positive` | Hierarchy reflects source PDF outline. |
| `lang-attribute-source` | `resolved` | Lang matches primary language of source PDF (H57). |
| `table-layout-not-data` | `not_applicable` | `role="presentation"` layout table (F46). |
| `duplicate-id-from-source` | `wont_fix` | Source-PDF artifact; rewriting breaks anchors. No score lift. |
| `aria-attribute-intentional` | `resolved` | Correct for composite widget; checker doesn’t recognize the pattern. |
| `form-label-visible-adjacent` | `resolved` | Labeled via `aria-labelledby`; layout precludes wrapping `<label>` (ARIA16). |
| `frame-title-decorative` | `not_applicable` | Iframe `aria-hidden` / `display:none`. |
| `landmark-single-page-app` | `resolved` | Manually verified; checker heuristic false positive. |
| `pdf-source-tracked-elsewhere` | `wont_fix` | Will be fixed in source PDF; tracked separately. No score lift. |
The “fix-in-place vs. override” pattern
The decorative-image canned override declares preferFixInPlace. When the editor renders chips for an image-alt finding, the row shows two affordances:
- Mark decorative in HTML (recommended): primary blue button. Calls back into the editor bridge to set `alt=""` and `role="presentation"` on the element. The next debounced re-validation drops the violation naturally. No row written to `file_violation_overrides`.
- Mark reviewed…: the standard override flow, used when the markup can’t be changed (e.g., we’re auditing rendered HTML from a third-party source).
We prefer fix-in-place because the artifact becomes conformant, not just the report. Overrides are a record of human judgment; they should not be a way to make the HTML lie. When you can change the HTML to express the intent (decorative = empty alt + presentation role), do that.
API surface
All under workers/api/src/routes/files.ts:
| Method | Path | Purpose |
|---|---|---|
| POST | /api/files/:id/validate | Compute-only WCAG validation. Returns overrides, score, scoreWithOverrides. |
| GET | /api/files/:id/overrides | List active overrides for the file. |
| POST | /api/files/:id/overrides | Upsert by fingerprint. Writes created or updated row to _log. |
| DELETE | /api/files/:id/overrides/:overrideId?reason=… | Soft revoke. Writes revoked row to _log. |
| PUT | /api/files/:id/html | Save HTML. Snapshots prior to R2 and writes html_edits row. |
| POST | /api/files/:id/refresh-pdf | Regenerate accessible PDF; updates persisted accessiblePdfScore. |
Endpoint constants live in packages/shared/src/constants.ts (FILES_OVERRIDES, FILES_OVERRIDE). The web adapter is in apps/web/src/utils/editorAdapters.ts (makeOverrideCallbacks). The shared hook is useViolationOverrides from @accessible-pdf/editor.
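As a small illustration of the override routes in the table, the paths can be built like this. The helper names are hypothetical and are not the `FILES_OVERRIDES` / `FILES_OVERRIDE` constants, whose actual form is not shown in this document; only the path patterns come from the table.

```typescript
// Hypothetical path builders; only the URL patterns come from the API table.
function overridesPath(fileId: string): string {
  return `/api/files/${encodeURIComponent(fileId)}/overrides`;
}

// DELETE target for a soft revoke; the reason travels as a query parameter.
function revokeOverridePath(fileId: string, overrideId: string, reason: string): string {
  const base = `${overridesPath(fileId)}/${encodeURIComponent(overrideId)}`;
  return `${base}?reason=${encodeURIComponent(reason)}`;
}
```

Encoding the reason matters because it is free text typed by a reviewer and will usually contain spaces or punctuation.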
Conformance report (ACR) integration
workers/api/src/services/acr-report-renderer.ts defines renderManuallyReviewedFindings(). The ACR data type gains a manuallyReviewedFindings?: AcrManualReviewEntry[] field (packages/shared/src/types.ts). The renderer outputs a “Manually Reviewed Findings” section with one row per active override, including:
- Rule id + WCAG criterion
- Check description + element selector (collapsed monospace)
- Status badge (Resolved / Not Applicable / False Positive / Accepted (Won’t Fix))
- Full justification text
- Reviewer + ISO date
The section renders only when manuallyReviewedFindings.length > 0. Callers that build AcrReportData are responsible for joining file_violation_overrides into this field — wire this in whichever report-build flow you’re touching (PDF refresh, ACR download endpoint, etc.).
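The join that callers are responsible for can be sketched as a mapping from override rows to report entries. The `ManualReviewEntry` fields here are hypothetical stand-ins for `AcrManualReviewEntry`, and the `created_at` column on the override row is an assumption; the badge labels are the ones listed above.

```typescript
type OverrideStatus = "resolved" | "not_applicable" | "false_positive" | "wont_fix";

// Hypothetical subset of a file_violation_overrides row (created_at is assumed).
interface OverrideRow {
  rule_id: string;
  status: OverrideStatus;
  justification: string;
  selector_snapshot: string | null;
  overridden_by: string;
  created_at: string;
  revoked_at: string | null;
}

// Hypothetical stand-in for AcrManualReviewEntry.
interface ManualReviewEntry {
  ruleId: string;
  selector: string | null;
  statusLabel: string;
  justification: string;
  reviewer: string;
  reviewedAt: string;
}

// Badge text as listed in the renderer description above.
const STATUS_LABELS: Record<OverrideStatus, string> = {
  resolved: "Resolved",
  not_applicable: "Not Applicable",
  false_positive: "False Positive",
  wont_fix: "Accepted (Won't Fix)",
};

// Keeps only active (non-revoked) overrides, one entry per row.
function toManualReviewEntries(rows: OverrideRow[]): ManualReviewEntry[] {
  return rows
    .filter((row) => row.revoked_at === null)
    .map((row) => ({
      ruleId: row.rule_id,
      selector: row.selector_snapshot,
      statusLabel: STATUS_LABELS[row.status],
      justification: row.justification,
      reviewer: row.overridden_by,
      reviewedAt: row.created_at,
    }));
}
```

An empty result leaves `manuallyReviewedFindings` empty, so the section is simply not rendered.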
Key files
- Migrations: `supabase/migrations/20260517_109_html_edits.sql`, `supabase/migrations/20260524_123_file_violation_overrides.sql`
- Shared scoring + fingerprint: `packages/shared/src/scoring.ts`, `packages/shared/src/violation-fingerprint.ts`
- Shared types: `packages/shared/src/types.ts` (`AcrManualReviewEntry`)
- Editor catalog + hook: `packages/editor/src/data/common-overrides.ts`, `packages/editor/src/hooks/useViolationOverrides.ts`
- Editor panel: `packages/editor/src/components/WcagPanel.tsx` (Reviewed tab, OverrideControls, ReviewedRowBody)
- API routes: `workers/api/src/routes/files.ts` (validate, overrides CRUD)
- Validator: `workers/api/src/services/wcag-validator.ts`
- ACR renderer: `workers/api/src/services/acr-report-renderer.ts`
- Web adapter: `apps/web/src/utils/editorAdapters.ts`, `apps/web/src/lib/api.ts`
Operational notes
“My override disappeared after a re-validation”
Almost always fingerprint drift: the violation’s first-node selector or element HTML changed enough that the fresh fingerprint no longer matches the stored row.
Diagnose:

    -- The stored snapshot
    SELECT fingerprint, selector_snapshot, left(element_html_snapshot, 200) AS snippet
    FROM file_violation_overrides
    WHERE file_id = $1 AND rule_id = $2 AND revoked_at IS NULL;

Then re-run /api/files/:id/validate and compute fingerprintWCAGViolation() for the same rule. If the snippet has materially changed (e.g., the alt text was edited, the surrounding HTML was restructured), the override is correctly orphaned; the user should re-mark it. If the snippet looks identical and they still don’t match, the two fingerprint implementations may have drifted; check packages/shared/src/violation-fingerprint.ts against packages/editor/src/data/common-overrides.ts line-for-line.
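When comparing a stored fingerprint with a freshly computed one, it helps to see which segment moved. The helper below is a hypothetical diagnostic, not part of the codebase; it relies only on the `ruleId::selectorHash::htmlHash` layout described in the fingerprint section.

```typescript
type DriftedPart = "rule" | "selector" | "element-html";

// Splits from the right so a rule id containing "::" would still parse.
function splitFingerprint(fp: string) {
  const parts = fp.split("::");
  if (parts.length < 3) throw new Error(`malformed fingerprint: ${fp}`);
  return {
    ruleId: parts.slice(0, -2).join("::"),
    selectorHash: parts[parts.length - 2],
    htmlHash: parts[parts.length - 1],
  };
}

// Returns the segments that differ; an empty list means the fingerprints match.
function diagnoseDrift(stored: string, fresh: string): DriftedPart[] {
  const a = splitFingerprint(stored);
  const b = splitFingerprint(fresh);
  const drifted: DriftedPart[] = [];
  if (a.ruleId !== b.ruleId) drifted.push("rule");
  if (a.selectorHash !== b.selectorHash) drifted.push("selector");
  if (a.htmlHash !== b.htmlHash) drifted.push("element-html");
  return drifted;
}
```

A result of `["element-html"]` with a visibly edited snippet points to a legitimately orphaned override, while a mismatch on a snippet that looks byte-identical points toward the two implementations having diverged.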
“Who overrode this finding and why?”

    SELECT l.action, l.status, l.actor, l.at, l.justification
    FROM file_violation_override_log l
    JOIN file_violation_overrides o ON o.id = l.override_id
    WHERE o.file_id = $1 AND o.fingerprint = $2
    ORDER BY l.at ASC;

The log is append-only: every state change is preserved, including revokes. For conformance audits, this is the source of truth.
“The validate endpoint is slow”
The validate endpoint does two pieces of work: the linter (CPU-bound, fast) and a Supabase select on file_violation_overrides. The select is keyed on (file_id) WHERE revoked_at IS NULL and uses the file_violation_overrides_file_idx index. If the overrides query dominates, check that the index is present (\d+ file_violation_overrides in psql). Migrations are idempotent (CREATE INDEX IF NOT EXISTS), but a failed migration leaves the table without the index.
“Score on the file dashboard doesn’t match the editor panel”
Expected. The dashboard reads files.accessiblePdfScore (PDF/UA structural conformance, persisted on Refresh PDF). The editor panel reads scoreWithOverrides from the live validate response. They are different scorers measuring different things. The editor panel is more responsive to edits and overrides; the dashboard reflects the last canonical PDF export.