PDF Page Rendering
Background
Several pipeline stages (OpenAI Vision, MathPix per-page, equation audit, segmented pipeline) need rasterised PNG images of individual PDF pages before sending them to external APIs.
Previously this was done with unpdf’s renderPageAsImage, which internally used @napi-rs/canvas — a native Node.js addon that ships .node binaries. This works fine in Node but is incompatible with Cloudflare Workers: esbuild cannot bundle native addons, and the Workers runtime has no dlopen.
Current approach: Browser Rendering
We now use the Cloudflare Browser Rendering binding (Puppeteer-compatible headless Chromium). The utility at workers/api/src/utils/pdf-to-png.ts:
- Encodes the single-page PDF as base64.
- Launches a headless browser via the existing BrowserProvider interface.
- Loads an HTML page that uses pdf.js (from cdnjs.cloudflare.com) to paint the PDF onto a <canvas>.
- Takes a full-page PNG screenshot.
- Returns the PNG bytes.
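The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of pdf-to-png.ts: `PageLike` and `BrowserLike` are hypothetical minimal types covering only the Puppeteer-style methods used here, the pdf.js version in `PDFJS_BASE` is an assumed pin, and the `window.__pdfRendered` flag is an invented signal. How a browser is obtained from BrowserProvider is not shown, because its shape is not described here.

```typescript
// Hypothetical minimal subset of a Puppeteer-style page and browser.
interface PageLike {
  setContent(html: string): Promise<void>;
  waitForFunction(expression: string): Promise<unknown>;
  screenshot(options: { type: "png"; fullPage: boolean }): Promise<Uint8Array>;
  close(): Promise<void>;
}
interface BrowserLike {
  newPage(): Promise<PageLike>;
}

// Assumed pdf.js build on cdnjs; pin whichever version the project actually uses.
const PDFJS_BASE = "https://cdnjs.cloudflare.com/ajax/libs/pdf.js/3.11.174";

// Base64-encode the PDF bytes so they can be embedded in the HTML page.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Build the HTML page that paints page 1 of the PDF onto a <canvas>.
function buildRenderHtml(pdfBase64: string, scale = 2): string {
  return `<!doctype html>
<html>
<body style="margin:0">
<canvas id="page" style="display:block"></canvas>
<script src="${PDFJS_BASE}/pdf.min.js"></script>
<script>
(async () => {
  pdfjsLib.GlobalWorkerOptions.workerSrc = "${PDFJS_BASE}/pdf.worker.min.js";
  const data = Uint8Array.from(atob("${pdfBase64}"), (c) => c.charCodeAt(0));
  const pdf = await pdfjsLib.getDocument({ data }).promise;
  const page = await pdf.getPage(1);
  const viewport = page.getViewport({ scale: ${scale} });
  const canvas = document.getElementById("page");
  canvas.width = viewport.width;
  canvas.height = viewport.height;
  await page.render({ canvasContext: canvas.getContext("2d"), viewport }).promise;
  window.__pdfRendered = true; // tell the driver that painting has finished
})();
</script>
</body>
</html>`;
}

// Load the page, wait for the paint to finish, and return the PNG bytes.
async function renderPdfPageToPng(
  browser: BrowserLike,
  pdf: Uint8Array,
): Promise<Uint8Array> {
  const page = await browser.newPage();
  try {
    await page.setContent(buildRenderHtml(toBase64(pdf)));
    await page.waitForFunction("window.__pdfRendered === true");
    return await page.screenshot({ type: "png", fullPage: true });
  } finally {
    await page.close(); // always release the tab, even when rendering fails
  }
}
```

In this sketch a failed render never sets the flag, so `waitForFunction` would run until its timeout. A fuller version would probably also surface pdf.js errors to the caller.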
Trade-offs
| | @napi-rs/canvas | Browser Rendering |
|---|---|---|
| Runtime | Node.js only | Cloudflare Workers |
| Latency | ~200 ms per page | ~1–3 s per page (cold browser) |
| Fidelity | Good (Skia-based) | Excellent (full Chromium renderer) |
| Dependency | Native .node binary | Workers binding + pdf.js CDN |
| Scale | Limited by single process | Cloudflare-managed browser pool |
Browser Rendering adds latency (roughly 1–3 s per page with a cold browser, against ~200 ms with @napi-rs/canvas) but gives higher fidelity: the full Chromium renderer handles fonts, SVGs, and complex PDF features more accurately than the standalone @napi-rs/canvas backend. Cloudflare manages the browser pool, so there is no server to maintain.
Future options
- unpdf Workers support: The unpdf library may add a Workers-compatible canvas backend in the future. Monitor https://github.com/nicolo-ribaudo/unpdf.
- External rendering service: A dedicated microservice running Node.js with @napi-rs/canvas or Puppeteer could serve PNG renders via HTTP, decoupling the dependency from the Workers runtime.
- Pre-rendered images: For known documents, pages could be pre-rendered at upload time and stored in R2, avoiding runtime rendering entirely.
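As a rough illustration of the pre-rendered images option, the sketch below stores page PNGs under a predictable key at upload time and looks them up at runtime. Everything named here is an assumption: `BucketLike` is a hypothetical minimal type covering only the `get` and `put` calls of an R2 bucket binding, and the `renders/<documentId>/page-<n>.png` key layout and both function names are invented for the example.

```typescript
// Hypothetical minimal subset of an R2 bucket binding (get and put by key).
interface BucketLike {
  get(key: string): Promise<{ arrayBuffer(): Promise<ArrayBuffer> } | null>;
  put(key: string, value: Uint8Array): Promise<unknown>;
}

// Assumed key layout: one object per page, with 1-based page numbers.
function pageKey(documentId: string, pageNumber: number): string {
  return `renders/${documentId}/page-${pageNumber}.png`;
}

// Upload time: persist PNGs that have already been rendered, one per page.
async function storePrerenderedPages(
  bucket: BucketLike,
  documentId: string,
  pagePngs: Uint8Array[],
): Promise<void> {
  for (let i = 0; i < pagePngs.length; i++) {
    await bucket.put(pageKey(documentId, i + 1), pagePngs[i]);
  }
}

// Runtime: return the stored PNG, or null so the caller can fall back to
// rendering the page on demand.
async function loadPrerenderedPage(
  bucket: BucketLike,
  documentId: string,
  pageNumber: number,
): Promise<Uint8Array | null> {
  const object = await bucket.get(pageKey(documentId, pageNumber));
  if (object === null) return null;
  return new Uint8Array(await object.arrayBuffer());
}
```

Returning null on a miss lets the runtime path fall back to Browser Rendering for documents that were never pre-rendered.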