Two console-only tasks (no code) that sit below the existing monthly $500 budget killswitch. Together they convert the killswitch from a single line of defense into three:
Per-key quota: GCP-level cap on requests/minute and requests/day for the API key. Fires before any of our code can see the spike. Works even if Supabase, the cron, or the killswitch SA is broken.
Daily budget: second GCP budget on the same billing account, wired to the same budget-alerts Pub/Sub topic. The existing killswitch detaches billing when this trips, just like the monthly budget. Caps a runaway loop at one day's worth of spend instead of one month's.
Monthly budget + killswitch: existing $500/mo hard cap (unchanged).
Total time: ~25 min. No code change, no deploy.
Prereqs
You need console access to:
GCP project gemini-theaccessible-org
GCP billing account 0158C6-12C170-5AB1F4
Both are linked from the project console under IAM & Admin → Settings.
Part 1: Per-key quota (~10 min)
GCP lets you cap a single API key by both RPM (requests/minute) and RPD (requests/day). This is the only cap that works without any of our infrastructure being up, because it lives at Google's edge.
Under API restrictions confirm only Generative Language API is checked. (Defense in depth: stops the key from being used against other Google APIs if it leaks.)
Click Add quota override (or visit APIs & Services → Generative Language API → Quotas).
Override these quotas for this key:
GenerateContent requests per minute per API key → your RPM cap
GenerateContent requests per day per API key → your RPD cap
To prove the quota is actually wired (optional, do this in staging only): set a temporary 1-RPM cap, hit the API twice in quick succession, confirm the second call returns 429 RESOURCE_EXHAUSTED, then restore the real cap.
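The staging check described above can be scripted. The following is a minimal sketch, not part of the existing tooling: `quota_is_enforced` and its `send_request` parameter are hypothetical names, and `send_request` stands in for whatever function you use to call the API with the staging key and return the HTTP status code.

```python
from typing import Callable


def quota_is_enforced(send_request: Callable[[], int], attempts: int = 2) -> bool:
    """Return True if the first call succeeds and a later call is throttled.

    send_request is a hypothetical callable that performs one API call
    and returns its HTTP status code (200, 429, ...).
    """
    if attempts < 2:
        raise ValueError("need at least two attempts to observe throttling")
    # Fire the calls back to back so they land inside the same minute.
    statuses = [send_request() for _ in range(attempts)]
    # With a temporary 1-RPM cap, call 1 should pass and call 2 should get 429.
    return statuses[0] == 200 and 429 in statuses[1:]
```

Pass in a real request function only against staging, and restore the real cap afterwards.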
What it costs you
A misconfigured cap will throttle real traffic. The signal is HTTP 429 from the Gemini API. Both monitors (cost-spike + cost-trickle) will see a drop in traffic, not a spike, so this failure mode is silent to alerting. Mitigation: pick RPM/RPD ≥ 2× your observed peak, and re-check the numbers quarterly as traffic grows.
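The 2× headroom rule is simple arithmetic. Here is a small sketch of it; the function name `suggest_caps` and the peak figures in the example are illustrative assumptions, not measured values.

```python
import math


def suggest_caps(peak_rpm: float, peak_rpd: float, headroom: float = 2.0) -> dict:
    """Suggest per-key caps as a multiple of the observed peak traffic."""
    if headroom < 1.0:
        raise ValueError("headroom below 1.0 would throttle normal peak traffic")
    return {
        # Round up so the cap never lands below headroom x peak.
        "rpm_cap": math.ceil(peak_rpm * headroom),
        "rpd_cap": math.ceil(peak_rpd * headroom),
    }


# Hypothetical peaks of 300 req/min and 150,000 req/day:
print(suggest_caps(300, 150_000))  # {'rpm_cap': 600, 'rpd_cap': 300000}
```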
Part 2: Daily budget (~15 min)
A budget is just a notification trigger; it doesn't gate spending by itself. The killswitch Cloud Function (already deployed, see docs/admin/gcp-budget-killswitch.md) reacts to budget alert messages on the budget-alerts Pub/Sub topic by detaching billing when costAmount / budgetAmount >= 1.0. We're adding a second budget pointing at the same topic, and the function treats both identically.
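To make the trip rule concrete, here is a sketch of the decision only. It is not the deployed function's code. It assumes the Pub/Sub message data arrives as base64-encoded JSON carrying the costAmount and budgetAmount fields named above; `should_detach` is a hypothetical name.

```python
import base64
import json


def should_detach(pubsub_data_b64: str) -> bool:
    """Apply the costAmount / budgetAmount >= 1.0 rule to one alert message."""
    payload = json.loads(base64.b64decode(pubsub_data_b64).decode("utf-8"))
    cost = float(payload["costAmount"])
    budget = float(payload["budgetAmount"])
    if budget <= 0:
        # Assumption: a malformed budget should not trigger a detach.
        return False
    return cost / budget >= 1.0
```

Because the rule depends only on the ratio, a $25 daily budget and a $500 monthly budget pass through the same check unchanged.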
Target amount: $25/day (or whatever is ≥ 2× your normal daily spend per the cost report).
Time range: Daily.
Actions / Thresholds:
50% of actual spend → email only (info).
90% of actual spend → email + Telegram (warning).
100% of actual spend → email + Telegram + detach billing.
The trip levels mirror the monthly budget so the killswitch function behaves identically.
Notifications:
Email alerts to larry@anglin.com.
Connect a Pub/Sub topic for programmatic notifications. Pick projects/gemini-theaccessible-org/topics/budget-alerts (the same topic the monthly budget uses). This is the load-bearing checkbox: without it, the killswitch never sees the alert.
Name it: gemini-theaccessible-org daily $25.
Save.
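The three thresholds configured above can be read as a lookup from spend ratio to reaction. The sketch below is for illustration only: `actions_for` and the action labels are hypothetical, and in practice the email tiers come from GCP while the detach comes from the killswitch function.

```python
from typing import List


def actions_for(cost: float, budget: float) -> List[str]:
    """Map actual spend against the budget to the actions listed above."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = cost / budget
    if ratio >= 1.0:
        return ["email", "telegram", "detach_billing"]
    if ratio >= 0.9:
        return ["email", "telegram"]
    if ratio >= 0.5:
        return ["email"]
    return []
```

With the $25 daily budget, $12.50 of spend reaches the info tier, $23 reaches the warning tier, and $25 trips the detach.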
Verify the daily budget reaches the killswitch
Send a synthetic alert to the topic with budgetDisplayName matching your new daily budget (the same payload the killswitch already handles for the monthly budget).
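One way to build and send such a synthetic alert is sketched below. The field set follows the standard GCP budget notification format, but the exact fields the killswitch reads are an assumption here, and `build_synthetic_alert` and `publish` are hypothetical helper names. Publishing needs the google-cloud-pubsub package and credentials that can publish to the topic.

```python
import json


def build_synthetic_alert(display_name: str, cost: float, budget: float) -> bytes:
    """Build a budget-alert payload shaped like a GCP budget notification."""
    payload = {
        "budgetDisplayName": display_name,
        "costAmount": cost,
        "budgetAmount": budget,
        "budgetAmountType": "SPECIFIED_AMOUNT",
        "currencyCode": "USD",
    }
    return json.dumps(payload).encode("utf-8")


def publish(data: bytes,
            project: str = "gemini-theaccessible-org",
            topic: str = "budget-alerts") -> str:
    """Publish the payload to the budget-alerts topic; returns the message ID."""
    # Imported lazily so the payload builder works without the dependency.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project, topic)
    return publisher.publish(topic_path, data).result()


# Example (cost kept below budget so the alert is seen but does not trip):
# data = build_synthetic_alert("gemini-theaccessible-org daily $25", 1.0, 25.0)
# publish(data)
```

Keep in mind that a payload with costAmount at or above budgetAmount satisfies the detach rule, so send one only when you intend billing to be detached.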
Once both layers are live, edit docs/admin/gcp-budget-killswitch.md → Trip thresholds section to add:

| Threshold | Action |
| --- | --- |
| 100% of daily budget | Detach billing + email + Telegram page |

So future-you isn't surprised by a mid-month killswitch trip.
Verifying everything together
After both parts are done, your defense layers are:

| Layer | Caps at | Resets | Reaction time |
| --- | --- | --- | --- |
| Per-key RPM | ~600 req/min (example) | 1 minute | Instant (HTTP 429) |
| Per-key RPD | ~300k req/day (example) | 1 day | Instant (HTTP 429) |
| Daily budget killswitch | $25/day | Detach until manual re-link | Minutes (GCP billing cadence) |
| Monthly budget killswitch | $500/month | Detach until manual re-link | Minutes |
| Trickle monitor | Anomaly detection (page) | n/a | 5–10 min |
| Hourly monitor | Anomaly detection (page) | n/a | 1–2 hours |

The two monitors page you so you can intervene before the budget caps fire. The budget caps fire if you don't intervene. The per-key quotas fire if everything else is broken.
Maintenance
Quarterly: re-pull peak RPM / peak daily from the spend graph, re-tune RPM/RPD if traffic has grown >50%.
After any large new workload launches: check the daily budget hasn't become the binding constraint on legitimate traffic.
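The quarterly re-tune rule reduces to a growth check. The following sketch assumes you record the peak used when the caps were last set; `needs_retune` and the sample figures are hypothetical.

```python
def needs_retune(previous_peak: float, current_peak: float,
                 growth_threshold: float = 0.5) -> bool:
    """Return True if peak traffic has grown by more than the threshold (50%)."""
    if previous_peak <= 0:
        raise ValueError("previous_peak must be positive")
    growth = (current_peak - previous_peak) / previous_peak
    return growth > growth_threshold


# Hypothetical peaks: 300 req/min last quarter, 480 req/min now (60% growth).
print(needs_retune(300, 480))  # True
```

Run it once for peak RPM and once for peak daily requests, since either can outgrow its cap independently.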