Control Daily API Spend
PACE tracks token usage across every agent call and compares the accumulated daily spend against a configurable budget before each scheduled run. When the budget is reached, subsequent cron runs are skipped — without interrupting any currently executing run.
How it works

    Cron trigger fires
          ↓
    Check daily budget step
      ├── Reads PACE_DAILY_BUDGET and PACE_DAILY_SPEND variables
      ├── Resets counter if it's a new calendar day
      └── Sets budget_exceeded step output
          ↓
    budget_exceeded == true?
      ├── Yes → "Run PACE cycle" step is skipped. No API calls made.
      └── No  → Orchestrator runs normally
                  ↓
            On exit (success, hold, or abort):
              └── Writes updated total to PACE_DAILY_SPEND variable

The current run is never interrupted mid-flight. Only the next cron trigger is affected.
Manual runs bypass the cap
workflow_dispatch triggers (runs you start manually from the GitHub Actions UI or the gh CLI) always proceed regardless of the daily budget. The budget check is intentionally bypassed:

    [PACE] Manual trigger — budget check bypassed.

This means:
- A human explicitly requesting a run will never be silently blocked.
- The cost of that manual run is still tracked and added to PACE_DAILY_SPEND, so the next scheduled cron will see the updated total.
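
The gating rules described so far (day reset, manual bypass, unlimited budget at 0) can be sketched in Python. This is a minimal illustration under stated assumptions, not the shipped budget-check step: the function name `should_skip_run` and its parameters are hypothetical, dates are assumed to be compared as plain strings, and "reached" is assumed to mean spend at or above the budget.

```python
def should_skip_run(event_name: str, budget: float, spend: float,
                    spend_date: str, today: str) -> tuple[bool, float]:
    """Return (skip, effective_spend) for the upcoming run.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # A new calendar day resets the accumulated spend.
    if spend_date != today:
        spend = 0.0

    # Manual runs (workflow_dispatch) always proceed.
    if event_name == "workflow_dispatch":
        return False, spend

    # A budget of 0 (an unset variable is treated as 0 here) means unlimited.
    if budget <= 0:
        return False, spend

    # Assumption: the budget counts as "reached" when spend >= budget.
    return spend >= budget, spend
```

Returning the effective spend alongside the decision keeps the reset visible to the caller, which still has to add the run's own cost when it exits.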
Setup
1 — Set the daily budget variable
Go to your GitHub repository → Settings → Secrets and variables → Actions → Variables and create:
| Variable name | Value | Notes |
|---|---|---|
| PACE_DAILY_BUDGET | 15 | USD limit per calendar day |
Set to 0 or leave unset for unlimited spend (the default — no behaviour change for existing deployments).
2 — No workflow changes needed
The budget-check step is already part of the pace.yml workflow shipped with the framework. It runs automatically before every Run PACE cycle step.
3 — Monitor spend in run logs
Each run prints a per-model breakdown:

    [PACE] API usage this run:
      claude-haiku-4-5-20251001: 45,230 in + 8,912 out = $0.0718
      claude-sonnet-4-6: 124,500 in + 31,200 out = $0.8430
      Run total: $0.9148
    [PACE] Daily spend updated: $2.14 (this run: $0.9148)

The variables PACE_DAILY_SPEND and PACE_DAILY_SPEND_DATE are maintained automatically — you never need to set them manually.
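
Each per-model figure in the log is a token count multiplied by a per-token price. The sketch below shows that arithmetic only. The function `run_cost`, the model names `model-a` and `model-b`, and the prices are placeholders chosen for illustration; they are not PACE's actual pricing table and are not meant to reproduce the log above.

```python
def run_cost(usage: dict[str, tuple[int, int]],
             prices: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Map each model to its cost in USD for one run.

    usage:  model -> (input_tokens, output_tokens)
    prices: model -> (USD per million input tokens, USD per million output tokens)
    """
    costs = {}
    for model, (tokens_in, tokens_out) in usage.items():
        price_in, price_out = prices[model]
        costs[model] = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
    return costs


# Placeholder model names and prices, for illustration only.
prices = {"model-a": (1.00, 5.00), "model-b": (3.00, 15.00)}
usage = {"model-a": (40_000, 8_000), "model-b": (100_000, 30_000)}

costs = run_cost(usage, prices)
for model, cost in costs.items():
    tokens_in, tokens_out = usage[model]
    print(f"  {model}: {tokens_in:,} in + {tokens_out:,} out = ${cost:.4f}")
print(f"  Run total: ${sum(costs.values()):.4f}")
```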
Reduce per-run cost
Combine the budget cap with the analysis_model setting for maximum savings:

    llm:
      provider: anthropic
      model: claude-sonnet-4-6                    # FORGE + SCRIBE (code generation)
      analysis_model: claude-haiku-4-5-20251001   # PRIME, GATE, SENTINEL, CONDUIT

Analytical agents (PRIME, GATE, SENTINEL, CONDUIT) are single-call with 4k token responses. Switching them to Haiku reduces per-run cost by ~40–50% with no quality impact on analytical tasks.
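
One way to picture the split is a lookup that routes each agent to one of the two configured models. The helper below is hypothetical (`model_for` is not a documented PACE function), and falling back to `model` when `analysis_model` is unset is an assumption.

```python
# Agents the config comment assigns to analysis_model.
ANALYSIS_AGENTS = {"PRIME", "GATE", "SENTINEL", "CONDUIT"}


def model_for(agent: str, llm_config: dict) -> str:
    """Pick the model for an agent from the `llm` config section.

    Hypothetical helper; the fallback to `model` is an assumption.
    """
    if agent in ANALYSIS_AGENTS:
        return llm_config.get("analysis_model") or llm_config["model"]
    return llm_config["model"]


llm = {
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "analysis_model": "claude-haiku-4-5-20251001",
}

for agent in ("PRIME", "FORGE", "GATE", "SENTINEL", "CONDUIT", "SCRIBE"):
    print(f"{agent}: {model_for(agent, llm)}")
```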
Estimated daily cost
Typical costs per successful sprint day (one SHIP attempt), including the full pipeline (PRIME + FORGE + GATE + SENTINEL + CONDUIT + SCRIBE):
| Config | Estimated cost | Notes |
|---|---|---|
| All Sonnet, 1 attempt | $2–4 | |
| All Sonnet, 2 attempts (1 retry) | $4–8 | Retry compounds FORGE cost |
| Sonnet (FORGE) + Haiku (analysis), 1 attempt | $1.50–3.00 | Recommended |
| All Haiku | $0.15–0.50 | Not recommended — FORGE quality degrades |
What “one story” includes
The cost estimate above covers one complete story delivery: code written, tests run and passing, security reviewed, CI/CD reviewed, and context documents updated. In product terms, that is typically:
- A new API endpoint with tests and documentation
- A domain model with a repository interface and in-memory implementation
- A service class with business logic, validation, and unit tests
- A refactored module with updated tests
For comparison: a mid-level engineer takes 2–4 hours to implement, test, and review an equivalent story. At a fully-loaded cost of $75–120/hour, that is $150–480 of engineering time. A PACE story on Sonnet + Haiku costs $1.50–3.00.
Per-agent cost breakdown
For a story using Sonnet for FORGE and Haiku for all other agents:
| Agent | Model | Typical cost |
|---|---|---|
| PRIME | Haiku | $0.003–0.006 |
| FORGE | Sonnet | $0.80–2.50 |
| GATE | Haiku | $0.003–0.008 |
| SENTINEL | Haiku | $0.004–0.009 |
| CONDUIT | Haiku | $0.003–0.006 |
| SCRIBE | Haiku | $0.005–0.013 |
| Total | Mixed | $0.82–2.54 |
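FORGE dominates the cost: in this configuration it accounts for roughly 98% of the total ($0.80 of $0.82 at the low end, $2.50 of $2.54 at the high end). Everything else combined is under $0.05. This is why the analysis_model split pays for itself immediately.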
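
The totals can be checked by summing the bounds in the table. The snippet below is only that arithmetic; the dictionary name `AGENT_COST` and the helper `cost_range` are made up for this example.

```python
# (low, high) typical cost in USD per agent, copied from the table above.
AGENT_COST = {
    "PRIME": (0.003, 0.006),
    "FORGE": (0.80, 2.50),
    "GATE": (0.003, 0.008),
    "SENTINEL": (0.004, 0.009),
    "CONDUIT": (0.003, 0.006),
    "SCRIBE": (0.005, 0.013),
}


def cost_range(agents):
    """Sum the low and high bounds for the given agents."""
    agents = list(agents)  # allow generators; we iterate twice
    low = sum(AGENT_COST[a][0] for a in agents)
    high = sum(AGENT_COST[a][1] for a in agents)
    return low, high


total_low, total_high = cost_range(AGENT_COST)
others_low, others_high = cost_range(a for a in AGENT_COST if a != "FORGE")

print(f"Total:     ${total_low:.2f} to ${total_high:.2f}")
print(f"Non-FORGE: ${others_low:.3f} to ${others_high:.3f}")
```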
Tracking wasted spend on retries
PROGRESS.md is the only AI framework cost report that breaks down successful spend vs retry waste:

    | Story   | Est. Cost | Actual Cost |
    | story-1 | ~$0.45    | $1.82       |
    | story-2 | ~$0.80    | $6.59 (2×)  |  ← two attempts
    | story-3 | ~$0.55    | $2.11       |

    Cost Summary
      Total estimated:               $1.80
      Total actual (incl. retries):  $10.52
      Wasted on retries:             $6.59
      FORGE-only (all stories):      $8.90

The (N×) suffix means the story required N pipeline attempts before shipping. Wasted on retries is the total spend on attempts that did not ship. If this number is consistently high, the acceptance criteria in plan.yaml may need tightening, or forge.max_iterations may need increasing for complex stories.
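
The definition of retry waste (spend on attempts that did not ship) can be sketched as a simple sum. The `Attempt` record and the numbers below are invented for illustration; they do not describe how PACE stores attempts and are not taken from a real PROGRESS.md.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One pipeline attempt for a story (hypothetical record shape)."""
    story: str
    cost_usd: float
    shipped: bool


def retry_waste(attempts: list[Attempt]) -> float:
    """Total spend on attempts that did not ship."""
    return sum(a.cost_usd for a in attempts if not a.shipped)


# Made-up numbers for illustration.
attempts = [
    Attempt("story-a", 2.00, shipped=True),
    Attempt("story-b", 3.00, shipped=False),  # first attempt failed
    Attempt("story-b", 2.50, shipped=True),   # retry shipped
]

total = sum(a.cost_usd for a in attempts)
print(f"Total actual: ${total:.2f}, wasted on retries: ${retry_waste(attempts):.2f}")
```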
The security gate pays for itself
SENTINEL runs on every story for ~$0.004–0.009 (Haiku). Consider what it catches:
- A hardcoded API key committed to source costs hours to rotate across all environments and may require a security incident report
- A SQL injection vulnerability discovered post-deploy requires an emergency patch, potential breach investigation, and customer notification
- Missing authentication on an admin endpoint, if exploited, costs significantly more than the sprint
Catching one SENTINEL HOLD prevents remediation work that commonly runs 4–10× more expensive than the original feature. The mandatory security gate is not overhead — it is the cheapest form of security review available.
Advisory lifecycle as cost control
SENTINEL and CONDUIT advisories are non-blocking by design. This is a deliberate cost decision: blocking on every advisory would cause FORGE to retry stories repeatedly for findings that do not affect shipping safety. Instead:
- Advisories accumulate in .pace/advisory_backlog.yaml
- On designated clearance stories (every 7th story by default), all advisories must be resolved
- Clearance stories typically cost less than a feature story — FORGE is making targeted fixes, not building from scratch
This batching reduces the per-story retry cost of advisory remediation by ~70% compared to blocking on every advisory immediately.
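
As a rough illustration of the clearance cadence, the check below marks every 7th story. The function name, the 1-based numbering, and the reading of "every 7th story" as stories 7, 14, 21 and so on are assumptions, not documented PACE behaviour.

```python
def is_clearance_story(story_number: int, interval: int = 7) -> bool:
    """True when a 1-based story number is a designated clearance story.

    Assumption: "every 7th story" means stories 7, 14, 21, ...
    """
    return story_number > 0 and story_number % interval == 0


clearance = [n for n in range(1, 22) if is_clearance_story(n)]
print(clearance)
```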
A PACE_DAILY_BUDGET of $15–25 comfortably covers four cron runs per day on normal days while blocking runaway spend if the pipeline enters a retry loop.
Day rollover
PACE_DAILY_SPEND resets automatically at the start of the first cron run on a new calendar day — no manual intervention needed. The budget-check step compares PACE_DAILY_SPEND_DATE to today’s date (in the configured timezone) and resets the counter if they differ.
The rollover timezone is controlled by the PACE_REPORTER_TIMEZONE repository variable (IANA format, e.g. Asia/Kolkata, America/New_York). It defaults to UTC if unset. Set it to match the reporter.timezone field in your pace.config.yaml so the budget day aligns with your team’s calendar day rather than UTC midnight.
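
The effect of the timezone on the rollover can be seen with Python's standard `zoneinfo` module. This is a sketch rather than the workflow's actual step: the helper names are hypothetical, and storing PACE_DAILY_SPEND_DATE as an ISO date (YYYY-MM-DD) is an assumption.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def budget_day(now_utc: datetime, tz_name: str = "UTC") -> str:
    """Return the calendar date (YYYY-MM-DD) of `now_utc` in the given IANA zone."""
    return now_utc.astimezone(ZoneInfo(tz_name)).date().isoformat()


def needs_reset(spend_date: str, now_utc: datetime, tz_name: str = "UTC") -> bool:
    """True when the stored date differs from today's date in the zone."""
    return spend_date != budget_day(now_utc, tz_name)


# 20:00 UTC on 1 May is already 2 May in Asia/Kolkata (UTC+05:30).
now = datetime(2024, 5, 1, 20, 0, tzinfo=timezone.utc)
print(budget_day(now, "UTC"), budget_day(now, "Asia/Kolkata"))
```

With the same stored date, a run at that moment would reset the counter under Asia/Kolkata but not under UTC, which is why aligning the variable with reporter.timezone matters.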
Troubleshooting
“Budget check step fails with permission error”
The GH_TOKEN in the budget-check step needs repo scope (or variables:write fine-grained permission) to call gh variable set. GITHUB_TOKEN with the default Actions permission is sufficient for most repositories.
“Spend is not being tracked”
Confirm python pace/orchestrator.py exits normally — atexit handlers are not called on SIGKILL. Forcibly terminated runs won’t update the counter (conservative — it under-counts rather than over-counts).
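
To see why a forced kill loses the update, it helps to look at how an exit hook is typically registered. The sketch below uses Python's standard `atexit` module and the `gh variable set` command mentioned above. The function names, the four-decimal value format, and the injected `run` parameter are assumptions for illustration; the orchestrator's real handler may differ.

```python
import atexit
import subprocess


def build_set_command(total_usd: float) -> list[str]:
    """Command that stores the new running total in the repository variable."""
    # Assumption: the value is stored as a plain decimal string.
    return ["gh", "variable", "set", "PACE_DAILY_SPEND", "--body", f"{total_usd:.4f}"]


def register_spend_writer(get_total, run=subprocess.run):
    """Register an exit hook that persists whatever get_total() returns at exit.

    The hook runs on normal interpreter shutdown (including after an
    unhandled exception), but not when the process is killed with SIGKILL.
    """
    def _write():
        run(build_set_command(get_total()), check=False)

    atexit.register(_write)
    return _write
```

Passing `run` as a parameter makes the hook easy to exercise without calling the real CLI, for example by handing in a function that records the command instead of executing it.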
“I want to reset the counter manually”
Set PACE_DAILY_SPEND to 0 in the repository variables. The counter will restart from zero on the next run.