Control Daily API Spend

PACE tracks token usage across every agent call and compares the accumulated daily spend against a configurable budget before each scheduled run. When the budget is reached, subsequent cron runs are skipped — without interrupting any currently executing run.

How it works

Cron trigger fires
      ↓
Check daily budget step
  ├── Reads PACE_DAILY_BUDGET and PACE_DAILY_SPEND variables
  ├── Resets counter if it's a new calendar day
  └── Sets budget_exceeded step output
      ↓
budget_exceeded == true?
  ├── Yes → "Run PACE cycle" step is skipped. No API calls made.
  └── No  → Orchestrator runs normally
      ↓
On exit (success, hold, or abort):
  └── Writes updated total to PACE_DAILY_SPEND variable

The current run is never interrupted mid-flight. Only the next cron trigger is affected.
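The decision in the flow above can be sketched in a few lines. This is an illustrative sketch only, not the shipped budget-check step: the `BudgetState` type and `check_budget` function are hypothetical names, and the sketch assumes the three repository variables have already been read and parsed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BudgetState:
    budget: float    # PACE_DAILY_BUDGET in USD; 0 means unlimited
    spend: float     # PACE_DAILY_SPEND accumulated so far
    spend_date: str  # PACE_DAILY_SPEND_DATE as YYYY-MM-DD, "" if never set


def check_budget(state: BudgetState, today: date) -> tuple[bool, BudgetState]:
    """Return (budget_exceeded, possibly reset state) for a scheduled run."""
    # Reset the counter when the stored date is not today's date.
    if state.spend_date != today.isoformat():
        state = BudgetState(state.budget, 0.0, today.isoformat())
    # A budget of 0 (or unset, parsed as 0) disables the cap entirely.
    if state.budget <= 0:
        return False, state
    # The cap triggers once the accumulated spend reaches the budget.
    return state.spend >= state.budget, state
```

Returning the state alongside the flag keeps the reset and the comparison in one place, so a caller cannot compare against yesterday's total by mistake.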

Manual runs bypass the cap

workflow_dispatch triggers (runs you start manually from the GitHub Actions UI or the gh CLI) always proceed regardless of the daily budget. The budget check is intentionally bypassed:

[PACE] Manual trigger — budget check bypassed.

This means:

A human explicitly requesting a run will never be silently blocked.
The cost of that manual run is still tracked and added to PACE_DAILY_SPEND, so the next scheduled cron will see the updated total.
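As a rough sketch of the two rules above (hypothetical function names, not the framework's code), the bypass depends only on the trigger type, while spend tracking is the same for every run. GitHub Actions exposes the trigger type in the `GITHUB_EVENT_NAME` environment variable; how the real step reads it is an assumption here.

```python
def should_run_cycle(event_name: str, budget_exceeded: bool) -> bool:
    """Decide whether the "Run PACE cycle" step should execute."""
    if event_name == "workflow_dispatch":
        # Manual trigger: always proceed, whatever the daily total is.
        return True
    # Scheduled trigger: skip once the daily budget has been reached.
    return not budget_exceeded


def record_run(daily_spend: float, run_cost: float) -> float:
    """Add a run's cost to the daily total; manual runs are counted too."""
    return daily_spend + run_cost
```

In a workflow step this might be called as `should_run_cycle(os.environ["GITHUB_EVENT_NAME"], exceeded)`.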

Setup

1 — Set the daily budget variable

Go to your GitHub repository → Settings → Secrets and variables → Actions → Variables and create:

Variable name	Value	Notes
`PACE_DAILY_BUDGET`	`15`	USD limit per calendar day

Set to 0 or leave unset for unlimited spend (the default — no behaviour change for existing deployments).

2 — No workflow changes needed

The budget-check step is already part of the pace.yml workflow shipped with the framework. It runs automatically before every Run PACE cycle step.

3 — Monitor spend in run logs

Each run prints a per-model breakdown. When the Anthropic adapter is in use, cache token columns are shown separately with their correct pricing applied:

[PACE] API usage this run:
  claude-haiku-4-5-20251001: 45,230 in + 8,912 out = $0.0718
  claude-sonnet-4-6: 8,120 in + 31,200 out + 112,500 cache_read + 14,200 cache_create = $0.6401
  Run total: $0.7119
  Cache savings this run: $0.3038 vs uncached
[PACE] Daily spend updated: $2.14 (this run: $0.7119)

The in column shows non-cached input tokens only. cache_read tokens are priced at 10% of the input rate; cache_create tokens at 125%. The “Cache savings” line is the difference between what those read tokens would have cost at full price versus the actual cache price — visible evidence of what caching saves per run.
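The pricing rule described above can be written out as follows. This is a sketch under stated assumptions: `run_cost` is a hypothetical helper, and the per-million-token rates passed in are placeholders to be replaced with your provider's current prices.

```python
def run_cost(tokens: dict[str, int], input_rate: float, output_rate: float) -> tuple[float, float]:
    """Return (cost_usd, cache_savings_usd); rates are USD per million tokens."""
    per_token_in = input_rate / 1_000_000
    per_token_out = output_rate / 1_000_000
    cache_read = tokens.get("cache_read", 0)
    cost = (
        tokens.get("in", 0) * per_token_in            # non-cached input
        + tokens.get("out", 0) * per_token_out        # output
        + cache_read * per_token_in * 0.10            # cache reads: 10% of input rate
        + tokens.get("cache_create", 0) * per_token_in * 1.25  # cache writes: 125%
    )
    # Savings: what the read tokens would have cost at full price, minus what they did cost.
    savings = cache_read * per_token_in * (1 - 0.10)
    return cost, savings
```

With an assumed input rate of $3.00 per million tokens, 112,500 cache_read tokens give savings of about $0.304, which is in line with the "Cache savings" figure in the sample log.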

The variables PACE_DAILY_SPEND and PACE_DAILY_SPEND_DATE are maintained automatically — you never need to set them manually.

Reduce per-run cost

Combine the budget cap with the analysis_model setting for maximum savings:

llm:
  provider: anthropic
  model: claude-sonnet-4-6           # FORGE + SCRIBE (code generation)
  analysis_model: claude-haiku-4-5-20251001  # PRIME, GATE, SENTINEL, CONDUIT

Analytical agents (PRIME, GATE, SENTINEL, CONDUIT) are single-call with 4k token responses. Switching them to Haiku reduces per-run cost by ~40–50% with no quality impact on analytical tasks.

Estimated daily cost

Typical costs per successful sprint day (one SHIP attempt), including the full pipeline (PRIME + FORGE + GATE + SENTINEL + CONDUIT + SCRIBE):

Config	Estimated cost	Notes
All Sonnet, 1 attempt	$2–4
All Sonnet, 2 attempts (1 retry)	$4–8	Retry compounds FORGE cost
Sonnet (FORGE) + Haiku (analysis), 1 attempt	$1.50–3.00	Recommended
All Haiku	$0.15–0.50	Not recommended — FORGE quality degrades

What “one story” includes

The cost estimate above covers one complete story delivery: code written, tests run and passing, security reviewed, CI/CD reviewed, and context documents updated. In product terms, that is typically:

A new API endpoint with tests and documentation
A domain model with a repository interface and in-memory implementation
A service class with business logic, validation, and unit tests
A refactored module with updated tests

For comparison: a mid-level engineer takes 2–4 hours to implement, test, and review an equivalent story. At a fully-loaded cost of $75–120/hour, that is $150–480 of engineering time. A PACE story on Sonnet + Haiku costs $1.50–3.00.

Per-agent cost breakdown

For a story using Sonnet for FORGE and Haiku for all other agents:

Agent	Model	Typical cost
PRIME	Haiku	$0.003–0.006
FORGE	Sonnet	$0.80–2.50
GATE	Haiku	$0.003–0.008
SENTINEL	Haiku	$0.004–0.009
CONDUIT	Haiku	$0.003–0.006
SCRIBE	Haiku	$0.005–0.013
Total	Mixed	$0.82–2.54

By the figures above, FORGE accounts for about 98% of the cost. Everything else combined is under $0.05. This is why the analysis_model split pays for itself immediately.
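The Total row can be checked by summing the table's low and high ends. The snippet below is only a worked check of that arithmetic; the `summarize` helper is a hypothetical name.

```python
# (low, high) typical cost in USD per agent, copied from the table above.
AGENT_COSTS = {
    "PRIME": (0.003, 0.006),
    "FORGE": (0.80, 2.50),
    "GATE": (0.003, 0.008),
    "SENTINEL": (0.004, 0.009),
    "CONDUIT": (0.003, 0.006),
    "SCRIBE": (0.005, 0.013),
}


def summarize(costs: dict[str, tuple[float, float]]) -> tuple[float, float, float]:
    """Return (total_low, total_high, non_forge_high), rounded for display."""
    total_low = sum(low for low, _ in costs.values())
    total_high = sum(high for _, high in costs.values())
    # Upper bound on what every agent except FORGE costs combined.
    non_forge_high = total_high - costs["FORGE"][1]
    return round(total_low, 2), round(total_high, 2), round(non_forge_high, 3)
```

`summarize(AGENT_COSTS)` gives totals of 0.82 and 2.54, with at most 0.042 spent outside FORGE.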

Tracking wasted spend on retries

PROGRESS.md breaks down successful spend versus retry waste:

| Story   | Est. Cost | Actual Cost  |
| story-1 | ~$0.45    | $1.82        |
| story-2 | ~$0.80    | $6.59 (2×)  |  ← two attempts
| story-3 | ~$0.55    | $2.11        |

Cost Summary
  Total estimated:              $1.80
  Total actual (incl. retries): $10.52
  Wasted on retries:            $6.59
  FORGE-only (all stories):     $8.90

The (N×) suffix means the story required N pipeline attempts before shipping. "Wasted on retries" is the total spend on attempts that did not ship. If this number is consistently high, the acceptance criteria in plan.yaml may need tightening, or forge.max_iterations may need increasing for complex stories.
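To make the definition concrete, here is a minimal sketch of how such a summary could be derived from per-attempt records. The `Attempt` type, the function names, and the sample figures in the usage note are illustrative assumptions, not the format PACE stores internally.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    story: str
    cost: float    # USD spent on this pipeline attempt
    shipped: bool  # True only for the attempt that shipped


def cost_summary(attempts: list[Attempt]) -> dict[str, float]:
    """Total spend, and the part spent on attempts that did not ship."""
    total = sum(a.cost for a in attempts)
    wasted = sum(a.cost for a in attempts if not a.shipped)
    return {"total_actual": round(total, 2), "wasted_on_retries": round(wasted, 2)}


def attempts_suffix(attempts: list[Attempt], story: str) -> str:
    """Return " (N×)" when a story needed more than one attempt, else ""."""
    n = sum(1 for a in attempts if a.story == story)
    return f" ({n}×)" if n > 1 else ""
```

For example, a story with one failed $2.00 attempt followed by a shipped $2.50 attempt contributes $4.50 to the total and $2.00 to the retry waste, and is labelled (2×).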

The security gate pays for itself

SENTINEL runs on every story for ~$0.004–0.009 (Haiku). Consider what it catches:

A hardcoded API key committed to source costs hours to rotate across all environments and may require a security incident report
A SQL injection vulnerability discovered post-deploy requires an emergency patch, potential breach investigation, and customer notification
Missing authentication on an admin endpoint, if exploited, costs significantly more than the sprint

Catching one SENTINEL HOLD prevents remediation work that commonly runs 4–10× more expensive than the original feature. The mandatory security gate is not overhead — it is the cheapest form of security review available.

Advisory lifecycle as cost control

SENTINEL and CONDUIT advisories are non-blocking by design. This is a deliberate cost decision: blocking on every advisory would cause FORGE to retry stories repeatedly for findings that do not affect shipping safety. Instead:

Advisories accumulate in .pace/advisory_backlog.yaml
On designated clearance stories (every 7th story by default), all advisories must be resolved
Clearance stories typically cost less than a feature story — FORGE is making targeted fixes, not building from scratch

This batching reduces the per-story retry cost of advisory remediation by ~70% compared to blocking on every advisory immediately.
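The batching rule can be sketched as below. This is a simplified illustration with hypothetical function names; it assumes stories are numbered from 1 and that every multiple of the interval is a clearance story, which may not match how PACE designates them.

```python
def is_clearance_story(story_number: int, interval: int = 7) -> bool:
    """True when a 1-based story number falls on a clearance slot."""
    return interval > 0 and story_number % interval == 0


def update_backlog(backlog: list[str], new_advisories: list[str], story_number: int) -> list[str]:
    """Return the advisory backlog as it stands after a story ships."""
    if is_clearance_story(story_number):
        # Clearance story: every open advisory must be resolved before shipping.
        return []
    # Feature story: advisories are recorded but do not block the ship.
    return backlog + new_advisories
```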

A PACE_DAILY_BUDGET of $15–25 comfortably covers 4× daily cron runs on normal days while blocking runaway spend if the pipeline enters a retry loop.

Day rollover

PACE_DAILY_SPEND resets automatically at the start of the first cron run on a new calendar day — no manual intervention needed. The budget-check step compares PACE_DAILY_SPEND_DATE to today’s date (in the configured timezone) and resets the counter if they differ.

The rollover timezone is controlled by the PACE_REPORTER_TIMEZONE repository variable (IANA format, e.g. Asia/Kolkata, America/New_York). It defaults to UTC if unset. Set it to match the reporter.timezone field in your pace.config.yaml so the budget day aligns with your team’s calendar day rather than UTC midnight.
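The effect of the timezone setting on the rollover can be illustrated with Python's standard zoneinfo module. The function names are hypothetical and the real budget-check step may compute the date differently.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def budget_day(now_utc: datetime, tz_name: str = "UTC") -> str:
    """Return the budget day (YYYY-MM-DD) for a timezone-aware timestamp."""
    return now_utc.astimezone(ZoneInfo(tz_name)).date().isoformat()


def needs_reset(stored_date: str, now_utc: datetime, tz_name: str = "UTC") -> bool:
    """True when PACE_DAILY_SPEND_DATE no longer matches the current budget day."""
    return stored_date != budget_day(now_utc, tz_name)
```

At 20:00 UTC on 1 January it is already 2 January in Asia/Kolkata, so a counter dated 2024-01-01 would reset for a team using that timezone but not for one using UTC.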

Troubleshooting

“Budget check step fails with permission error”

The GH_TOKEN in the budget-check step needs repo scope (or the variables:write fine-grained permission) to call gh variable set. GITHUB_TOKEN with the default Actions permission is sufficient for most repositories.

“Spend is not being tracked”

Confirm that python pace/orchestrator.py exits normally. The spend update runs in an atexit handler, and atexit handlers are not called on SIGKILL, so forcibly terminated runs won’t update the counter. The total therefore under-counts rather than over-counts.
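For context, an exit-time update of this kind might look like the sketch below. It is an assumption-laden illustration, not the orchestrator's actual code: `SpendTracker`, `build_set_command`, and `install_exit_hook` are hypothetical names, and only the gh variable set command and the atexit module are real.

```python
import atexit
import subprocess


def build_set_command(name: str, value: str) -> list[str]:
    """Build the gh CLI call that writes a repository variable."""
    return ["gh", "variable", "set", name, "--body", value]


class SpendTracker:
    def __init__(self, starting_total: float, runner=subprocess.run):
        self.total = starting_total
        self._runner = runner  # injectable so the sketch can be tried without gh

    def add(self, cost: float) -> None:
        self.total += cost

    def persist(self) -> None:
        # Write the accumulated total back to PACE_DAILY_SPEND.
        command = build_set_command("PACE_DAILY_SPEND", f"{self.total:.4f}")
        self._runner(command, check=False)


def install_exit_hook(tracker: SpendTracker) -> None:
    # Runs on normal interpreter exit, including sys.exit() and unhandled
    # exceptions. It does not run on SIGKILL or os._exit().
    atexit.register(tracker.persist)
```

Because the handler only fires when the interpreter shuts down on its own, a run killed by the runner or by a timeout leaves the stored total at its previous value.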

“I want to reset the counter manually”

Set PACE_DAILY_SPEND to 0 in the repository variables. The counter will restart from zero on the next run.