Training Data Export

PACE can record every story run as a structured trace and export it as a JSONL fine-tuning dataset. This lets you fine-tune a smaller model on your own sprint history — improving FORGE’s code quality for your specific codebase and conventions.

How it works

When training.enabled: true, the DataExportHook captures every FORGE story run as a StoryTrace:

  • The story card (acceptance criteria, scope)
  • Every tool call FORGE made (read, write, run tests)
  • The final handoff note
  • GATE, SENTINEL, and CONDUIT verdicts
  • A reward score computed from the pipeline outcome

At the end of each run, traces scoring at or above training.min_reward are appended to the export file.
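
The exact trace schema depends on your PACE version; purely as an illustration, a captured trace carries roughly this shape (the field names below are assumptions, not the real schema):

trace = {
    "story_card": {"title": "...", "acceptance_criteria": ["..."]},
    "tool_calls": [
        {"tool": "read_file", "args": {"path": "src/app.py"}},
        {"tool": "run_tests", "args": {}},
    ],
    "handoff_note": "...",
    "verdicts": {"gate": "SHIP", "sentinel": "ADVISORY", "conduit": "SHIP"},
    "reward": 0.87,
}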


Reward score formula

The reward score (0.0–1.0) weights pipeline outcomes:

Component           Weight   How scored
GATE decision       0.40     SHIP = 1.0, HOLD = 0.0
Criteria coverage   0.25     Fraction of acceptance criteria marked PASS
SENTINEL decision   0.20     SHIP = 1.0, ADVISORY = 0.7, HOLD = 0.0
Iteration count     0.15     Penalised for high iteration counts: max(0, 1 - iterations / max_iter)

Traces with reward < training.min_reward are excluded from the export (default threshold: 0.6).
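
As a concrete check of the weighting, here is a minimal sketch of the formula in Python (the function name and argument names are illustrative; the real implementation lives inside PACE):

def reward_score(gate, criteria_fraction, sentinel, iterations, max_iter):
    # GATE decision: SHIP = 1.0, HOLD = 0.0
    gate_score = 1.0 if gate == "SHIP" else 0.0
    # SENTINEL decision: SHIP = 1.0, ADVISORY = 0.7, HOLD = 0.0
    sentinel_score = {"SHIP": 1.0, "ADVISORY": 0.7, "HOLD": 0.0}[sentinel]
    # Iteration penalty, floored at zero
    iteration_score = max(0.0, 1.0 - iterations / max_iter)
    return (0.40 * gate_score
            + 0.25 * criteria_fraction
            + 0.20 * sentinel_score
            + 0.15 * iteration_score)

# GATE SHIP, 4/5 criteria PASS, SENTINEL ADVISORY, 2 of 5 iterations:
print(round(reward_score("SHIP", 0.8, "ADVISORY", 2, 5), 2))  # 0.83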


Enable training data collection

training:
  enabled: true
  export_dir: ".pace/training"
  format: sft
  min_reward: 0.6

The export directory is created automatically. Add it to .gitignore if you do not want training data committed to source control.
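
"Created automatically" amounts to something like the following (a sketch, not PACE's actual code; the path matches the export_dir above):

from pathlib import Path

# No-op if the directory already exists
Path(".pace/training").mkdir(parents=True, exist_ok=True)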


Export formats

sft — Supervised fine-tuning

Produces sft_export.jsonl. Each line is a JSON object with messages in the standard chat format:

{
  "messages": [
    {"role": "system", "content": "You are FORGE ..."},
    {"role": "user", "content": "Story card: ..."},
    {"role": "assistant", "content": "<tool_call>read_file ...</tool_call>"},
    ...
    {"role": "assistant", "content": "<tool_call>complete_handoff ...</tool_call>"}
  ]
}

Use this format for standard instruction fine-tuning with OpenAI, Together AI, Fireworks, or Axolotl.
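
Before uploading, it is worth validating the file shape. A minimal check in plain Python, assuming only the record layout shown above:

import json, pathlib

path = pathlib.Path(".pace/training/sft_export.jsonl")
for i, line in enumerate(path.read_text().splitlines(), 1):
    if not line.strip():
        continue
    messages = json.loads(line)["messages"]
    # Each record should open with the system prompt and close on an assistant turn
    assert messages[0]["role"] == "system", f"line {i}: no system message"
    assert messages[-1]["role"] == "assistant", f"line {i}: does not end on an assistant turn"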

reward — Reward modelling

Produces reward_export.jsonl. Each line contains the prompt, completion, and numeric reward score:

{
  "prompt": "Story card: ...",
  "completion": "<full FORGE turn>",
  "reward": 0.87
}

Use this format for RLHF/DPO pipelines (TRL, OpenRLHF).
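
If your pipeline expects preference pairs (DPO) rather than scalar rewards, one approach is to pair high- and low-reward completions that share a prompt. A sketch under that assumption (this is not a built-in PACE export; the 0.2 reward gap is an arbitrary choice):

import json, pathlib
from collections import defaultdict

by_prompt = defaultdict(list)
for line in pathlib.Path(".pace/training/reward_export.jsonl").read_text().splitlines():
    if line.strip():
        record = json.loads(line)
        by_prompt[record["prompt"]].append(record)

pairs = []
for prompt, records in by_prompt.items():
    records.sort(key=lambda r: r["reward"], reverse=True)
    # Pair the best completion against the worst when rewards differ meaningfully
    if len(records) >= 2 and records[0]["reward"] - records[-1]["reward"] >= 0.2:
        pairs.append({"prompt": prompt,
                      "chosen": records[0]["completion"],
                      "rejected": records[-1]["completion"]})

print(f"{len(pairs)} preference pairs")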


Exporting manually

To export all collected traces at any time:

cd pace
python -c "
from training.exporter import export_traces
export_traces(format='sft', min_reward=0.6)
"

Output is written to training.export_dir (default: .pace/training/).
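
To produce both formats in one pass (assuming export_traces accepts the same keyword arguments shown above):

cd pace
python -c "
from training.exporter import export_traces
for fmt in ('sft', 'reward'):
    export_traces(format=fmt, min_reward=0.6)
"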


Filtering and inspecting traces

# Count traces by reward band
python -c "
import json, pathlib
lines = pathlib.Path('.pace/training/reward_export.jsonl').read_text().splitlines()
traces = [json.loads(l) for l in lines if l.strip()]
print(f'Total: {len(traces)}')
high = [t for t in traces if t.get('reward', 0) >= 0.8]
print(f'High quality (>=0.8): {len(high)}')
"

Privacy considerations

Training traces contain your source code, acceptance criteria, and test output. Before exporting:

  • Confirm that traces do not contain credentials, PII, or proprietary data (see the scan sketch after this list).
  • Apply training.min_reward to exclude failed runs that may contain error messages with sensitive stack traces.
  • Review the export directory before uploading to any external fine-tuning service.
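
A quick pass for obvious secrets can catch the worst cases before upload. The patterns below are a starting point, not a complete scanner; extend them for your stack:

import json, pathlib, re

# Rough patterns for common credential shapes
PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key IDs
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),    # PEM private keys
    re.compile(r"(?i)(password|api[_-]?key|secret)\s*[:=]\s*\S+"),
]

for path in pathlib.Path(".pace/training").glob("*.jsonl"):
    for i, line in enumerate(path.read_text().splitlines(), 1):
        for pattern in PATTERNS:
            if pattern.search(line):
                print(f"{path.name}:{i}: possible secret ({pattern.pattern})")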