
Switch LLM Provider

PACE ships with two LLM adapters:

  • anthropic — direct Anthropic SDK integration (default)
  • litellm — routes to 100+ providers via LiteLLM
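
Both adapters expose the same completion interface to the rest of PACE, so switching is purely a config change. A minimal sketch of what such an adapter abstraction might look like (class and function names here are illustrative, not PACE's actual API):

```python
from typing import Protocol


class LLMAdapter(Protocol):
    """Common interface both adapters could satisfy (hypothetical names)."""

    def complete(self, prompt: str, model: str) -> str: ...


class AnthropicAdapter:
    """Would wrap the Anthropic SDK directly; stubbed out here."""

    def complete(self, prompt: str, model: str) -> str:
        return f"[anthropic:{model}] response to {prompt!r}"


class LiteLLMAdapter:
    """Would route through LiteLLM to 100+ providers; stubbed out here."""

    def complete(self, prompt: str, model: str) -> str:
        return f"[litellm:{model}] response to {prompt!r}"


def get_adapter(provider: str) -> LLMAdapter:
    # Mirrors the `provider:` key in pace.config.yaml.
    adapters = {"anthropic": AnthropicAdapter, "litellm": LiteLLMAdapter}
    return adapters[provider]()
```

The rest of this page only ever changes the `provider:` and `model:` keys; everything downstream of the adapter stays the same.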

Switching to a different Anthropic model

pace/pace.config.yaml:

llm:
  provider: anthropic
  model: claude-opus-4-6  # or claude-haiku-4-5-20251001

Set your key:

export ANTHROPIC_API_KEY="sk-ant-..."

Switching to OpenAI (GPT-4o)

llm:
  provider: litellm
  model: openai/gpt-4o

export LLM_API_KEY="sk-..."
pip install litellm

Switching to Google Gemini

llm:
  provider: litellm
  model: gemini/gemini-2.0-flash

export LLM_API_KEY="AIza..."
pip install litellm

Switching to AWS Bedrock

llm:
  provider: litellm
  model: bedrock/anthropic.claude-sonnet-4-6

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION_NAME="us-east-1"
pip install litellm boto3

No LLM_API_KEY needed — LiteLLM uses your AWS credentials directly.

Switching to Azure OpenAI

llm:
  provider: litellm
  model: azure/gpt-4o

export LLM_API_KEY="..."  # Azure API key
export AZURE_API_BASE="https://my-deployment.openai.azure.com"
export AZURE_API_VERSION="2024-02-01"
pip install litellm

Switching to local Ollama

llm:
  provider: litellm
  model: ollama/llama3.1
  base_url: "http://localhost:11434"

No API key needed. Start Ollama before running PACE:

ollama serve
ollama pull llama3.1
pip install litellm

Switching to Groq

llm:
  provider: litellm
  model: groq/llama-3.1-70b-versatile

export LLM_API_KEY="gsk_..."
pip install litellm

Switching to Mistral

llm:
  provider: litellm
  model: mistral/mistral-large-latest

export LLM_API_KEY="..."
pip install litellm

Provider comparison

| Provider                            | Best for              | Tool calling | Speed     |
| ----------------------------------- | --------------------- | ------------ | --------- |
| anthropic/claude-sonnet-4-6         | Best quality + code   | Native       | Medium    |
| anthropic/claude-opus-4-6           | Complex reasoning     | Native       | Slow      |
| anthropic/claude-haiku-4-5-20251001 | Speed + cost          | Native       | Fast      |
| openai/gpt-4o                       | GPT ecosystem         | Native       | Fast      |
| gemini/gemini-2.0-flash             | Cost-efficient        | Native       | Fast      |
| ollama/llama3.1                     | Fully local / private | Yes          | Varies    |
| groq/llama-3.1-70b-versatile        | Fastest inference     | Yes          | Very fast |

Model quality guidance

PACE agents spend most tokens on FORGE (implementation) and SCRIBE (documentation). These agents run a multi-turn tool loop and benefit most from a capable model. GATE, SENTINEL, and CONDUIT make single-pass structured YAML calls and are less sensitive to model quality.
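
The difference can be illustrated with a toy simulation (this is not PACE code; the turn dictionaries stand in for real LLM responses): a multi-turn agent keeps calling the model until it stops requesting tools, while a single-pass agent makes exactly one call, so errors compound only in the former.

```python
def run_tool_loop(model_turns):
    """Toy multi-turn loop: keep going while the model requests tools.

    `model_turns` stands in for successive model responses; a real
    FORGE/SCRIBE loop would call the LLM and execute each requested
    tool, feeding the result back into the conversation.
    """
    calls = 0
    for turn in model_turns:
        calls += 1
        if turn["type"] == "final":
            return turn["text"], calls
        # turn["type"] == "tool_call": execute the tool here (omitted)
    raise RuntimeError("model never produced a final answer")


def run_single_pass(response):
    """Toy single-pass call: one structured response, no loop (like GATE)."""
    return response, 1


# A weaker model that fumbles one tool call pays for it on every
# subsequent turn, which is why the looping agents benefit most
# from a capable model.
text, calls = run_tool_loop([
    {"type": "tool_call", "name": "read_file"},
    {"type": "tool_call", "name": "write_file"},
    {"type": "final", "text": "done"},
])
```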

For cost-sensitive setups, you can use a powerful model for FORGE/SCRIBE and a cheaper model for the review agents by running them with different configs — though this requires code changes to the orchestrator.
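
A sketch of what such a split might look like once the orchestrator is modified (the agent names come from this page, but the routing table and function are hypothetical; PACE does not ship this):

```python
# Hypothetical per-agent model routing; not part of PACE's shipped config.
AGENT_MODELS = {
    "FORGE":    "anthropic/claude-sonnet-4-6",          # multi-turn tool loop
    "SCRIBE":   "anthropic/claude-sonnet-4-6",          # multi-turn tool loop
    "GATE":     "anthropic/claude-haiku-4-5-20251001",  # single-pass YAML
    "SENTINEL": "anthropic/claude-haiku-4-5-20251001",  # single-pass YAML
    "CONDUIT":  "anthropic/claude-haiku-4-5-20251001",  # single-pass YAML
}


def model_for(agent: str, default: str = "anthropic/claude-sonnet-4-6") -> str:
    """Pick a model per agent, falling back to a capable default."""
    return AGENT_MODELS.get(agent, default)
```

The cheap models land on the single-pass review agents, where quality matters least per the guidance above.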