Providers — ReplayCI
ReplayCI supports multiple LLM providers through a unified interface. Write your contracts once, run them against any provider — same YAML, same assertions, same golden fixtures.
Supported providers
| Provider | Flag value | API |
|---|---|---|
| OpenAI | openai | Chat Completions (/v1/chat/completions) |
| Anthropic | anthropic | Messages (/v1/messages) |
| Recorded | recorded | Offline — reads from fixture files |
Set the provider in .replayci.yml or via CLI flag:
# .replayci.yml
provider: openai
model: gpt-4o-mini
# Or override with flags
npx replayci --provider anthropic --model claude-sonnet-4-6
How provider abstraction works
Each provider adapter translates between the ReplayCI contract format and the provider's native API. This happens transparently — your contracts never reference provider-specific fields.
What the adapters handle for you:
- Tool definitions — OpenAI uses `{ type: "function", function: { parameters } }`; Anthropic uses `{ input_schema }`. You write tools once; the adapter translates.
- System messages — OpenAI accepts `role: "system"` in the messages array; Anthropic expects a separate `system` field. The adapter extracts and routes it correctly.
- Tool call responses — OpenAI returns `tool_calls[].function.arguments` as a JSON string; Anthropic returns `tool_use` content blocks with `input` as an object. Both get normalized to `{ id, name, arguments }` before your assertions run, with `arguments` stored as a JSON string in the normalized response shape. When you use a nested path like `$.tool_calls[0].arguments.location`, ReplayCI auto-parses that string during path traversal.
- Tool choice — `"auto"`, `"required"`, `"none"`, or a specific tool name. Each maps to the provider's native format.
The result: Your contract assertions like $.tool_calls[0].name work identically regardless of which provider executed the request.
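The normalization described above can be sketched in a few lines. Everything here is illustrative — the type and function names are hypothetical, not ReplayCI's actual source:

```typescript
// Illustrative sketch of tool-call normalization — names are hypothetical.
type NormalizedToolCall = { id: string; name: string; arguments: string };

// OpenAI already delivers arguments as a JSON string — pass it through.
function fromOpenAI(call: {
  id: string;
  function: { name: string; arguments: string };
}): NormalizedToolCall {
  return { id: call.id, name: call.function.name, arguments: call.function.arguments };
}

// Anthropic's tool_use blocks carry `input` as an object — serialize it
// so both providers end up with the same string-valued `arguments` field.
function fromAnthropic(block: {
  id: string;
  name: string;
  input: Record<string, unknown>;
}): NormalizedToolCall {
  return { id: block.id, name: block.name, arguments: JSON.stringify(block.input) };
}
```

Either way, an assertion path like `$.tool_calls[0].name` sees the same shape.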
API key setup
ReplayCI uses a single environment variable for all providers:
export REPLAYCI_PROVIDER_KEY="sk-..."
This key is passed to whichever provider you select. For OpenAI, it's your OpenAI API key. For Anthropic, it's your Anthropic API key.
The recorded provider doesn't need an API key.
Model resolution
ReplayCI resolves model names through a registry before calling the provider API. This gives you short aliases, family-based request profiles, and validation — without changing how you write contracts.
How it works
Resolution is a 3-step process:
1. Alias lookup — short names like `5.2` or `opus-4` expand to full model IDs (`gpt-5.2`, `claude-opus-4-20250514`).
2. Family matching — the resolved ID is tested against family prefix patterns (e.g., `^gpt-5`, `^claude-`). The first matching family provides the request profile.
3. Passthrough — if no family matches, the model ID is sent as-is with provider defaults. A warning is emitted to stderr.
The resolved raw model ID is always what gets sent to the provider API and recorded in baseline keys. Aliases are a convenience layer — they never appear in run artifacts.
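The three steps reduce to a small lookup-then-match routine. This is a simplified sketch — the alias table and family list are abbreviated, and the function shape is assumed, not taken from ReplayCI's internals:

```typescript
// Abbreviated registry — illustrative subset of the tables in this page.
const ALIASES: Record<string, string> = {
  "4o-mini": "gpt-4o-mini",
  "opus-4": "claude-opus-4-20250514",
};
const FAMILIES = [
  { pattern: /^gpt-5/, tokenField: "max_completion_tokens" },
  { pattern: /^gpt-4/, tokenField: "max_tokens" },
  { pattern: /^claude-/, tokenField: "max_tokens" },
];

function resolveModel(name: string) {
  const normalized = name.trim().toLowerCase();        // aliases are case-insensitive, trimmed
  const id = ALIASES[normalized] ?? normalized;        // 1. alias lookup
  const family = FAMILIES.find((f) => f.pattern.test(id)); // 2. family matching
  if (!family) {
    console.warn(`⚠ Model "${id}" not found in registry`); // 3. passthrough + stderr warning
  }
  return { id, tokenField: family?.tokenField ?? "max_tokens" };
}
```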
Aliases
Aliases are short names that expand to full model IDs:
| Alias | Resolves to | Provider |
|---|---|---|
| opus-4 | claude-opus-4-20250514 | Anthropic |
| sonnet-4.6 | claude-sonnet-4-6 | Anthropic |
| sonnet-4.5 | claude-sonnet-4-5-20250929 | Anthropic |
| haiku-4.5 | claude-haiku-4-5-20251001 | Anthropic |
| 5.2 | gpt-5.2 | OpenAI |
| 5 | gpt-5 | OpenAI |
| 5-mini | gpt-5-mini | OpenAI |
| 4.1 | gpt-4.1 | OpenAI |
| 4.1-mini | gpt-4.1-mini | OpenAI |
| 4.1-nano | gpt-4.1-nano | OpenAI |
| 4o | gpt-4o | OpenAI |
| 4o-mini | gpt-4o-mini | OpenAI |
Aliases are case-insensitive and whitespace-trimmed. You can use them anywhere a model ID is accepted:
npx replayci --provider openai --model 4o-mini
npx replayci --provider anthropic --model opus-4
Families and request profiles
Each model family has a regex pattern and a RequestProfile that controls how the provider adapter builds API requests:
| Family | Pattern | Token field | Temperature | API format |
|---|---|---|---|---|
| GPT-5 | ^gpt-5 | max_completion_tokens | yes | chat_completions |
| GPT-4 | ^gpt-4 | max_tokens | yes | chat_completions |
| O-series | ^o[1-9] | max_completion_tokens | no | chat_completions |
| Claude | ^claude- | max_tokens | yes | messages |
The token_field determines whether the adapter sends max_tokens or max_completion_tokens in the request body. The supports_temperature flag controls whether a temperature parameter is included — reasoning models (O-series) don't support it. The required_headers field adds provider-specific headers (e.g., anthropic-version: 2023-06-01 for Claude).
This means you don't need to remember which parameter each model family expects — the resolver handles it based on the registry.
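In sketch form, the profile-driven request building amounts to picking the right token field and conditionally including temperature. The `RequestProfile` shape here is assumed for illustration:

```typescript
// Hypothetical RequestProfile shape — illustrative, not ReplayCI's type.
interface RequestProfile {
  tokenField: "max_tokens" | "max_completion_tokens";
  supportsTemperature: boolean;
}

// Build the token-limit and temperature portion of a request body.
function buildRequestParams(
  profile: RequestProfile,
  maxTokens: number,
  temperature: number,
): Record<string, unknown> {
  const params: Record<string, unknown> = { [profile.tokenField]: maxTokens };
  if (profile.supportsTemperature) {
    params.temperature = temperature; // omitted entirely for reasoning models
  }
  return params;
}
```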
Passthrough behavior
Models not matching any family pattern still work — they pass through to the provider with sensible defaults:
- OpenAI default: `max_tokens`, temperature enabled, `chat_completions` format
- Anthropic default: `max_tokens`, temperature enabled, `messages` format with `anthropic-version` header
A warning is printed to stderr:
⚠ Model "my-custom-model" not found in registry; using openai defaults
Use --strict-model to turn this warning into a hard error:
npx replayci --provider openai --model unknown-model --strict-model
# Error: Model "unknown-model" has no matching family for provider openai
Inspecting resolution
Use --resolve to see how a model name resolves without running contracts:
npx replayci --provider openai --model 5.2 --resolve
Use replayci models --provider openai to list all registered families and aliases for a provider. See the CLI Reference for full details.
The recorded provider
The recorded provider is the key to fast, free, deterministic testing. Instead of calling a live API, it reads responses from recording files stored alongside your fixtures.
Why use it
- No API costs — recorded responses are free to replay
- Deterministic — same input always produces the same output
- Fast — no network latency, runs in milliseconds
- Offline — works without internet access
- CI-friendly — no API keys needed in your CI environment for recorded tests
How it works
1. You run your contracts against a live provider with `--capture-recordings`
2. ReplayCI saves each response as a `.recording.json` file
3. Later, you run with `--provider recorded` and ReplayCI reads those files instead of calling the API
# Step 1: Capture responses from a live provider
npx replayci --provider openai --model gpt-4o-mini --capture-recordings
# Step 2: Run offline using captured responses
npx replayci --provider recorded
Recording file location
Recording files live in a recordings/ directory next to your golden/ directory:
packs/my-pack/
golden/
tool_call.success.json # your fixture
recordings/
tool_call.success.recording.json # captured response
The naming convention is automatic: foo.json → foo.recording.json in the recordings/ directory.
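The convention maps cleanly to a small path helper. This is a sketch of the layout described above — the helper name is hypothetical:

```typescript
import * as path from "node:path";

// Map a golden fixture path to its sibling recording path:
// packs/p/golden/foo.json -> packs/p/recordings/foo.recording.json
function recordingPathFor(fixturePath: string): string {
  const goldenDir = path.dirname(fixturePath);            // .../golden
  const base = path.basename(fixturePath, ".json");       // foo
  return path.join(path.dirname(goldenDir), "recordings", `${base}.recording.json`);
}
```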
What's in a recording file
A recording captures everything needed to replay a response:
{
"schema_version": "1.0",
"boundary": {
"version": "1.0",
"provider": "openai",
"model_id": "gpt-4o-mini",
"tool_schema_hash": "a3f82c91b4d7e6f0",
"tool_choice_mode": "auto",
"system_prompt_hash": "7b2d4f1e8a9c3d56",
"messages_hash": "c5e8f2a1d6b94370"
},
"response": {
"success": true,
"tool_calls": [
{
"id": "call_abc123",
"name": "get_weather",
"arguments": "{\"location\": \"San Francisco, CA\"}"
}
],
"content": null,
"error": null
},
"metadata": {
"recorded_at": "2026-03-01T10:00:00Z",
"original_latency_ms": 150,
"model_version": "gpt-4o-mini-2024-07-18",
"usage": {
"prompt_tokens": 95,
"completion_tokens": 22,
"total_tokens": 117
}
}
}
Boundary validation
When the recorded provider loads a recording, it validates the boundary — a set of hashes that ensure the recording matches the current fixture:
| Field | What it checks |
|---|---|
| provider | Must match the original provider |
| tool_schema_hash | SHA-256 of your tool definitions (sorted, canonical) |
| messages_hash | SHA-256 of your messages array |
| tool_choice_mode | Must match the original tool choice setting |
If your tool definitions or messages change after capturing a recording, the hashes won't match. ReplayCI flags this as a NonReproducible result with a clear reason code:
- SCHEMA_DRIFT — tool definitions changed since the recording was captured
- NON_DETERMINISTIC_INPUT — messages or tool choice changed
When this happens, re-capture your recordings:
npx replayci --provider openai --model gpt-4o-mini --capture-recordings
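The boundary hashes behave like a digest over canonicalized JSON: key order must not matter, but any value change must. Here is an illustrative sketch — the key-sorting scheme and 16-hex-character truncation are assumptions inferred from the hash lengths in the example recording, not ReplayCI's documented algorithm:

```typescript
import { createHash } from "node:crypto";

// Serialize a value with object keys sorted, so logically equal
// inputs always produce the same string.
function canonical(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  if (value && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => a.localeCompare(b))
      .map(([k, v]) => `${JSON.stringify(k)}:${canonical(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// SHA-256 of the canonical form, truncated to 16 hex chars for the boundary.
function boundaryHash(value: unknown): string {
  return createHash("sha256").update(canonical(value)).digest("hex").slice(0, 16);
}
```

Because the hash covers the canonical form, reordering keys in a tool definition leaves it stable, while changing a parameter name or message produces a mismatch and a NonReproducible result.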
Switching providers
Same contracts, different provider — just change the flag:
# Test against OpenAI
npx replayci --provider openai --model gpt-4o-mini
# Same contracts against Anthropic
npx replayci --provider anthropic --model claude-sonnet-4-6
# Offline with recorded fixtures
npx replayci --provider recorded
You can capture recordings from each provider separately and test them offline:
# Capture OpenAI responses
npx replayci --provider openai --model gpt-4o-mini --capture-recordings \
--pack packs/openai-v0.1
# Capture Anthropic responses
npx replayci --provider anthropic --model claude-sonnet-4-6 --capture-recordings \
--pack packs/anthropic-v0.1
Shadow mode
Shadow mode lets you compare two providers side-by-side without affecting your primary results:
npx replayci --provider openai --model gpt-4o-mini \
--shadow-capture \
--shadow-provider anthropic --shadow-model claude-sonnet-4-6
The primary provider (OpenAI) determines pass/fail. The shadow provider (Anthropic) runs in parallel for comparison only — its results are captured but never affect your CI gate.
NeverNormalize
Each pack contains a NeverNormalize.json file that protects semantically important fields from being normalized away during baseline comparison:
{
"schema_version": "1.0",
"pack_id": "my-pack",
"fields": [
"tool_calls[].name",
"tool_calls[].arguments",
"content",
"success"
]
}
These four fields should always be listed — they're the core semantic content of every response. Add any additional fields that are important to your specific use case (e.g., specific argument paths).
Do not list volatile fields like timestamps or request IDs — those should be normalized for stable fingerprints.
Every pack directory must contain this file, even if you're just starting out. The starter pack includes one with sensible defaults.
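To make the protection concrete, here is a hypothetical sketch of how a normalizer might test a concrete field path against the `fields` patterns, treating `[]` as "any array index". The matching semantics are an assumption for illustration, not ReplayCI's specified behavior:

```typescript
// Return true if a concrete path (e.g. "tool_calls[0].name") is covered
// by any NeverNormalize pattern (e.g. "tool_calls[].name").
function isProtected(fieldPath: string, patterns: string[]): boolean {
  return patterns.some((p) => {
    // Escape path punctuation, then let a literal "[]" match any numeric index.
    const escaped = p.replace(/[.[\]]/g, (c) => "\\" + c);
    const regex = new RegExp("^" + escaped.replace(/\\\[\\\]/g, "\\[\\d+\\]") + "$");
    return regex.test(fieldPath);
  });
}
```

Fields that match are kept verbatim for comparison; everything else is eligible for normalization.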
Next steps
- Write contracts: See Writing Tests for the full YAML format
- Set up CI: See CI Integration for GitHub Actions and GitLab CI examples
- Debug issues: See Troubleshooting for common problems and solutions