
# Providers

ReplayCI supports multiple LLM providers through a unified interface. Write your contracts once, run them against any provider — same YAML, same assertions, same golden fixtures.


## Supported providers

| Provider | Flag value | API |
|----------|------------|-----|
| OpenAI | `openai` | Chat Completions (`/v1/chat/completions`) |
| Anthropic | `anthropic` | Messages (`/v1/messages`) |
| Recorded | `recorded` | Offline — reads from fixture files |

Set the provider in `.replayci.yml` or via CLI flag:

```yaml
# .replayci.yml
provider: openai
model: gpt-4o-mini
```

```sh
# Or override with flags
npx replayci --provider anthropic --model claude-sonnet-4-6
```

## How provider abstraction works

Each provider adapter translates between the ReplayCI contract format and the provider's native API. This happens transparently — your contracts never reference provider-specific fields.

What the adapters handle for you:

- **Tool definitions** — OpenAI uses `{ type: "function", function: { parameters } }`, Anthropic uses `{ input_schema }`. You write tools once; the adapter translates.
- **System messages** — OpenAI accepts `role: "system"` in the messages array. Anthropic expects a separate `system` field. The adapter extracts and routes it correctly.
- **Tool call responses** — OpenAI returns `tool_calls[].function.arguments` as a JSON string. Anthropic returns `tool_use` content blocks with `input` as an object. Both are normalized to `{ id, name, arguments }` before your assertions run, with `arguments` stored as a JSON string. When you use nested paths like `$.tool_calls[0].arguments.location`, ReplayCI auto-parses that string during path traversal.
- **Tool choice** — `"auto"`, `"required"`, `"none"`, or a specific tool name. Each maps to the provider's native format.

The result: contract assertions like `$.tool_calls[0].name` work identically regardless of which provider executed the request.
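As a rough illustration of the normalization step, here is a sketch of how the two tool-call shapes could be mapped onto the common `{ id, name, arguments }` form. The function and type names are hypothetical, not ReplayCI internals; only the input shapes follow the public OpenAI and Anthropic response formats.

```typescript
// Hypothetical sketch of adapter normalization (names are illustrative).
interface NormalizedToolCall {
  id: string;
  name: string;
  arguments: string; // always a JSON string in the normalized shape
}

// OpenAI: tool_calls[].function.arguments is already a JSON string.
function fromOpenAI(call: {
  id: string;
  function: { name: string; arguments: string };
}): NormalizedToolCall {
  return { id: call.id, name: call.function.name, arguments: call.function.arguments };
}

// Anthropic: tool_use blocks carry `input` as an object, so it is serialized.
function fromAnthropic(block: {
  id: string;
  name: string;
  input: Record<string, unknown>;
}): NormalizedToolCall {
  return { id: block.id, name: block.name, arguments: JSON.stringify(block.input) };
}
```

Assertions like `$.tool_calls[0].name` then only ever see the normalized shape.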


## API key setup

ReplayCI uses a single environment variable for all providers:

```sh
export REPLAYCI_PROVIDER_KEY="sk-..."
```

This key is passed to whichever provider you select. For OpenAI, it's your OpenAI API key. For Anthropic, it's your Anthropic API key.

The `recorded` provider doesn't need an API key.


## Model resolution

ReplayCI resolves model names through a registry before calling the provider API. This gives you short aliases, family-based request profiles, and validation — without changing how you write contracts.

### How it works

Resolution is a 3-step process:

1. **Alias lookup** — short names like `5.2` or `opus-4` expand to full model IDs (`gpt-5.2`, `claude-opus-4-20250514`)
2. **Family matching** — the resolved ID is tested against family prefix patterns (e.g., `^gpt-5`, `^claude-`). The first matching family provides the request profile.
3. **Passthrough** — if no family matches, the model ID is sent as-is with provider defaults. A warning is emitted to stderr.

The resolved raw model ID is always what gets sent to the provider API and recorded in baseline keys. Aliases are a convenience layer — they never appear in run artifacts.
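The three steps can be sketched as a toy resolver. The alias and family entries below are a small subset of the registry tables that follow; the function names and data structures are illustrative, not ReplayCI's actual implementation.

```typescript
// Toy model resolver sketch (registry subset; names are illustrative).
const ALIASES: Record<string, string> = {
  "4o-mini": "gpt-4o-mini",
  "opus-4": "claude-opus-4-20250514",
};

const FAMILIES = [
  { name: "GPT-5", pattern: /^gpt-5/ },
  { name: "GPT-4", pattern: /^gpt-4/ },
  { name: "Claude", pattern: /^claude-/ },
];

function resolveModel(input: string): { modelId: string; family: string | null } {
  // Step 1: alias lookup (case-insensitive, whitespace-trimmed).
  const key = input.trim().toLowerCase();
  const modelId = ALIASES[key] ?? key;
  // Step 2: family matching; the first matching prefix pattern wins.
  const family = FAMILIES.find((f) => f.pattern.test(modelId));
  // Step 3: passthrough; no family means provider defaults apply.
  return { modelId, family: family?.name ?? null };
}
```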

### Aliases

Aliases are short names that expand to full model IDs:

| Alias | Resolves to | Provider |
|-------|-------------|----------|
| `opus-4` | `claude-opus-4-20250514` | Anthropic |
| `sonnet-4.6` | `claude-sonnet-4-6` | Anthropic |
| `sonnet-4.5` | `claude-sonnet-4-5-20250929` | Anthropic |
| `haiku-4.5` | `claude-haiku-4-5-20251001` | Anthropic |
| `5.2` | `gpt-5.2` | OpenAI |
| `5` | `gpt-5` | OpenAI |
| `5-mini` | `gpt-5-mini` | OpenAI |
| `4.1` | `gpt-4.1` | OpenAI |
| `4.1-mini` | `gpt-4.1-mini` | OpenAI |
| `4.1-nano` | `gpt-4.1-nano` | OpenAI |
| `4o` | `gpt-4o` | OpenAI |
| `4o-mini` | `gpt-4o-mini` | OpenAI |

Aliases are case-insensitive and whitespace-trimmed. You can use them anywhere a model ID is accepted:

```sh
npx replayci --provider openai --model 4o-mini
npx replayci --provider anthropic --model opus-4
```

### Families and request profiles

Each model family has a regex pattern and a RequestProfile that controls how the provider adapter builds API requests:

| Family | Pattern | Token field | Temperature | API format |
|--------|---------|-------------|-------------|------------|
| GPT-5 | `^gpt-5` | `max_completion_tokens` | yes | `chat_completions` |
| GPT-4 | `^gpt-4` | `max_tokens` | yes | `chat_completions` |
| O-series | `^o[1-9]` | `max_completion_tokens` | no | `chat_completions` |
| Claude | `^claude-` | `max_tokens` | yes | `messages` |

The `token_field` determines whether the adapter sends `max_tokens` or `max_completion_tokens` in the request body. The `supports_temperature` flag controls whether a `temperature` parameter is included — reasoning models (O-series) don't support it. The `required_headers` field adds provider-specific headers (e.g., `anthropic-version: 2023-06-01` for Claude).

This means you don't need to remember which parameter each model family expects — the resolver handles it based on the registry.
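A minimal sketch of how a profile could drive request construction. The field and function names here are assumptions for illustration, not ReplayCI's actual types.

```typescript
// Sketch: a request profile selects the token field and gates temperature.
interface RequestProfile {
  tokenField: "max_tokens" | "max_completion_tokens";
  supportsTemperature: boolean;
}

function buildBody(
  profile: RequestProfile,
  opts: { model: string; maxTokens: number; temperature?: number }
): Record<string, unknown> {
  const body: Record<string, unknown> = { model: opts.model };
  // max_tokens vs max_completion_tokens, per the family table above.
  body[profile.tokenField] = opts.maxTokens;
  if (profile.supportsTemperature && opts.temperature !== undefined) {
    body.temperature = opts.temperature; // omitted for O-series reasoning models
  }
  return body;
}
```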

### Passthrough behavior

Models not matching any family pattern still work — they pass through to the provider with sensible defaults:

- **OpenAI default** — `max_tokens`, temperature enabled, `chat_completions` format
- **Anthropic default** — `max_tokens`, temperature enabled, `messages` format with the `anthropic-version` header

A warning is printed to stderr:

```
⚠ Model "my-custom-model" not found in registry; using openai defaults
```

Use `--strict-model` to turn this warning into a hard error:

```sh
npx replayci --provider openai --model unknown-model --strict-model
# Error: Model "unknown-model" has no matching family for provider openai
```

### Inspecting resolution

Use `--resolve` to see how a model name resolves without running contracts:

```sh
npx replayci --provider openai --model 5.2 --resolve
```

Use `replayci models --provider openai` to list all registered families and aliases for a provider. See the CLI Reference for full details.


## The recorded provider

The `recorded` provider is the key to fast, free, deterministic testing. Instead of calling a live API, it reads responses from recording files stored alongside your fixtures.

### Why use it

- **No API costs** — recorded responses are free to replay
- **Deterministic** — the same input always produces the same output
- **Fast** — no network latency; runs in milliseconds
- **Offline** — works without internet access
- **CI-friendly** — no API keys needed in your CI environment for recorded tests

### How it works

1. You run your contracts against a live provider with `--capture-recordings`
2. ReplayCI saves each response as a `.recording.json` file
3. Later, you run with `--provider recorded` and ReplayCI reads those files instead of calling the API

```sh
# Step 1: Capture responses from a live provider
npx replayci --provider openai --model gpt-4o-mini --capture-recordings

# Step 2: Run offline using captured responses
npx replayci --provider recorded
```

### Recording file location

Recording files live in a `recordings/` directory next to your `golden/` directory:

```
packs/my-pack/
  golden/
    tool_call.success.json              # your fixture
  recordings/
    tool_call.success.recording.json    # captured response
```

The naming convention is automatic: `foo.json` → `foo.recording.json` in the `recordings/` directory.
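The mapping can be expressed as a small path helper. This is a hypothetical sketch of the convention, not a ReplayCI API:

```typescript
// Sketch: map a golden fixture path to its recording path
// (golden/foo.json -> recordings/foo.recording.json).
function recordingPath(fixturePath: string): string {
  return fixturePath
    .replace(/(^|\/)golden\//, "$1recordings/")
    .replace(/\.json$/, ".recording.json");
}
```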

### What's in a recording file

A recording captures everything needed to replay a response:

```json
{
  "schema_version": "1.0",
  "boundary": {
    "version": "1.0",
    "provider": "openai",
    "model_id": "gpt-4o-mini",
    "tool_schema_hash": "a3f82c91b4d7e6f0",
    "tool_choice_mode": "auto",
    "system_prompt_hash": "7b2d4f1e8a9c3d56",
    "messages_hash": "c5e8f2a1d6b94370"
  },
  "response": {
    "success": true,
    "tool_calls": [
      {
        "id": "call_abc123",
        "name": "get_weather",
        "arguments": "{\"location\": \"San Francisco, CA\"}"
      }
    ],
    "content": null,
    "error": null
  },
  "metadata": {
    "recorded_at": "2026-03-01T10:00:00Z",
    "original_latency_ms": 150,
    "model_version": "gpt-4o-mini-2024-07-18",
    "usage": {
      "prompt_tokens": 95,
      "completion_tokens": 22,
      "total_tokens": 117
    }
  }
}
```

### Boundary validation

When the recorded provider loads a recording, it validates the boundary — a set of hashes that ensure the recording matches the current fixture:

| Field | What it checks |
|-------|----------------|
| `provider` | Must match the original provider |
| `tool_schema_hash` | SHA-256 of your tool definitions (sorted, canonical) |
| `messages_hash` | SHA-256 of your messages array |
| `tool_choice_mode` | Must match the original tool choice setting |

If your tool definitions or messages change after capturing a recording, the hashes won't match. ReplayCI flags this as a `NonReproducible` result with a clear reason code:

- `SCHEMA_DRIFT` — tool definitions changed since the recording was captured
- `NON_DETERMINISTIC_INPUT` — messages or tool choice changed
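The boundary check can be sketched as follows. The exact hashing scheme (SHA-256 of a canonical JSON serialization, truncated to 16 hex characters) is an assumption inferred from the example recording; the function names are illustrative.

```typescript
import { createHash } from "node:crypto";

// Assumed canonical hash: SHA-256 over JSON, truncated to 16 hex chars.
function canonicalHash(value: unknown): string {
  return createHash("sha256").update(JSON.stringify(value)).digest("hex").slice(0, 16);
}

type Verdict = "OK" | "SCHEMA_DRIFT" | "NON_DETERMINISTIC_INPUT";

// Sketch of comparing a recording's boundary against the current fixture.
function checkBoundary(
  recorded: { tool_schema_hash: string; messages_hash: string; tool_choice_mode: string },
  current: { tools: unknown; messages: unknown; toolChoice: string }
): Verdict {
  if (recorded.tool_schema_hash !== canonicalHash(current.tools)) return "SCHEMA_DRIFT";
  if (
    recorded.messages_hash !== canonicalHash(current.messages) ||
    recorded.tool_choice_mode !== current.toolChoice
  ) {
    return "NON_DETERMINISTIC_INPUT";
  }
  return "OK";
}
```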

When this happens, re-capture your recordings:

```sh
npx replayci --provider openai --model gpt-4o-mini --capture-recordings
```

## Switching providers

Same contracts, different provider — just change the flag:

```sh
# Test against OpenAI
npx replayci --provider openai --model gpt-4o-mini

# Same contracts against Anthropic
npx replayci --provider anthropic --model claude-sonnet-4-6

# Offline with recorded fixtures
npx replayci --provider recorded
```

You can capture recordings from each provider separately and test them offline:

```sh
# Capture OpenAI responses
npx replayci --provider openai --model gpt-4o-mini --capture-recordings \
  --pack packs/openai-v0.1

# Capture Anthropic responses
npx replayci --provider anthropic --model claude-sonnet-4-6 --capture-recordings \
  --pack packs/anthropic-v0.1
```

## Shadow mode

Shadow mode lets you compare two providers side-by-side without affecting your primary results:

```sh
npx replayci --provider openai --model gpt-4o-mini \
  --shadow-capture \
  --shadow-provider anthropic --shadow-model claude-sonnet-4-6
```

The primary provider (OpenAI) determines pass/fail. The shadow provider (Anthropic) runs in parallel for comparison only — its results are captured but never affect your CI gate.


## NeverNormalize

Each pack contains a `NeverNormalize.json` file that protects semantically important fields from being normalized away during baseline comparison:

```json
{
  "schema_version": "1.0",
  "pack_id": "my-pack",
  "fields": [
    "tool_calls[].name",
    "tool_calls[].arguments",
    "content",
    "success"
  ]
}
```

These four fields should always be listed — they're the core semantic content of every response. Add any additional fields that are important to your specific use case (e.g., specific argument paths).

Do not list volatile fields like timestamps or request IDs — those should be normalized for stable fingerprints.
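How a pattern like `tool_calls[].name` is matched against concrete response paths isn't specified here; one plausible sketch, purely illustrative and not ReplayCI's matcher, treats `[]` as matching any numeric index:

```typescript
// Hypothetical matcher: "[]" in a NeverNormalize pattern matches any
// numeric index in the concrete path (e.g. tool_calls[0].name).
function matchesPattern(pattern: string, path: string): boolean {
  const re = new RegExp(
    "^" +
      pattern
        .replace(/[.[\]]/g, "\\$&")              // escape ., [, ]
        .replace(/\\\[\\\]/g, "\\[\\d+\\]") +    // "[]" -> any index
      "$"
  );
  return re.test(path);
}
```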

Every pack directory must contain this file, even if you're just starting out. The starter pack includes one with sensible defaults.


## Next steps