
Quickstart — ReplayCI

Models change without warning. GPT-4o called check_service_health correctly yesterday; today, after a silent update, it skips a required tool in the protocol. Your app breaks with no code change. ReplayCI catches that.


Two ways to start

|                | Try it (60 seconds) | Test your real app (recommended) |
| -------------- | ------------------- | -------------------------------- |
| What           | Run a pre-recorded test to see how ReplayCI works | Wrap your OpenAI/Anthropic client, auto-generate contracts from real traffic |
| Prerequisites  | Node.js 20+ | Node.js 20+ and an API key for your LLM provider |
| Signup needed? | No | Yes — app.replayci.com/signup |
| Time           | ~1 minute | ~5 minutes |

Most users should start with "Test your real app" — it's the path that protects your production tool calls. The "Try it" path is just a quick demo to see the runner in action.


Path A: Try it (60 seconds, no signup)

See what ReplayCI does before changing any code:

npm install -D @replayci/cli
npx replayci init
npx replayci

This runs a pre-recorded test against a starter contract — no API keys, no config, no code changes.

  replayci · recorded/recorded · 1 contract

✓ incident_response Pass 71ac81a7

1/1 passed

That green checkmark means the model called all 4 tools in the correct order with valid arguments. When it doesn't, you'll see exactly what went wrong and a fingerprint that tracks the failure across runs.

The starter pack proves the runner works. For your actual app, continue to Path B below — that's where ReplayCI becomes useful.


Path B: Test your real app (recommended)

The SDK wraps your existing OpenAI or Anthropic client. Your code stays the same — ReplayCI observes every tool call in the background and auto-generates contracts on the dashboard.

1. Create your account

  1. Go to app.replayci.com/signup
  2. After signup, copy your API key from the onboarding page — it starts with rci_live_
  3. Store the key somewhere safe — it's only shown once (you can create new keys later under Settings > API Keys)
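The SDK snippet in step 3 reads the key from the REPLAYCI_API_KEY environment variable, so export it in the shell where your app runs. A minimal sketch (the placeholder value is not a real key):

```shell
# Export the key so the SDK can read process.env.REPLAYCI_API_KEY.
# Replace the placeholder with the rci_live_ key from your onboarding page.
export REPLAYCI_API_KEY=rci_live_your_key_here
echo "$REPLAYCI_API_KEY"
```

For anything beyond local experiments, prefer your platform's secret store (CI secrets, .env files kept out of git) over exporting keys by hand.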

2. Install the SDK

npm install @replayci/replay

Monorepos: Run npm install from the directory where your app code lives, not the repo root. Installing from the root can resolve @replayci/replay into the wrong node_modules, causing import errors at runtime.

3. Wrap your client

import OpenAI from "openai";
import { observe } from "@replayci/replay";

const openai = new OpenAI();

// One line — observe wraps your client, captures every tool call
const { client, flush, restore } = observe(openai, {
  apiKey: process.env.REPLAYCI_API_KEY,
});

// Use your client exactly as before
const response = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [{ type: "function", function: { name: "get_weather", parameters: { /* ... */ } } }],
});

// When you're done, flush captured calls and restore the original client
await flush();
restore();

Module format: The examples use ES module import syntax. If your project uses CommonJS, require() also works:

const { observe } = require("@replayci/replay");

For ESM, ensure your package.json has "type": "module".
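For reference, the relevant package.json fragment looks like this:

```json
{
  "type": "module"
}
```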

That's it. Every tool call flows to app.replayci.com where contracts are generated automatically.
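If your request logic can throw, it's worth guaranteeing that flush() and restore() still run. A minimal helper sketch — withObservation is a hypothetical name of ours, not part of the SDK; it just wraps the flush/restore handles that observe() returns:

```javascript
// Sketch: run a block of LLM calls, then always flush captured tool calls
// and restore the client, even if the block throws.
// `withObservation` is a hypothetical helper, not a ReplayCI SDK export.
async function withObservation(run, { flush, restore }) {
  try {
    return await run();
  } finally {
    await flush(); // push captured calls to the dashboard
    restore();     // unpatch the wrapped client
  }
}
```

Pass it the flush and restore handles from observe() plus your request logic, and cleanup happens on both the success and error paths.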

Works with Anthropic too: The SDK auto-detects your client. Just pass an Anthropic client instead — same API, no config changes.

import Anthropic from "@anthropic-ai/sdk";
import { observe } from "@replayci/replay";

const anthropic = new Anthropic();
const { client, flush, restore } = observe(anthropic, {
  apiKey: process.env.REPLAYCI_API_KEY,
});

4. Review contracts on the dashboard

Once calls start flowing, the dashboard at app.replayci.com will:

  • Auto-generate contracts from observed tool calls
  • Show pass/fail status for each contract
  • Let you tweak rules — adjust thresholds, add checks, pin contracts you trust

No YAML to write. No fixtures to create. You review what was generated and promote what looks right.


What's a contract?

A contract is a set of rules for tool calls: "when the model gets this incident report, it should call 4 tools in order — check_service_health, pull_service_logs, search_past_incidents, create_incident_ticket — with the right arguments." ReplayCI checks every call against these rules and tells you when something breaks.

Behind the scenes, contracts are YAML files — but with the SDK path, you never need to write them by hand. They're auto-generated from your real traffic and managed in the dashboard.

What the YAML looks like (for reference)
tool: multi_tool_call

expect_tools:
  - check_service_health
  - pull_service_logs
  - search_past_incidents
  - create_incident_ticket

tool_order: strict
pass_threshold: 1.0

expected_tool_calls:
  - name: check_service_health
    argument_invariants:
      - path: $.service_id
        equals: "payment-gateway-us-east-1"

  - name: create_incident_ticket
    argument_invariants:
      - path: $.severity
        equals: "P1"

golden_cases:
  - id: incident_response_success
    input_ref: incident_response.success.json
    expect_ok: true
    provider_modes: ["recorded"]

See Writing Tests for the full contract format.


What happens when something breaks?

Every failure is classified (tool_not_invoked, wrong_tool, schema_violation, etc.) and assigned a fingerprint — a short hash that stays the same when the same failure recurs. You'll see these on the dashboard with trends over time.
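To make the idea concrete: a fingerprint stays stable when it hashes only the parts of a failure that identify it (classification, tool, detail) and excludes run-specific noise like timestamps or run IDs. An illustrative sketch — this is our guess at the shape of the idea, not ReplayCI's actual algorithm:

```javascript
import { createHash } from "node:crypto";

// Illustrative only: hash the identifying parts of a failure so the same
// failure maps to the same short hash across runs. Timestamps and run IDs
// are deliberately left out, so recurrences collapse to one fingerprint.
function fingerprint(failureClass, tool, detail = "") {
  return createHash("sha256")
    .update(`${failureClass}:${tool}:${detail}`)
    .digest("hex")
    .slice(0, 8);
}

// Same failure tomorrow → same fingerprint; a different failure class
// on the same tool → a different fingerprint.
fingerprint("tool_not_invoked", "pull_service_logs");
```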


See Results in Your Dashboard (CLI path)

If you're running contracts via the CLI (not the SDK), set your API key to push results to the dashboard automatically:

export REPLAYCI_API_KEY=rci_live_your_key_here
npx replayci --pack packs/my-pack --provider openai --model 4o-mini

You can use short aliases like 4o-mini instead of full model IDs — ReplayCI resolves them automatically. See Providers — Model resolution for the full alias list.

Results appear at app.replayci.com within seconds. No --persist flag needed — the push happens automatically when REPLAYCI_API_KEY is set.

--persist is for local development only. It writes artifacts to your local disk and requires a PostgreSQL database. For hosted results, use the API key instead.


Add to CI

Once you have contracts you trust, lock them down in your pipeline:

npx replayci --provider recorded

This runs contracts against recorded fixtures — deterministic, no API calls, no cost. If a contract fails, CI fails. See CI Integration for GitHub Actions and GitLab setup.
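For GitHub Actions, a minimal job could look like the sketch below. The workflow layout here is our assumption, not the officially supported template — see the CI Integration docs for that:

```yaml
# Sketch of a minimal GitHub Actions workflow; adapt triggers and
# install steps to your repo. See CI Integration for the supported setup.
name: replayci
on: [pull_request]
jobs:
  contracts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      # Deterministic: runs against recorded fixtures, no provider API keys
      - run: npx replayci --provider recorded
```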


Test Against a Live Model

To run contracts against a real LLM (not recorded fixtures):

export REPLAYCI_PROVIDER_KEY="sk-your-openai-key"
npx replayci --provider openai --model gpt-4o-mini

See Providers for Anthropic and other provider configuration.


Next steps

  • SDK Integration — observe(), validate(), health monitoring — start here if you haven't wrapped your client yet
  • Observe Guide — auto-generate contracts from live calls via CLI
  • CI Integration — block regressions in your pipeline
  • Writing Tests — full contract format, assertions, golden cases (for hand-written contracts)
  • CLI Reference — all flags: --repeat, --capture-recordings, shadow mode, sync, compare, doctor

Tip: After integrating the SDK, run npx replayci doctor to verify everything is connected and working.