Tutorial: Capture And Replay OpenAI Responses

This tutorial captures a real OpenAI Responses call from a normally started Python app, replays it locally, compares the replay report against the captured capsule, and generates a pytest provider replay guard.

ReplayLab is not a listener running in the background. Your app initializes the SDK once, ReplayLab patches the configured provider client in that Python process, and handle.capture(...) scopes the request, job, or session you want to persist as one capsule.

ReplayLab's OpenAI adapter follows the current OpenAI Responses Python quickstart shape:

client = openai.OpenAI()
response = client.responses.create(model="gpt-5-mini", input="...")
print(response.output_text)

Streaming responses through responses.create(..., stream=True) are supported as event-preserving ReplayLab evidence when the application fully consumes the stream. The main tutorial uses a non-streaming call because it is the smallest path to read. The examples below use gpt-5-mini as a low-cost default and let you override it with REPLAYLAB_TUTORIAL_OPENAI_MODEL.
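"Fully consumes the stream" just means iterating every event before the capture scope ends, so the recorded evidence contains the complete event sequence. A rough illustration of the accumulation involved, using plain-Python stand-in events rather than real SDK objects (the event-type strings mirror the Responses streaming event shape; the `FakeEvent` class is purely illustrative):

```python
from dataclasses import dataclass


@dataclass
class FakeEvent:
    # Stand-in for a Responses streaming event: real events carry a `type`
    # string and, for text deltas, a `delta` text fragment.
    type: str
    delta: str = ""


def collect_output_text(stream) -> str:
    # Consume every event; concatenate only the text-delta fragments.
    parts = []
    for event in stream:
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
    return "".join(parts)


events = [
    FakeEvent("response.created"),
    FakeEvent("response.output_text.delta", "Deterministic replay "),
    FakeEvent("response.output_text.delta", "makes agent tests repeatable."),
    FakeEvent("response.completed"),
]
print(collect_output_text(events))
```

Breaking out of the loop early would leave events unconsumed, which is exactly the case the "fully consumes the stream" caveat rules out.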

Setup

Install the packages and set OPENAI_API_KEY. Do not paste secret values into tutorial files, notebooks, docs, or terminal output. Set REPLAYLAB_TUTORIAL_OPENAI_MODEL only if you want to use a different accessible Responses model.

python -m venv .venv
source .venv/bin/activate
pip install replaylab openai
export OPENAI_API_KEY="..."

For local repo development before package publication:

uv sync --all-packages --all-groups
uv pip install openai
export OPENAI_API_KEY="..."

Startup Instrumentation

Create tutorial_openai_app.py. The app keeps normal OpenAI provider code. ReplayLab setup happens near startup, before the OpenAI client is constructed.

import os

import openai
import replaylab
from replaylab import CapturePayloadPolicy

MODEL = os.environ.get("REPLAYLAB_TUTORIAL_OPENAI_MODEL", "gpt-5-mini")


def call_model() -> str:
    client = openai.OpenAI()
    response = client.responses.create(
        model=MODEL,
        input=(
            "Write one short sentence explaining why deterministic "
            "replay helps agent tests."
        ),
        reasoning={"effort": "minimal"},
        max_output_tokens=200,
    )
    return response.output_text


def call_model_streaming() -> str:
    # Not invoked by main(); shown as the streaming variant. The capture
    # scope must fully consume the stream for ReplayLab to record it.
    client = openai.OpenAI()
    stream = client.responses.create(
        model=MODEL,
        input="Explain deterministic replay in one sentence.",
        reasoning={"effort": "minimal"},
        max_output_tokens=200,
        stream=True,
    )
    final_response = None
    for event in stream:
        # responses.create(..., stream=True) yields raw events; the final
        # Response object arrives on the "response.completed" event.
        if event.type == "response.completed":
            final_response = event.response
    assert final_response is not None
    return final_response.output_text


def main() -> None:
    handle = replaylab.init(
        project_name="tutorial-openai",
        auto_patch_integrations=("openai",),
        capture_payload_policy=CapturePayloadPolicy.FULL,
    )

    with handle.capture(
        "openai_responses_tutorial",
        labels=("tutorial", "openai"),
        runtime_metadata={"openai.model": MODEL},
    ) as capture:
        print(call_model())

    if capture.capsule is not None:
        print(f"ReplayLab capsule: {capture.capsule.capsule_path}")


if __name__ == "__main__":
    main()

Capture Scope

Capture uses your normal app command. There is no ReplayLab wrapper process in this production-style path.

uv run python tutorial_openai_app.py

What good looks like:

<one sentence from the model>
ReplayLab capsule: .replaylab/capsules/<capsule_id>

Capsule List

uv run replaylab capsule list --local-store-root .replaylab

Pick the capsule whose integrations include openai, auto_patch, and same_process.

Inspect

uv run replaylab capsule inspect <capsule_id> --local-store-root .replaylab

What good looks like:

Boundaries: 1
Providers: openai
Payloads: 2
Boundary 0: llm openai create openai.responses (succeeded)

Replay

Replay is local regression tooling. It runs the same application command under replaylab replay; when the request matches the capsule, ReplayLab serves the recorded OpenAI response instead of calling the live provider. The app keeps the same startup replaylab.init(...) and handle.capture(...) code in replay mode; ReplayLab treats that capture scope as a no-op while the CLI-owned replay runtime serves provider calls.

uv run replaylab replay <capsule_id> \
  --local-store-root .replaylab \
  --auto-patch-integrations openai \
  --report-id replay_tutorial_openai \
  -- python tutorial_openai_app.py

Compare

uv run replaylab report compare \
  <capsule_id> \
  .replaylab/replays/replay_tutorial_openai/report.json \
  --local-store-root .replaylab

What good looks like:

Status: succeeded
Expected boundaries: 1
Replayed: 1
Problems: 0
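The generated guard in the next section asserts the same properties this compare output surfaces. A minimal sketch of such a check, assuming report.json exposes fields under the key names used below (the keys are assumptions inferred from the compare output shown above, not a documented report schema):

```python
import json
from pathlib import Path


def assert_replay_clean(report: dict) -> None:
    # Key names here are assumptions mirroring the `report compare` output:
    # a succeeded status, every expected boundary replayed, zero problems.
    assert report["status"] == "succeeded"
    assert report["replayed"] == report["expected_boundaries"]
    assert report["problems"] == 0


# Example usage against a report file produced by `replaylab replay`:
# report = json.loads(
#     Path(".replaylab/replays/replay_tutorial_openai/report.json").read_text()
# )
sample = {"status": "succeeded", "expected_boundaries": 1, "replayed": 1, "problems": 0}
assert_replay_clean(sample)
```

Inspect the real report.json to confirm its actual field names before wiring a check like this into CI.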

Generate-Test

uv run replaylab generate-test <capsule_id> \
  --output tests/regression/test_tutorial_openai_replay.py \
  --fixture-root tests/fixtures/replaylab/capsules \
  --app-root . \
  --auto-patch-integrations openai \
  -- python tutorial_openai_app.py

Run the generated test:

uv run pytest tests/regression/test_tutorial_openai_replay.py

The generated test uses replaylab replay, asserts the replay report, and avoids a live OpenAI call.
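The generated file is owned by replaylab generate-test, so treat its exact contents as an implementation detail. Conceptually, though, a guard like this assembles the same replay invocation shown in the Replay section. A hypothetical sketch (the helper function and its structure are illustrative, not the generator's actual output):

```python
def build_replay_command(capsule_id: str, report_id: str, app_command: list[str]) -> list[str]:
    # Mirrors the `replaylab replay` invocation from the Replay section;
    # everything after `--` is the unmodified application command.
    return [
        "replaylab", "replay", capsule_id,
        "--local-store-root", ".replaylab",
        "--auto-patch-integrations", "openai",
        "--report-id", report_id,
        "--",
        *app_command,
    ]


cmd = build_replay_command(
    "abc123", "replay_tutorial_openai", ["python", "tutorial_openai_app.py"]
)
print(" ".join(cmd))
```

A generated test would typically run a command like this via subprocess, then assert on the resulting replay report rather than on live provider output.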

Maintainer Streaming Scenario

Maintainers can validate streaming without an OpenAI API key:

python scripts/run_scenario.py run openai-streaming-local --keep-workspace

That scenario uses the real openai SDK against a local fake Responses streaming server, captures a fully consumed responses.create(..., stream=True) call, replays the recorded event sequence, generates a guard, runs pytest, and checks the local app trace shape.