ReplayLab In 10 Minutes
This walkthrough is the customer demo loop: capture a support-bot run, stop the providers, open the local app, replay offline from the app, inspect clean and failed diagnoses, compare an intentional mismatch, and generate a pytest provider replay guard.
The demo uses local loopback services, so it needs no API keys and makes no paid provider calls. It still installs the public PyPI package and uses the real OpenAI Python SDK plus the real requests library.
Run The Demo
From the ReplayLab repo, run:
python scripts/run_scenario.py run support-bot-demo-local --keep-workspace --package-version 0.1.0a4
The run should end with:
ReplayLab scenario passed.
Scenario: support-bot-demo-local
Tier: loopback
Boundaries: 2
Payloads: 4
Providers: openai, requests
Keep the workspace path from the output. It contains the generated app, capsule, clean replay report, failed replay report, React viewer files, and generated pytest provider replay guard.
For the maintainer rehearsal on the current checkout, use the published-package scenario as the artifact generator and then start the current local app from the kept workspace:
cd <kept-workspace>
uv run --project /path/to/ReplayLab replaylab app --local-store-root .replaylab
Once a future package release that includes the current local app workflow is installed, the command is simply:
replaylab app --local-store-root .replaylab
Presenter Flow
Use this order when showing ReplayLab to a technical evaluator.
| Time | Show | What To Say |
|---|---|---|
| 0:00-1:30 | support_bot.py | "This is normal app code: one ReplayLab startup call, one capture scope around the support-bot workflow, and normal OpenAI plus HTTP client calls." |
| 1:30-3:00 | Scenario output | "ReplayLab captured two external boundaries: the support-ticket API and the OpenAI Responses call. The providers are stopped before replay, so a missed interception would fail." |
| 3:00-4:30 | replaylab app start screen | "Start here shows the latest failed regression replay, latest regression replay, and latest captured run. I do not need to reconstruct paths manually." |
| 4:30-6:30 | Selected failed report | "The failed report starts with the replay story: boundary #1 mismatched, expected OpenAI request hash versus actual request hash, and the first recommended action." |
| 6:30-8:00 | Artifact actions | "The app recovered a run profile from the report and detected requests,openai from captured boundaries, so replay and regression actions are buttons, not command fields." |
| 8:00-9:00 | App-generated viewer and diff viewer | "The local viewer and diff viewer are shareable fallbacks. They point back to replaylab app, not fake python app.py commands." |
| 9:00-10:00 | Generated pytest | "The captured provider-boundary behavior becomes a pytest provider replay guard that passes without the loopback providers." |
Demo App Shape
The generated app looks like a small production integration:
import openai
import requests

import replaylab
from replaylab import CapturePayloadPolicy  # import path assumed; adjust to the installed package

# support_url, openai_url, and ticket_id are defined elsewhere in the generated app.
handle = replaylab.init(
    project_name="support-bot-demo",
    auto_patch_integrations="auto",
    capture_payload_policy=CapturePayloadPolicy.FULL,
)

with handle.capture(
    "triage_ticket",
    session_id=ticket_id,
    labels=("demo", "support"),
):
    ticket = requests.get(f"{support_url}/tickets/{ticket_id}", timeout=5).json()
    response = openai.OpenAI(base_url=f"{openai_url}/v1", api_key="...").responses.create(
        model="gpt-5-mini",
        input=f"triage ticket {ticket_id}",
    )
ReplayLab is not a background listener. The app initializes ReplayLab once, keeps normal provider
client code, and scopes the support-bot workflow with handle.capture(...).
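For orientation, support_url and openai_url in the excerpt point at loopback services the scenario starts and stops for you. A minimal stand-in for the ticket endpoint, sketched here with only the Python standard library, shows the idea; it is illustrative, not the scenario's actual loopback implementation, and the port and ticket fields are made up:

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class TicketHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve a canned JSON ticket for any /tickets/<id> path.
        if self.path.startswith("/tickets/"):
            body = json.dumps({"id": self.path.rsplit("/", 1)[-1], "status": "open"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8808), TicketHandler).serve_forever()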
Open The Artifacts
Inside the kept workspace, start with the local app:
replaylab app --local-store-root .replaylab
The app should show start points for the latest failed regression replay, latest regression replay, and latest captured run without duplicating a replay when the newest replay is also the newest failure. The app groups local storage into product artifacts: one captured run owns its regression replays, live experiments, generated regressions, and exports. Select the latest failed regression replay first, then the clean replay.

Reports and captured runs with recovered run profiles expose separate actions for deterministic regression replay, live experiments against current providers, comparison, regression generation, and evidence export. The live experiment workbench can apply supported variants such as app scenario input, OpenAI Responses instructions, model parameters, and tool declaration descriptions before the live run. The workbench keeps direct provider user-message overrides in advanced prompt controls and presents tool parameters as readable cards before raw JSON schema editing. Live experiment results stay attached to the captured run, so you can reopen them from the Live experiments tab after navigating away.

When a run profile is missing, action buttons are disabled and the app asks you to capture once from the project and refresh instead of showing a fake command. Generated pytest remains opt-in.
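To confirm the local store actually contains captured artifacts before pointing the app at it, a quick listing is enough. The snippet below uses only the standard library and assumes nothing about the store layout beyond the .replaylab root used above:

from pathlib import Path

store = Path(".replaylab")
for path in sorted(store.rglob("*")):
    if path.is_file():
        print(path.relative_to(store))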
The scenario also writes shareable viewer artifacts:
reports/support-bot-clean-viewer.html
reports/support-bot-failed-viewer.html
reports/support-bot-diff-viewer.html
tests/regression/test_support_bot_replay.py
The clean viewer should show a clean diagnosis, one requests boundary, one openai boundary, two
replayed outcomes, request hashes, payload availability booleans, and grouped next commands.
The failed viewer should point to the OpenAI request-hash mismatch. The diff viewer should show that the
candidate replay regressed compared with the clean baseline. Current viewer exports should point to
replaylab app --local-store-root ... for guided follow-up instead of rendering placeholder app
commands.
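To open the shareable viewers from the kept workspace without hunting for paths, a small standard-library helper is enough; the filenames are the ones listed above:

import webbrowser
from pathlib import Path

for name in (
    "support-bot-clean-viewer.html",
    "support-bot-failed-viewer.html",
    "support-bot-diff-viewer.html",
):
    webbrowser.open(Path("reports", name).resolve().as_uri())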
To rerun the generated test manually from the kept workspace:
source .venv/bin/activate
pytest tests/regression/test_support_bot_replay.py
Expected output:
1 passed
Customer Questions
Does this require ReplayLab Cloud? No. This demo is fully local. It writes local capsules, replay reports, viewer HTML files, and pytest fixtures.
Were providers really stopped during replay?
Yes. The scenario stops both loopback providers before running replaylab replay. A missed
interception would fail instead of silently calling a live service.
Does the viewer expose secrets or payload bodies? No. The viewer shows hashes, providers, operations, outcomes, payload availability booleans, and diagnostic messages. It does not render payload file contents, API keys, raw headers, or source bodies.
What does full-payload capture mean? ReplayLab stores provider request and response payload refs in the local capsule so replay can serve recorded responses later. The viewer still avoids rendering those payload bodies.
What is not supported yet? The current public alpha is local-first. Hosted issue grouping, cloud upload, auth, streaming, framework-native traces, and broad provider coverage are still future work.
Why should this matter to my team? The value is the loop: capture a real run once, replay it safely offline, inspect what changed, and turn the captured behavior into a regression test.
Next Step
After the demo, replace the loopback providers with one real OpenAI or HTTP call from your own app using the same startup instrumentation and capture scope.
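As a concrete starting point, the same startup call and capture scope from the demo app can wrap one real HTTP call with no other changes. In this sketch, the endpoint is a placeholder for a request your app already sends, the project and capture names are arbitrary, the CapturePayloadPolicy import path is assumed as in the earlier excerpt, and the optional session and label arguments shown earlier are omitted:

import requests

import replaylab
from replaylab import CapturePayloadPolicy  # import path assumed

handle = replaylab.init(
    project_name="my-app",
    auto_patch_integrations="auto",
    capture_payload_policy=CapturePayloadPolicy.FULL,
)

with handle.capture("first_real_capture"):
    # Replace this with a call your app already makes today.
    response = requests.get("https://example.com/health", timeout=5)
    print(response.status_code)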