No-Key Quickstart
This workflow exercises the local ReplayLab MVP end to end without network access, real provider packages, or API keys.
The app in examples/dogfood_mvp imports local fake OpenAI-style and requests-style modules, then ReplayLab captures and replays those calls through the same provider adapter path used by real applications.
The provider support matrix also includes Anthropic Messages, Gemini generate_content, async
OpenAI Responses, core OpenAI/Anthropic/Gemini streaming calls, and async httpx.AsyncClient
calls; the quickstart keeps the command small by using the OpenAI + requests dogfood app.
For the public-alpha learning path with real providers, start with Tutorials.
Those tutorials use startup SDK instrumentation and handle.capture(...) as the primary app integration path.
For the shortest PyPI-installed first project with no API keys, use
First PyPI Project.
This no-key quickstart intentionally uses replaylab run because it exercises the local wrapper path without changing the fake app code.
1. Capture
uv run replaylab run \
--project-name dogfood-mvp \
--auto-patch-integrations auto \
--capture-payload-policy full \
-- python examples/dogfood_mvp/app.py
This writes a wrapper capsule and a separate child provider capsule under .replaylab/capsules/.
The child capsule contains the OpenAI and HTTP boundaries.
--auto-patch-integrations auto enables all supported patchers; patchers for providers that the app
does not import are no-ops.
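If you want to launch the capture from a script, the command above can be assembled as an argument list. This is a minimal sketch, not a ReplayLab API: the helper names build_capture_command and capture are hypothetical, and the flags are copied verbatim from the command shown in this step.

```python
import subprocess


def build_capture_command(project_name, app_argv):
    """Return the argv for the capture command shown in this step.

    Hypothetical helper: it only mirrors the documented CLI flags.
    """
    return [
        "uv", "run", "replaylab", "run",
        "--project-name", project_name,
        "--auto-patch-integrations", "auto",
        "--capture-payload-policy", "full",
        "--",  # everything after this separator is the app command
        *app_argv,
    ]


def capture(project_name, app_argv):
    """Run the capture and return the process exit code."""
    # A list (rather than a shell string) avoids quoting issues in app_argv.
    completed = subprocess.run(build_capture_command(project_name, app_argv))
    return completed.returncode
```

Calling capture("dogfood-mvp", ["python", "examples/dogfood_mvp/app.py"]) would run the same command as above, assuming uv and ReplayLab are installed in the current project.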
2. Find The Provider Capsule
uv run replaylab capsule list --local-store-root .replaylab
Pick the capsule whose integrations include openai, requests, and auto_patch.
The wrapper capsule usually has no provider boundaries and is not the right replay target.
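The selection rule above can be written as a small filter. This sketch assumes you have already collected the listing into dictionaries with "capsule_id" and "integrations" keys; that record shape is hypothetical and is not ReplayLab's actual listing format.

```python
# Integrations that identify the child provider capsule in this quickstart.
REQUIRED_INTEGRATIONS = {"openai", "requests", "auto_patch"}


def pick_provider_capsules(capsules):
    """Return ids of capsules whose integrations include every required name.

    `capsules` is a hypothetical list of dicts; adapt the keys to whatever
    you parse from the listing output.
    """
    return [
        capsule["capsule_id"]
        for capsule in capsules
        if REQUIRED_INTEGRATIONS <= set(capsule.get("integrations", []))
    ]
```

A wrapper capsule with no integrations is filtered out, which matches the guidance that it is not the right replay target.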
3. Inspect
uv run replaylab capsule inspect <child_capsule_id> --local-store-root .replaylab
Inspection prints run status, boundary counts, providers, integrations, and payload counts without printing payload file contents.
4. Regression Replay
uv run replaylab replay <child_capsule_id> \
--local-store-root .replaylab \
--auto-patch-integrations auto \
--report-id replay_dogfood_mvp \
-- python examples/dogfood_mvp/app.py
Regression replay serves recorded provider responses from the capsule.
The fake provider modules are still imported, but live provider calls are not made when replay matches the capsule.
The replay command also prints a Next steps block for report inspection, capsule/report
comparison, local viewer opening, and the local app. When ReplayLab does not know a safe app command
to show, it points you to the local app and avoids rendering a fake -- <app command> snippet.
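Later steps read the replay report from a path derived from the store root and the --report-id value. The sketch below encodes that layout as inferred from the report paths used in this quickstart (.replaylab/replays/<report_id>/report.json); the helper name replay_report_path is hypothetical, and the layout is an assumption rather than a documented contract.

```python
from pathlib import Path


def replay_report_path(store_root, report_id):
    """Return the expected report.json location for a replay report id.

    Assumes the <store_root>/replays/<report_id>/report.json layout seen
    in this quickstart's commands.
    """
    return Path(store_root) / "replays" / report_id / "report.json"
```

For example, replay_report_path(".replaylab", "replay_dogfood_mvp") points at the report that the replay command in this step writes.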
For the normal human workflow after you already have a capsule, use the guided command
(replaylab workflow local, shown later in this step) instead of running each follow-up by hand.
If you want to browse everything already in .replaylab, start the local app first:
uv run replaylab app --local-store-root .replaylab
replaylab app binds to 127.0.0.1, opens the browser by default, and shows product-level captured
runs with attached regression replays, live experiments, generated regressions, and exports. Internal
capsules, reports, action history, generated tests, and viewer files remain local storage details.
Experiment-created traces are inspectable as live-experiment evidence under the original captured
run, not as new primary captured-run baselines. The app starts with product start points for the
latest failed regression replay, latest regression replay, and latest captured run. If the latest
regression replay is also the latest failure, the start strip shows it once so the first choice is
not duplicated. The selected artifact loads deterministic diagnosis and one-click actions before
the raw tables. ReplayLab recovers a run profile from report/capsule
metadata when available, detects provider labels from captured boundaries, and exposes actions such
as replay, live experiment, compare to capture, generate regression, export evidence, and capture
again. In the product app, Run replay uses the sandboxed replay path with effect controls enforced;
older provider-only reports remain inspectable but are labeled as not sandboxed. The normal UI does
not ask you to type the app command or choose providers.
Fresh captures store local run profile metadata automatically so the app can run a regression
replay or explicit live experiment from a captured run. Older captures without that metadata show
app-managed replay setup candidates and a Regression replay with detected setup action when
ReplayLab can detect a runnable project module. Captured-run pages use a split-pane trace explorer:
the left pane keeps the Agent run, logical steps, LLM/API calls, Tool call rows, and provider
protocol evidence visible. When ReplayLab captures a framework tool execution boundary, the
provider tool request remains the visible Tool call row because it carries the parameters and
tool result; execution evidence and HTTP-client protocol/effect evidence attach to that row instead
of appearing as competing top-level provider calls. The right pane shows the selected node's
formatted redacted Input/Output previews, assistant messages, tool data, and replay metadata in
collapsed debug sections. The Captured runs page keeps the run list visible, shows
each run's regression replay count and latest regression replay status, opens the latest regression
replay result directly, and opens the selected run in a right-side inspector. The inspector has
an app-owned Local workflow hub plus Trace, Regression replays, Live experiments,
Generated regressions, and Upload preview tabs so the original capture, deterministic
regression evidence, live behavior exploration, generated guards, exports, and upload-safety
checks stay connected. The hub shows the latest result for each stage, the next recommended action,
and links back to the evidence after refresh or navigation.
Regression replay result pages lead with the verdict: recorded provider responses were served and
live model behavior was not tested. They then state what regression replay verified, why it matters,
what it does not prove, and whether the next action is generating a provider replay guard, generating
a diagnostic provider replay guard, or inspecting the first issue. Clean replays expose Generate
provider replay guard; failed or diverged replays expose Generate diagnostic provider replay guard
when ReplayLab can preserve a deterministic failure shape.
Live experiments are separate, explicit actions that rerun with current code and live providers after
a confirmation warning. The live
experiment workbench derives editable controls from the captured LLM trace: scenario input when
ReplayLab can safely map it to the recovered run argument, canonical instructions/system prompt,
model suggestions, temperature, top-p, output limits, tool choice, response format, response-mode
intent, and tool declarations such as descriptions and parameter schemas. It previews OpenAI,
Anthropic, and Gemini provider conversions with compatibility findings; preview-only conversions
can be selected for inspection, but the live run stays blocked until ReplayLab can preserve the
app call contract. Provider user-message overrides are under
advanced prompt controls because the normal path is to change the app scenario input. Tool
definitions are presented as readable tool cards and parameter summaries before raw JSON schema
editing. Unsupported controls are shown with a reason.
Completed live experiments are saved in the captured run's Live experiments tab with their label,
hypothesis, applied variant summary, and comparison verdict. The app returns structured ReplayLab
step status, not raw child stdout/stderr. Use --no-open-browser --port 0 for scripted scenarios or
headless validation.
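Port 0 conventionally asks the operating system to choose a free port. The sketch below shows that convention with a plain socket; whether replaylab app resolves --port 0 in exactly this way is an assumption, and the helper name bind_ephemeral_port is hypothetical.

```python
import socket


def bind_ephemeral_port(host="127.0.0.1"):
    """Bind a TCP socket to port 0 and return (socket, assigned_port).

    The operating system picks an unused port, which avoids collisions when
    several headless runs start at the same time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    # getsockname() reports the port the OS actually assigned.
    return sock, sock.getsockname()[1]
```

The caller is responsible for closing the returned socket once the port number is no longer needed.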
Generated provider replay guards and diagnostic provider replay guards are saved under the same captured run. Their history shows guard mode, source replay, generated path, hash, byte size, pytest result, and a copyable generated-test location so the file can be opened without searching raw action history. The app also shows whether the guard is only generated, locally runnable, or already detected in GitHub Actions. Use the generated-regression row to copy the local pytest command or a GitHub Actions snippet.
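To cross-check a generated test file against the hash and byte size shown in that history, you can compute both locally. This is a sketch under an assumption: it uses SHA-256, and the text above does not say which hash algorithm ReplayLab records. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def describe_bytes(data):
    """Return the byte size and SHA-256 hex digest for raw file content."""
    return {"bytes": len(data), "sha256": hashlib.sha256(data).hexdigest()}


def describe_generated_test(path):
    """Read a generated test file and describe it for manual comparison."""
    summary = describe_bytes(Path(path).read_bytes())
    summary["path"] = str(path)
    return summary
```

If the digest differs from the one in the guard history, the file may have been edited after generation, or the recorded hash may use a different algorithm.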
The same product surfaces include an Upload preview action for captured runs and their attached
evidence. This is a local-only pre-cloud safety check. It answers what would leave your machine in a
future upload by listing inventory rows, byte sizes, hashes, payload/ref counts, redaction notes,
excluded local-only material, risks, and cloud feature availability. The preview itself does not
upload anything. Use Upload now in the same panel for an explicit manual upload confirmation flow
that creates a hosted artifact and private share link. Auto-sync is available as an opt-in workspace
setting (default OFF) and syncs everything ReplayLab generates.
The upload surfaces do not display payload bodies, raw headers, API keys, raw command argv, source
bodies, framework bodies, or raw child stdout/stderr.
For a scriptable dry run, use the CLI fallback:
uv run replaylab cloud preview-upload captured_run <child_capsule_id> \
--local-store-root .replaylab
Use --format json when automation needs the same secret-safe preview document.
When you want ReplayLab to rerun the app from a script and produce a fresh report/viewer/test chain, use the same workflow backend from the CLI:
uv run replaylab workflow local <child_capsule_id> \
--local-store-root .replaylab \
--auto-patch-integrations auto \
--report-id replay_dogfood_guided \
--viewer-output replay-viewer.html \
--generate-test \
--test-output tests/regression/test_dogfood_replay.py \
--run-generated-test \
-- python examples/dogfood_mvp/app.py
workflow local replays the app, compares the report to the capsule, writes and opens the local
React viewer, and optionally generates and runs a pytest provider replay guard. The local app calls this same
orchestration path; use the lower-level commands below when you want to script or inspect one step
at a time.
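For scripted use, the guided command can be assembled with its optional parts toggled from Python. This sketch only reuses flags that appear in the command above. The helper name build_workflow_command is hypothetical, and treating --viewer-output and the test flags as independently optional is an assumption based on the description of workflow local.

```python
def build_workflow_command(capsule_id, app_argv, *, report_id,
                           viewer_output=None, test_output=None,
                           run_generated_test=False,
                           store_root=".replaylab"):
    """Return the argv for `replaylab workflow local` (hypothetical helper)."""
    cmd = [
        "uv", "run", "replaylab", "workflow", "local", capsule_id,
        "--local-store-root", store_root,
        "--auto-patch-integrations", "auto",
        "--report-id", report_id,
    ]
    if viewer_output is not None:
        cmd += ["--viewer-output", viewer_output]
    if test_output is not None:
        # Test generation and its output path are requested together.
        cmd += ["--generate-test", "--test-output", test_output]
        if run_generated_test:
            cmd.append("--run-generated-test")
    # The app command always follows the `--` separator.
    return cmd + ["--", *app_argv]
```

Keeping the app command as a separate list makes it harder to interleave app arguments with ReplayLab flags by accident.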
5. Inspect And Compare The Report
uv run replaylab report inspect .replaylab/replays/replay_dogfood_mvp/report.json
uv run replaylab report compare \
<child_capsule_id> \
.replaylab/replays/replay_dogfood_mvp/report.json \
--local-store-root .replaylab
report compare exits 0 only when every expected boundary was replayed and there were no blocked, mismatched, extra, missing, or payload-unavailable results.
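The clean-compare rule can be restated as a predicate. This is an illustrative model only: the outcome names and the counts dictionary are hypothetical and do not describe ReplayLab's report schema.

```python
# Outcome categories that make a comparison unclean, per the rule above.
PROBLEM_OUTCOMES = ("blocked", "mismatched", "extra", "missing", "payload_unavailable")


def compare_is_clean(expected_boundaries, outcome_counts):
    """Return True when every expected boundary replayed with no problems.

    `outcome_counts` is a hypothetical mapping of outcome name to count.
    """
    all_replayed = outcome_counts.get("replayed", 0) == expected_boundaries
    has_problem = any(outcome_counts.get(name, 0) > 0 for name in PROBLEM_OUTCOMES)
    return all_replayed and not has_problem
```

Under this model, a report with two expected boundaries, two replayed, and one extra call is not clean, even though every expected boundary replayed.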
If you run deterministic regression replay again after changing the app, compare the old and new regression replay reports:
uv run replaylab report diff \
.replaylab/replays/replay_dogfood_mvp_before/report.json \
.replaylab/replays/replay_dogfood_mvp_after/report.json
report diff exits 0 when the new report is at least as clean as the baseline.
It exits 1 when the new report adds a mismatch, missing call, extra call, blocked call, or payload-unavailable result.
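One way to picture "at least as clean" is a per-category comparison of problem counts. The sketch below is an assumption about how that rule could be modeled; it is not ReplayLab's implementation, and the count dictionaries are hypothetical.

```python
# Problem categories named in the exit-code rule above.
PROBLEM_OUTCOMES = ("blocked", "mismatched", "extra", "missing", "payload_unavailable")


def diff_exit_code(baseline_counts, candidate_counts):
    """Return 1 if the candidate adds any problem outcome, else 0.

    Both arguments are hypothetical mappings of outcome name to count.
    """
    for name in PROBLEM_OUTCOMES:
        if candidate_counts.get(name, 0) > baseline_counts.get(name, 0):
            return 1  # the new report regressed in this category
    return 0  # the new report is at least as clean as the baseline
```

In this model a candidate that fixes one mismatch but introduces a new missing call still returns 1, because one category got worse.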
Optional BYOK AI assistance can explain that deterministic diff without changing its pass/fail result:
uv run replaylab report diff-explain \
.replaylab/replays/replay_dogfood_mvp_before/report.json \
.replaylab/replays/replay_dogfood_mvp_after/report.json \
--dry-run
Use --dry-run first to inspect the model-ready prompt. Real AI calls require
REPLAYLAB_AI_OPENAI_API_KEY or OPENAI_API_KEY.
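A script can check whether either variable is set before dropping --dry-run. In this sketch the lookup order (the ReplayLab-specific variable first) is an assumption taken from the order in which the variables are listed above, and the helper name is hypothetical.

```python
import os

# Variable names from the text above; the precedence order is assumed.
AI_KEY_VARIABLES = ("REPLAYLAB_AI_OPENAI_API_KEY", "OPENAI_API_KEY")


def configured_ai_key_variable(environ=None):
    """Return the name of the first non-empty key variable, or None."""
    env = os.environ if environ is None else environ
    for name in AI_KEY_VARIABLES:
        if env.get(name):
            return name
    return None
```

When this returns None, keep --dry-run on the command so no real AI call is attempted.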
To open that before/after comparison in a browser, use the viewer-first command:
uv run replaylab report view diff \
.replaylab/replays/replay_dogfood_mvp_before/report.json \
.replaylab/replays/replay_dogfood_mvp_after/report.json \
--output replay-diff-viewer.html
report view diff writes the diagnostic file first, then applies the same pass/fail semantics as
report diff, so it exits 1 when the candidate report regresses.
6. Open A Local Viewer
uv run replaylab report view report \
.replaylab/replays/replay_dogfood_mvp/report.json \
--capsule <child_capsule_id> \
--local-store-root .replaylab \
--output replay-viewer.html
The command writes replay-viewer.html and opens it in your default browser.
The file is self-contained and includes the local React viewer shell.
It shows a top-level diagnosis, status, provider labels, replay counts, problem outcomes, request hashes,
payload availability booleans, quick filters, search, a "What to do next" section, copyable command
blocks, and grouped next commands.
The diagnosis text is shared with CLI report inspection, static HTML export, and the local app so
the same failed replay gets the same story across surfaces.
It does not render payload file contents, API keys, raw headers, or source bodies.
For failed reports it groups blocked, mismatched, extra, missing, and payload-unavailable outcomes
and shows expected-vs-actual provider, operation, resource, and request hash fields when available.
If you need the dependency-free fallback, use the static exporter:
uv run replaylab report export-html \
.replaylab/replays/replay_dogfood_mvp/report.json \
--capsule <child_capsule_id> \
--local-store-root .replaylab \
--output replay-report.html
7. Generate A Regression Test
uv run replaylab generate-test <child_capsule_id> \
--output tests/regression/test_dogfood_replay.py \
--fixture-root tests/fixtures/replaylab/capsules \
--app-root . \
--auto-patch-integrations auto \
-- python examples/dogfood_mvp/app.py
The generator copies the capsule fixture and writes a deterministic pytest test that calls replaylab replay.
Generated tests include secret-safe diagnostics that point to the replay report, capsule, summary counts, and non-replayed boundary messages when an assertion fails.
In replaylab app, generation is report-aware. A clean regression replay creates a passing provider
replay guard. A failed or diverged regression replay can create a diagnostic provider replay guard that intentionally
asserts the same failed replay shape, including status, boundary counts, problem outcomes, and
secret-safe hashes/messages. Use diagnostic provider replay guards to keep a known issue reproducible
while fixing it; do not treat them as correctness proof.
8. Run The Generated Test
uv run pytest tests/regression/test_dogfood_replay.py
The generated test should pass without network access.
Do not check generated local replay output under .replaylab/ into source control.
To wire generated guards into GitHub Actions, keep the generated pytest under tests/regression
and its fixture under tests/fixtures/replaylab/capsules, then run pytest over that directory in CI:
name: ReplayLab generated guards
on:
  pull_request:
  push:
    branches: [main]
jobs:
  replaylab-generated-guards:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - name: Install dependencies
        run: uv sync --all-groups
      - name: Run ReplayLab generated guards
        run: uv run pytest tests/regression
Async HTTP
Async httpx.AsyncClient is supported through the same capture and replay commands:
uv run replaylab run \
--project-name async-httpx-app \
--auto-patch-integrations auto \
--capture-payload-policy full \
-- python app.py
Replay uses the same replaylab replay <capsule> -- python app.py flow and serves a ReplayHttpResponse without calling the live async provider.