No-Key Quickstart

This workflow proves the local ReplayLab MVP without network access, real provider packages, or API keys. The app in examples/dogfood_mvp imports local fake OpenAI-style and requests-style modules, then ReplayLab captures and replays those calls through the same provider adapter path used by real applications. The provider support matrix also includes Anthropic Messages, Gemini generate_content, async OpenAI Responses, core OpenAI/Anthropic/Gemini streaming calls, and async httpx.AsyncClient calls; the quickstart keeps the command small by using the OpenAI + requests dogfood app.

For the public-alpha learning path with real providers, start with Tutorials. Those tutorials use startup SDK instrumentation and handle.capture(...) as the primary app integration path. For the shortest PyPI-installed first project with no API keys, use First PyPI Project. This no-key quickstart intentionally uses replaylab run because it is proving the local wrapper path without changing the fake app code.

1. Capture

uv run replaylab run \
  --project-name dogfood-mvp \
  --auto-patch-integrations auto \
  --capture-payload-policy full \
  -- python examples/dogfood_mvp/app.py

This writes a wrapper capsule and a separate child provider capsule under .replaylab/capsules/. The child capsule contains the OpenAI and HTTP boundaries. --auto-patch-integrations auto enables all supported patchers; patchers for providers the app does not import are no-ops.

2. Find The Provider Capsule

uv run replaylab capsule list --local-store-root .replaylab

Pick the capsule whose integrations include openai, requests, and auto_patch. The wrapper capsule usually has no provider boundaries and is not the right replay target.

3. Inspect

uv run replaylab capsule inspect <child_capsule_id> --local-store-root .replaylab

Inspection prints run status, boundary counts, providers, integrations, and payload counts without printing payload file contents.

4. Regression Replay

uv run replaylab replay <child_capsule_id> \
  --local-store-root .replaylab \
  --auto-patch-integrations auto \
  --report-id replay_dogfood_mvp \
  -- python examples/dogfood_mvp/app.py

Regression replay serves recorded provider responses from the capsule. The fake provider modules are still imported, but live provider calls are not made when replay matches the capsule. The replay command also prints a Next steps block covering report inspection, capsule/report comparison, opening the local viewer, and the local app. When ReplayLab does not know a safe app command to show, it points you to the local app and avoids rendering a fake -- <app command> snippet.

For the normal human workflow after you already have a capsule, use the guided command instead of running each follow-up by hand. If you want to browse everything already in .replaylab, start the local app first:

uv run replaylab app --local-store-root .replaylab

replaylab app binds to 127.0.0.1, opens the browser by default, and shows product-level captured runs with attached regression replays, live experiments, generated regressions, and exports. Internal capsules, reports, action history, generated tests, and viewer files remain local storage details. Experiment-created traces are inspectable as live-experiment evidence under the original captured run, not as new primary captured-run baselines.

The app starts with product start points for the latest failed regression replay, latest regression replay, and latest captured run. If the latest regression replay is also the latest failure, the start strip shows it once so the first choice is not duplicated. The selected artifact loads deterministic diagnosis and one-click actions before the raw tables. ReplayLab recovers a run profile from report/capsule metadata when available, detects provider labels from captured boundaries, and exposes actions such as replay, live experiment, compare to capture, generate regression, export evidence, and capture again.

In the product app, Run replay uses the sandboxed replay path with effect controls enforced; older provider-only reports remain inspectable but are labeled as not sandboxed. The normal UI does not ask you to type the app command or choose providers. Fresh captures store local run profile metadata automatically so the app can run a regression replay or explicit live experiment from a captured run. Older captures without that metadata show app-managed replay setup candidates and a Regression replay with detected setup action when ReplayLab can detect a runnable project module.

Captured-run pages use a split-pane trace explorer: the left pane keeps the Agent run, logical steps, LLM/API calls, Tool call rows, and provider protocol evidence visible. When ReplayLab captures a framework tool execution boundary, the provider tool request remains the visible Tool call row because it carries the parameters and tool result; execution evidence and HTTP-client protocol/effect evidence attach to that row instead of appearing as competing top-level provider calls. The right pane shows the selected node's formatted redacted Input/Output previews, assistant messages, tool data, and replay metadata in collapsed debug sections.

The Captured runs page keeps the run list visible, shows each run's regression replay count and latest regression replay status, opens the latest regression replay result directly, and opens the selected run in a right-side inspector. The inspector has an app-owned Local workflow hub plus Trace, Regression replays, Live experiments, Generated regressions, and Upload preview tabs so the original capture, deterministic regression evidence, live behavior exploration, generated guards, exports, and upload-safety checks stay connected. The hub shows the latest result for each stage, the next recommended action, and links back to the evidence after refresh or navigation.

Regression replay result pages lead with the verdict: recorded provider responses were served, live model behavior was not tested, what regression replay verified, why it matters, what it does not prove, and whether the next action is generating a provider replay guard, generating a diagnostic provider replay guard, or inspecting the first issue. Clean replays expose Generate provider replay guard; failed or diverged replays expose Generate diagnostic provider replay guard when ReplayLab can preserve a deterministic failure shape. Live experiments are separate, explicit actions that rerun with current code and live providers after a confirmation warning.

The live experiment workbench derives editable controls from the captured LLM trace: scenario input when ReplayLab can safely map it to the recovered run argument, canonical instructions/system prompt, model suggestions, temperature, top-p, output limits, tool choice, response format, response-mode intent, and tool declarations such as descriptions and parameter schemas. It previews OpenAI, Anthropic, and Gemini provider conversions with compatibility findings; preview-only conversions can be selected for inspection, but the live run stays blocked until ReplayLab can preserve the app call contract. Provider user-message overrides are under advanced prompt controls because the normal path is to change the app scenario input. Tool definitions are presented as readable tool cards and parameter summaries before raw JSON schema editing. Unsupported controls are shown with a reason.

Completed live experiments are saved in the captured run's Live experiments tab with their label, hypothesis, applied variant summary, and comparison verdict. The app returns structured ReplayLab step status, not raw child stdout/stderr. Use --no-open-browser --port 0 for scenarios or headless validation.
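
For headless validation, the two flags mentioned above can be combined with the app command from earlier in this section. The following is a minimal sketch, not an official recipe: start_headless_app is a hypothetical helper name, and the reading of --port 0 as "let the OS pick a free port" is an assumption based on common convention rather than something this page states.

```shell
# Sketch: start the local app without opening a browser.
# `start_headless_app` is a hypothetical helper name, not a ReplayLab command.
start_headless_app() {
  # $1 = local store root (defaults to .replaylab, as used in this quickstart)
  uv run replaylab app \
    --local-store-root "${1:-.replaylab}" \
    --no-open-browser \
    --port 0   # assumption: 0 asks for any free port
}
```

Calling start_headless_app with no argument uses the same .replaylab store as the other commands on this page.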

Generated provider replay guards and diagnostic provider replay guards are saved under the same captured run. Their history shows guard mode, source replay, generated path, hash, byte size, pytest result, and a copyable generated-test location so the file can be opened without searching raw action history. The app also shows whether the guard is only generated, locally runnable, or already detected in GitHub Actions. Use the generated-regression row to copy the local pytest command or a GitHub Actions snippet.

The same product surfaces include an Upload preview action for captured runs and their attached evidence. This is a local-only pre-cloud safety check. It answers what would leave your machine in a future upload by listing inventory rows, byte sizes, hashes, payload/ref counts, redaction notes, excluded local-only material, risks, and cloud feature availability. The preview itself does not upload anything. Use Upload now in the same panel for an explicit manual upload confirmation flow that creates a hosted artifact and private share link. Auto-sync is available as an opt-in workspace setting (default OFF) and syncs everything ReplayLab generates. The upload surfaces do not display payload bodies, raw headers, API keys, raw command argv, source bodies, framework bodies, or raw child stdout/stderr.

For a scriptable dry run, use the CLI fallback:

uv run replaylab cloud preview-upload captured_run <child_capsule_id> \
  --local-store-root .replaylab

Use --format json when automation needs the same secret-safe preview document.

When you want ReplayLab to rerun the app from a script and produce a fresh report/viewer/test chain, use the same workflow backend from the CLI:

uv run replaylab workflow local <child_capsule_id> \
  --local-store-root .replaylab \
  --auto-patch-integrations auto \
  --report-id replay_dogfood_guided \
  --viewer-output replay-viewer.html \
  --generate-test \
  --test-output tests/regression/test_dogfood_replay.py \
  --run-generated-test \
  -- python examples/dogfood_mvp/app.py

workflow local replays the app, compares the report to the capsule, writes and opens the local React viewer, and optionally generates and runs a pytest provider replay guard. The local app calls this same orchestration path; use the lower-level commands below when you want to script or inspect one step at a time.

5. Inspect And Compare The Report

uv run replaylab report inspect .replaylab/replays/replay_dogfood_mvp/report.json

uv run replaylab report compare \
  <child_capsule_id> \
  .replaylab/replays/replay_dogfood_mvp/report.json \
  --local-store-root .replaylab

report compare exits 0 only when every expected boundary was replayed and there were no blocked, mismatched, extra, missing, or payload-unavailable results.
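
Because the exit code carries the verdict, a script can branch on it directly. The sketch below assumes only what is stated above (exit 0 means fully clean); compare_gate and its messages are hypothetical names introduced for illustration.

```shell
# Sketch: gate a script on `replaylab report compare`.
# `compare_gate` is a hypothetical helper, not part of ReplayLab.
compare_gate() {
  # $1 = child provider capsule id, $2 = path to the replay report.json
  if uv run replaylab report compare \
      "$1" \
      "$2" \
      --local-store-root .replaylab; then
    echo "replay clean: every expected boundary was replayed"
  else
    # Any non-zero exit is treated as "not clean" in this sketch.
    echo "replay not clean: inspect the report before relying on it" >&2
    return 1
  fi
}
```

Pass the child capsule ID and .replaylab/replays/replay_dogfood_mvp/report.json as the two arguments to check the replay from step 4.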

If you run deterministic regression replay again after changing the app, compare the old and new regression replay reports:

uv run replaylab report diff \
  .replaylab/replays/replay_dogfood_mvp_before/report.json \
  .replaylab/replays/replay_dogfood_mvp_after/report.json

report diff exits 0 when the new report is at least as clean as the baseline. It exits 1 when the new report adds a mismatch, missing call, extra call, blocked call, or payload-unavailable result.
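
The two documented exit codes map naturally onto a small wrapper. This is a sketch under stated assumptions: diff_gate is a hypothetical helper, and treating any exit code other than 0 or 1 as a tool error is an assumption, since this page only documents 0 and 1.

```shell
# Sketch: turn the `report diff` exit code into a readable verdict.
# `diff_gate` is a hypothetical helper, not part of ReplayLab.
diff_gate() {
  # $1 = baseline report.json, $2 = candidate report.json
  status=0
  uv run replaylab report diff "$1" "$2" || status=$?
  case "$status" in
    0) echo "no regression: candidate is at least as clean as the baseline" ;;
    1) echo "regression: candidate adds a problem outcome" ;;
    *) echo "unexpected exit code $status (assumed to be a tool error)" ;;
  esac
  # Propagate the original exit code so callers and CI can still gate on it.
  return "$status"
}
```

The wrapper returns the same exit code it received, so it can replace the bare command in a script without changing pass/fail behavior.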

Optional bring-your-own-key (BYOK) AI assistance can explain that deterministic diff without changing its pass/fail result:

uv run replaylab report diff-explain \
  .replaylab/replays/replay_dogfood_mvp_before/report.json \
  .replaylab/replays/replay_dogfood_mvp_after/report.json \
  --dry-run

Use --dry-run first to inspect the model-ready prompt. Real AI calls require REPLAYLAB_AI_OPENAI_API_KEY or OPENAI_API_KEY.
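
One way to follow that advice in a script is to always run the dry run and only attempt the real call when a key is present. The sketch below is illustrative: ai_key_present and explain_diff are hypothetical helpers, the "live" argument and the exit code 2 are conventions of this sketch, and it assumes that omitting --dry-run is what triggers the real AI call.

```shell
# Sketch: dry-run first, real call only on request and only with a key.
# `ai_key_present` and `explain_diff` are hypothetical helpers.
ai_key_present() {
  # True when either documented key variable is set and non-empty.
  [ -n "${REPLAYLAB_AI_OPENAI_API_KEY:-}" ] || [ -n "${OPENAI_API_KEY:-}" ]
}

explain_diff() {
  # $1 = baseline report.json, $2 = candidate report.json, $3 = optional "live"
  # Always show the model-ready prompt first.
  uv run replaylab report diff-explain "$1" "$2" --dry-run || return $?
  if [ "${3:-}" = "live" ]; then
    if ai_key_present; then
      # Assumption: without --dry-run the command makes the real AI call.
      uv run replaylab report diff-explain "$1" "$2"
    else
      echo "no API key set; skipping the real AI call" >&2
      return 2
    fi
  fi
}
```

Without the third argument the helper never goes beyond the dry run, which keeps the no-key path of this quickstart intact.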

To open that before/after comparison in a browser, use the viewer-first command:

uv run replaylab report view diff \
  .replaylab/replays/replay_dogfood_mvp_before/report.json \
  .replaylab/replays/replay_dogfood_mvp_after/report.json \
  --output replay-diff-viewer.html

view diff writes the diagnostic file first. It uses the same pass/fail semantics as report diff, so it exits 1 when the candidate report regresses.

6. Open A Local Viewer

uv run replaylab report view report \
  .replaylab/replays/replay_dogfood_mvp/report.json \
  --capsule <child_capsule_id> \
  --local-store-root .replaylab \
  --output replay-viewer.html

The command writes replay-viewer.html and opens it in your default browser. The file is self-contained and includes the local React viewer shell. It shows a top-level diagnosis, status, provider labels, replay counts, problem outcomes, request hashes, payload availability booleans, quick filters, search, a "What to do next" section, copyable command blocks, and grouped next commands. The diagnosis text is shared with CLI report inspection, static HTML export, and the local app so the same failed replay gets the same story across surfaces. It does not render payload file contents, API keys, raw headers, or source bodies. For failed reports it groups blocked, mismatched, extra, missing, and payload-unavailable outcomes and shows expected-vs-actual provider, operation, resource, and request hash fields when available.

If you need the dependency-free fallback, use the static exporter:

uv run replaylab report export-html \
  .replaylab/replays/replay_dogfood_mvp/report.json \
  --capsule <child_capsule_id> \
  --local-store-root .replaylab \
  --output replay-report.html

7. Generate A Regression Test

uv run replaylab generate-test <child_capsule_id> \
  --output tests/regression/test_dogfood_replay.py \
  --fixture-root tests/fixtures/replaylab/capsules \
  --app-root . \
  --auto-patch-integrations auto \
  -- python examples/dogfood_mvp/app.py

The generator copies the capsule fixture and writes a deterministic pytest test that calls replaylab replay. Generated tests include secret-safe diagnostics that point to the replay report, capsule, summary counts, and non-replayed boundary messages when an assertion fails.

In replaylab app, generation is report-aware. A clean regression replay creates a passing provider replay guard. A failed or diverged regression replay can create a diagnostic provider replay guard that intentionally asserts the same failed replay shape, including status, boundary counts, problem outcomes, and secret-safe hashes/messages. Use diagnostic provider replay guards to keep a known issue reproducible while fixing it; do not treat them as correctness proof.

8. Run The Generated Test

uv run pytest tests/regression/test_dogfood_replay.py

The generated test should pass without network access. Do not check generated local replay output under .replaylab/ into source control.
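
A simple way to keep .replaylab/ out of source control is a .gitignore entry. The helper below is a minimal sketch assuming a Git repository; ignore_replaylab_store is a hypothetical name. The pattern starts with a dot, so it does not match the committed fixtures under tests/fixtures/replaylab/capsules.

```shell
# Sketch: add `.replaylab/` to .gitignore exactly once.
# `ignore_replaylab_store` is a hypothetical helper, not part of ReplayLab.
ignore_replaylab_store() {
  # $1 = repository root (defaults to the current directory)
  gitignore="${1:-.}/.gitignore"
  touch "$gitignore"
  # -x matches whole lines, -F treats the pattern as a fixed string.
  if ! grep -qxF '.replaylab/' "$gitignore"; then
    # If the file is non-empty and lacks a trailing newline, add one first
    # so the new entry does not merge with the last existing line.
    if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
      printf '\n' >> "$gitignore"
    fi
    printf '%s\n' '.replaylab/' >> "$gitignore"
  fi
}
```

Running the helper repeatedly is safe because it checks for the entry before appending.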

To wire generated guards into GitHub Actions, keep the generated pytest under tests/regression and its fixture under tests/fixtures/replaylab/capsules, then run the same command in CI:

name: ReplayLab generated guards

on:
  pull_request:
  push:
    branches: [main]

jobs:
  replaylab-generated-guards:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - name: Install dependencies
        run: uv sync --all-groups
      - name: Run ReplayLab generated guards
        run: uv run pytest tests/regression

Async HTTP

Async httpx.AsyncClient is supported through the same capture and replay commands:

uv run replaylab run \
  --project-name async-httpx-app \
  --auto-patch-integrations auto \
  --capture-payload-policy full \
  -- python app.py

Replay uses the same replaylab replay <capsule> -- python app.py flow and serves a ReplayHttpResponse without calling the live async provider.