AI-Assisted Diagnosis

ReplayLab's AI assistance is optional. It uses your own OpenAI-compatible API key to explain deterministic ReplayLab artifacts and to suggest where instrumentation should go in an app.

AI output does not decide whether replay succeeded. Capsules, replay reports, comparisons, and scenario results remain the source of truth.

What You Will Do

In this tutorial you will:

  1. capture and replay a normal ReplayLab workflow
  2. ask ReplayLab to explain the replay report from secret-safe summaries
  3. ask ReplayLab to explain a baseline-vs-candidate replay report diff
  4. ask ReplayLab to plan instrumentation from Python AST structure
  5. run all commands in dry-run mode when you do not want to call a model

ReplayLab does not send payload file contents, source file bodies, API keys, or raw secret values by default.

Setup

Install ReplayLab from the current checkout:

uv sync --all-packages --all-groups

For a real AI call, set one of these environment variables:

export REPLAYLAB_AI_OPENAI_API_KEY="..."
# or
export OPENAI_API_KEY="..."

Optional settings:

export REPLAYLAB_AI_MODEL="gpt-5-mini"
export REPLAYLAB_AI_API_BASE_URL="https://api.openai.com/v1"
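
The exact precedence and defaults are internal details, but a minimal Python sketch of one plausible resolution order can help when debugging your environment; treat the fallback chain and default values below as assumptions, not ReplayLab internals:

import os

# Assumption: the ReplayLab-specific key wins over the generic OpenAI key,
# and the model/base URL fall back to the values shown in this tutorial.
api_key = os.environ.get("REPLAYLAB_AI_OPENAI_API_KEY") or os.environ.get("OPENAI_API_KEY")
model = os.environ.get("REPLAYLAB_AI_MODEL", "gpt-5-mini")
base_url = os.environ.get("REPLAYLAB_AI_API_BASE_URL", "https://api.openai.com/v1")

if api_key is None:
    print("No AI key set; stick to --dry-run for prompt inspection.")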

Use --dry-run when you want to inspect the prompt bundle without calling the model.

Capture And Replay First

Start from a full-payload capsule and replay report. The no-key dogfood flow is a safe way to create one:

uv run replaylab run \
  --project-name dogfood-mvp \
  --auto-patch-integrations auto \
  --capture-payload-policy full \
  -- python examples/dogfood_mvp/app.py

uv run replaylab capsule list --local-store-root .replaylab

Use the child provider capsule ID from the list output:

uv run replaylab replay <child_capsule_id> \
  --local-store-root .replaylab \
  --auto-patch-integrations auto \
  --report-id replay_dogfood_ai \
  -- python examples/dogfood_mvp/app.py

uv run replaylab report compare \
  <child_capsule_id> \
  .replaylab/replays/replay_dogfood_ai/report.json \
  --local-store-root .replaylab

Expected compare output:

ReplayLab replay comparison
Status: succeeded
Boundaries: expected=2, replayed=2, problems=0

This matters because the AI explanation should start from a deterministic report that already says what happened.
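
If you script this loop, you can gate the AI step on the deterministic result. The sketch below is one way to do that with the commands above; it assumes report compare exits non-zero on failure, so verify that convention before relying on it, and pass your real child capsule ID as the argument:

import subprocess
import sys

capsule_id = sys.argv[1]  # child provider capsule ID from `replaylab capsule list`
report_path = ".replaylab/replays/replay_dogfood_ai/report.json"

# Replay the capsule against the app, then compare against expectations.
subprocess.run(
    ["uv", "run", "replaylab", "replay", capsule_id,
     "--local-store-root", ".replaylab",
     "--auto-patch-integrations", "auto",
     "--report-id", "replay_dogfood_ai",
     "--", "python", "examples/dogfood_mvp/app.py"],
    check=True,
)

# Assumption: a failed comparison exits non-zero, so check=True stops the
# script before any AI explanation is requested.
subprocess.run(
    ["uv", "run", "replaylab", "report", "compare", capsule_id, report_path,
     "--local-store-root", ".replaylab"],
    check=True,
)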

Explain A Replay Report

Run a dry run first:

uv run replaylab report explain \
  .replaylab/replays/replay_dogfood_ai/report.json \
  --capsule <child_capsule_id> \
  --local-store-root .replaylab \
  --dry-run

You should see:

ReplayLab AI replay explanation
Provider: openai
Model: gpt-5-mini
Dry run: no AI provider call was made.
System prompt:
...
User prompt:
...

The user prompt is model-ready JSON derived from capsule inspect, report inspect, and report compare. It includes IDs, providers, operations, resource names, outcomes, messages, request hashes, counts, and problem indexes. It does not include payload file contents.
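
For orientation only, the summary carried in that user prompt has the kind of structure sketched below; every field name here is hypothetical and only echoes the categories listed above, so do not treat it as ReplayLab's actual schema:

# Hypothetical shape, loosely based on the fields described above;
# ReplayLab's real prompt JSON may differ.
prompt_summary_example = {
    "capsule": {"id": "<child_capsule_id>", "provider": "openai"},
    "report": {"id": "replay_dogfood_ai", "status": "succeeded"},
    "comparison": {
        "status": "succeeded",
        "boundary_counts": {"expected": 2, "replayed": 2},
        "problem_indexes": [],
    },
    "boundaries": [
        {
            "operation": "<operation>",
            "resource": "<resource_name>",
            "outcome": "succeeded",
            "message": "<message>",
            "request_hash": "<request_hash>",
        },
    ],
}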

When you are ready to call the model:

uv run replaylab report explain \
  .replaylab/replays/replay_dogfood_ai/report.json \
  --capsule <child_capsule_id> \
  --local-store-root .replaylab \
  --output /tmp/replaylab-ai-explanation.json

Expected output:

ReplayLab AI replay explanation
Provider: openai
Model: gpt-5-mini
Capsule: ...
Report: replay_dogfood_ai
Report status: succeeded
Comparison status: succeeded
Prompt hash: ...
...

The JSON output file contains the typed explanation model and the prompt hash so you can review what was sent without reading raw payloads.
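
If you want to skim that file programmatically, a minimal sketch like this is enough; the prompt_hash and explanation keys are assumptions about the layout, so adjust them to whatever the file actually contains:

import json

with open("/tmp/replaylab-ai-explanation.json") as fh:
    payload = json.load(fh)

# Field names are assumptions for illustration; inspect the file to see the
# actual layout of the typed explanation model.
print("prompt hash:", payload.get("prompt_hash"))
print(json.dumps(payload.get("explanation"), indent=2))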

Explain A Replay Report Diff

Run the deterministic diff first:

uv run replaylab report diff \
  .replaylab/replays/replay_dogfood_mvp_before/report.json \
  .replaylab/replays/replay_dogfood_mvp_after/report.json

report diff is authoritative. It exits 0 when the candidate report is at least as clean as the baseline, and 1 when the candidate introduces a new problem outcome or the reports are incompatible.
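
Because the verdict is encoded in the exit code, you can gate a CI job on it directly; a minimal sketch:

import subprocess
import sys

# report diff exits 0 when the candidate is at least as clean as the baseline,
# and 1 when it introduces a new problem outcome or the reports are incompatible.
result = subprocess.run([
    "uv", "run", "replaylab", "report", "diff",
    ".replaylab/replays/replay_dogfood_mvp_before/report.json",
    ".replaylab/replays/replay_dogfood_mvp_after/report.json",
])

if result.returncode != 0:
    print("Candidate report regressed against the baseline.")
    sys.exit(result.returncode)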

When you want a plain-language explanation of that deterministic diff, run a dry run:

uv run replaylab report diff-explain \
  .replaylab/replays/replay_dogfood_mvp_before/report.json \
  .replaylab/replays/replay_dogfood_mvp_after/report.json \
  --dry-run

You should see:

ReplayLab AI replay diff explanation
Provider: openai
Model: gpt-5-mini
Diff status: ...
Prompt hash: ...
Dry run: no AI provider call was made.

The prompt summary is built from report inspect for both reports and report diff for the deterministic comparison. It includes statuses, IDs, counts, changed indexes, change kinds, outcomes, messages, providers, operations, resource names, and request hashes where those are already part of the report summaries. It does not include payload file contents, raw headers, API keys, or source bodies.

When you are ready for advisory output:

uv run replaylab report diff-explain \
  .replaylab/replays/replay_dogfood_mvp_before/report.json \
  .replaylab/replays/replay_dogfood_mvp_after/report.json \
  --output /tmp/replaylab-ai-diff-explanation.json

Use the explanation to decide what to inspect next. The deterministic diff still decides whether the candidate improved or regressed.

Plan Instrumentation

Run the planner against a Python app root:

uv run replaylab ai plan-instrumentation \
  --app-root . \
  --providers openai,requests,httpx \
  --dry-run

Expected output:

ReplayLab AI instrumentation plan
Provider: openai
Model: gpt-5-mini
App root: ...
Requested providers: openai, requests, httpx
Findings: ...
Dry run: no AI provider call was made.

ReplayLab scans Python files via AST parsing and sends structural facts by default (see the sketch after this list):

  • provider imports such as openai, requests, and httpx
  • FastAPI or Starlette app candidates
  • job-like functions or decorators
  • existing ReplayLab usage

It excludes .git, .venv, .replaylab, dist, caches, generated ReplayLab fixtures, and source file bodies.
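
To make those structural facts concrete, here is a minimal standalone sketch of the kind of AST scan described above. It is not ReplayLab's implementation; it only illustrates collecting provider imports without ever putting source bodies into a prompt:

import ast
from pathlib import Path

PROVIDERS = {"openai", "requests", "httpx"}
EXCLUDED = {".git", ".venv", ".replaylab", "dist", "__pycache__"}


def provider_imports(app_root: str) -> dict[str, list[str]]:
    """Map each Python file to the provider modules it imports."""
    findings: dict[str, list[str]] = {}
    for path in Path(app_root).rglob("*.py"):
        if any(part in EXCLUDED for part in path.parts):
            continue
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue
        hits = []
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                hits += [a.name for a in node.names if a.name.split(".")[0] in PROVIDERS]
            elif isinstance(node, ast.ImportFrom) and node.module:
                if node.module.split(".")[0] in PROVIDERS:
                    hits.append(node.module)
        if hits:
            findings[str(path)] = sorted(set(hits))
    return findings


print(provider_imports("."))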

When you are ready for advisory output:

uv run replaylab ai plan-instrumentation \
  --app-root . \
  --providers openai,requests,httpx \
  --output /tmp/replaylab-instrumentation-plan.json

Use the plan as a review aid. The next step is still to make a normal code change, update docs, and validate with scenarios.

Validate Without External AI

Maintainers can run the loopback scenario:

python scripts/run_scenario.py run ai-diagnosis-loopback --keep-workspace

Expected ending:

ReplayLab scenario passed.
Scenario: ai-diagnosis-loopback
Tier: loopback
Boundaries: 1
Providers: requests

This proves that report explanation, report-diff explanation, and instrumentation planning all work against an OpenAI Responses-compatible local endpoint, and that no real external AI call is required for CI-style validation.
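
If you want a feel for what "Responses-compatible local endpoint" means outside the maintainer scenario, a toy stub along these lines is usually enough to exercise a client; the route and response shape are a rough approximation of the OpenAI Responses API, not ReplayLab's actual loopback fixture:

from fastapi import FastAPI

app = FastAPI()


@app.post("/v1/responses")
def fake_responses(body: dict):
    # Rough approximation of a Responses API reply; real servers return more fields.
    return {
        "id": "resp_loopback",
        "object": "response",
        "model": body.get("model", "loopback"),
        "output": [
            {
                "type": "message",
                "role": "assistant",
                "content": [{"type": "output_text", "text": "loopback explanation"}],
            }
        ],
    }

# Run with, for example: uvicorn loopback_stub:app --port 8080
# then point REPLAYLAB_AI_API_BASE_URL at http://127.0.0.1:8080/v1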