Tutorial: Capture And Replay Gemini Generate Content

This tutorial captures a real Google Gen AI Gemini generate_content call from normal Python app startup, replays it locally, compares the replay report, and generates a pytest regression test.

ReplayLab's Gemini support is native provider-SDK support. It wraps Client().models.generate_content(...), Client().aio.models.generate_content(...), and models.generate_content_stream(...) for sync and async clients. Multimodal file/image/video flows, Gemini Live, and the broader Vertex-specific feature set are not supported in this slice.
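ReplayLab's actual patching is internal to the library, but the general technique behind an auto-patch integration is worth seeing once: replace the SDK method with a wrapper that records the call before delegating. The sketch below is illustrative only; FakeModels, recorded_calls, and patch_generate_content are hypothetical names, not ReplayLab or google-genai APIs.

```python
# Illustrative only: a toy sketch of the wrapping technique an
# auto-patch integration relies on. All names here are hypothetical.

recorded_calls = []


class FakeModels:
    """Stand-in for an SDK's models namespace (hypothetical)."""

    def generate_content(self, *, model, contents):
        return f"response for {contents!r} from {model}"


def patch_generate_content(models):
    original = models.generate_content

    def wrapper(*, model, contents):
        # Record the request, then delegate to the original method.
        recorded_calls.append({"model": model, "contents": contents})
        return original(model=model, contents=contents)

    models.generate_content = wrapper


models = FakeModels()
patch_generate_content(models)
out = models.generate_content(model="gemini-2.5-flash", contents="hi")
```

Because the wrapper keeps the original's keyword-only signature and return value, calling code does not need to change, which is why instrumentation can stay at startup while provider code stays normal.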

Setup

Install the packages and set GEMINI_API_KEY. Do not paste secret values into tutorial files, notebooks, docs, or terminal output.

python -m venv .venv
source .venv/bin/activate
pip install replaylab google-genai
export GEMINI_API_KEY="..."

For local repo development before package publication:

uv sync --all-packages --all-groups
uv pip install google-genai
export GEMINI_API_KEY="..."

Startup Instrumentation

Create tutorial_gemini_app.py. The app keeps normal Google Gen AI provider code. ReplayLab setup happens near startup, before the Gemini client is constructed.

import os

import replaylab
from google import genai
from replaylab import CapturePayloadPolicy

MODEL = os.environ.get("REPLAYLAB_TUTORIAL_GEMINI_MODEL", "gemini-2.5-flash")


def call_model() -> str:
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=MODEL,
        contents="Explain why deterministic replay helps agent regression tests.",
    )
    return response.text or ""


def call_model_streaming() -> str:
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    chunks = client.models.generate_content_stream(
        model=MODEL,
        contents="Explain why deterministic replay helps agent regression tests.",
    )
    return "".join(chunk.text or "" for chunk in chunks)


def main() -> None:
    handle = replaylab.init(
        project_name="tutorial-gemini",
        auto_patch_integrations=("gemini",),
        capture_payload_policy=CapturePayloadPolicy.FULL,
    )

    with handle.capture(
        "gemini_generate_content_tutorial",
        labels=("tutorial", "gemini"),
        runtime_metadata={"gemini.model": MODEL},
    ) as capture:
        print(call_model())

    if capture.capsule is not None:
        print(f"ReplayLab capsule: {capture.capsule.capsule_path}")


if __name__ == "__main__":
    main()

Capture Scope

Capture uses your normal app command. There is no ReplayLab wrapper process in this production-style path.

uv run python tutorial_gemini_app.py

What good looks like:

<one response from Gemini>
ReplayLab capsule: .replaylab/capsules/<capsule_id>

Replay

Replay is local regression tooling. It runs the same application command under replaylab replay; when the request matches the capsule, ReplayLab serves the recorded Gemini response instead of calling the live provider.
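ReplayLab's matching rules are not specified in this tutorial; a common way a replay layer decides whether a live request matches a recorded boundary is to hash a canonical form of the request and look it up in the capsule. The sketch below illustrates that idea with stdlib code only; request_key, capsule, and replay_or_fail are hypothetical names.

```python
# Illustrative only: a minimal request-matching sketch, not ReplayLab's
# real matching logic. All names here are hypothetical.
import hashlib
import json


def request_key(model: str, contents: str) -> str:
    # Canonicalize the request so logically equal requests hash equally.
    canonical = json.dumps(
        {"model": model, "contents": contents},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode()).hexdigest()


# Recorded during capture (hypothetical capsule entry):
capsule = {
    request_key("gemini-2.5-flash", "Explain why..."): "recorded response text",
}


def replay_or_fail(model: str, contents: str) -> str:
    key = request_key(model, contents)
    if key not in capsule:
        raise LookupError("request does not match any recorded boundary")
    return capsule[key]
```

Under this scheme a replay run never reaches the live provider: a matching request is served from the capsule, and a non-matching one fails loudly, which is exactly the behavior a regression harness wants.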

uv run replaylab replay <capsule_id> \
  --local-store-root .replaylab \
  --auto-patch-integrations gemini \
  --report-id replay_tutorial_gemini \
  -- python tutorial_gemini_app.py

Compare

uv run replaylab report compare \
  <capsule_id> \
  .replaylab/replays/replay_tutorial_gemini/report.json \
  --local-store-root .replaylab

What good looks like:

Status: succeeded
Expected boundaries: 1
Replayed: 1
Problems: 0

Generate-Test

uv run replaylab generate-test <capsule_id> \
  --output tests/regression/test_tutorial_gemini_replay.py \
  --fixture-root tests/fixtures/replaylab/capsules \
  --app-root . \
  --auto-patch-integrations gemini \
  -- python tutorial_gemini_app.py

Run the generated test:

uv run pytest tests/regression/test_tutorial_gemini_replay.py

The generated test uses replaylab replay, asserts the replay report, and avoids a live Gemini call.
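The exact contents of the generated file come from replaylab generate-test and are not shown here. As a rough sketch of the kind of assertion such a test makes, the code below checks a replay-report dict whose field names are assumptions inferred from the compare output above ("Status", "Expected boundaries", "Replayed", "Problems"), not ReplayLab's real schema.

```python
# Hypothetical sketch: field names are inferred from the compare output
# in this tutorial, not taken from ReplayLab's actual report schema.


def assert_replay_report_clean(report: dict) -> None:
    assert report["status"] == "succeeded"
    assert report["replayed"] == report["expected_boundaries"]
    assert report["problems"] == 0


sample = {
    "status": "succeeded",
    "expected_boundaries": 1,
    "replayed": 1,
    "problems": 0,
}
assert_replay_report_clean(sample)
```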

Local App

Start the app after capture or replay:

uv run replaylab app --local-store-root .replaylab

Open the captured run. The trace should show one Gemini LLM call with a provider chip labeled gemini and formatted input/output previews, plus a streaming chip when the call used a stream. After replay it should show attached regression replay evidence, and after test generation it should show generated guard evidence.

Maintainer Loopback Scenario

Maintainers can validate the same loop without a Gemini API key:

python scripts/run_scenario.py run gemini-local --keep-workspace
python scripts/run_scenario.py run gemini-streaming-local --keep-workspace

These scenarios use the real google-genai SDK against a local fake Gemini generateContent server, then capture, replay, compare, export, generate a guard, run pytest, and check the local app trace shape. The streaming variant fully consumes deterministic Gemini chunks and replays the recorded chunk sequence.
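Replaying a recorded chunk sequence deterministically is simpler than it sounds. The sketch below shows the general idea with a plain generator; the chunk shape and the recorded_chunks list are hypothetical, not the scenario's real data.

```python
# Illustrative only: a minimal sketch of deterministic stream replay.
# The recorded chunks here are hypothetical sample data.
from typing import Iterator

recorded_chunks = ["Deterministic ", "replay ", "is stable."]


def replay_stream() -> Iterator[str]:
    # Yield chunks in the exact recorded order, with no network call.
    yield from recorded_chunks


text = "".join(replay_stream())
```

Because the generator yields the same chunks in the same order on every run, code that joins or incrementally renders the stream behaves identically under replay and under the original capture.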