Tutorial: Capture And Replay Gemini Generate Content
This tutorial captures a real Google Gen AI Gemini generate_content call from normal
Python app startup, replays it locally, compares the replay report, and generates a pytest
regression.
ReplayLab's Gemini support is native provider-SDK support. It wraps
Client().models.generate_content(...), Client().aio.models.generate_content(...), and
models.generate_content_stream(...) for sync and async clients. Multimodal file/image/video
flows, Gemini Live, and Vertex-specific product breadth are not supported in this slice.
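To make the idea of wrapping an SDK method concrete, here is a minimal sketch of the general technique. It is not ReplayLab's implementation: `FakeModels`, `wrap_with_recorder`, and the shape of the recorded entries are hypothetical names invented for illustration, and the stand-in class only imitates the keyword arguments used later in this tutorial.

```python
import functools
from typing import Any, Callable


class FakeModels:
    """Hypothetical stand-in for a provider SDK's models namespace."""

    def generate_content(self, *, model: str, contents: str) -> str:
        return f"[{model}] echo: {contents}"


def wrap_with_recorder(target: Any, method_name: str, log: list[dict[str, Any]]) -> None:
    """Replace target.<method_name> with a wrapper that records every call."""
    original: Callable[..., Any] = getattr(target, method_name)

    @functools.wraps(original)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = original(*args, **kwargs)
        # Keep the request arguments and the result so they could be replayed later.
        log.append({"method": method_name, "kwargs": kwargs, "result": result})
        return result

    setattr(target, method_name, wrapper)


calls: list[dict[str, Any]] = []
models = FakeModels()
wrap_with_recorder(models, "generate_content", calls)
reply = models.generate_content(model="demo-model", contents="hello")
```

The calling code is unchanged after wrapping, which is why the application in this tutorial can keep its normal provider code.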
Setup
Install the packages and set GEMINI_API_KEY.
Do not paste secret values into tutorial files, notebooks, docs, or terminal output.
python -m venv .venv
source .venv/bin/activate
pip install replaylab google-genai
export GEMINI_API_KEY="..."
For local repo development before package publication:
uv sync --all-packages --all-groups
uv pip install google-genai
export GEMINI_API_KEY="..."
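One way to follow the advice about secrets is to read the key from the environment and only ever report whether it is present. The helpers below are a hypothetical sketch (`require_api_key` and `describe_key` are not part of ReplayLab or google-genai); they fail fast when the variable is missing and never print the value itself.

```python
import os


def require_api_key(name: str = "GEMINI_API_KEY") -> str:
    """Return the key from the environment, or fail with a message that omits it."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it in your shell first.")
    return value


def describe_key(name: str = "GEMINI_API_KEY") -> str:
    """Build a log-safe status line that reveals only the key's length."""
    value = os.environ.get(name, "")
    if not value:
        return f"{name} is not set"
    return f"{name} is set ({len(value)} characters)"
```

Logging the output of `describe_key()` instead of the variable keeps the secret out of terminal output and notebooks.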
Startup Instrumentation
Create tutorial_gemini_app.py.
The app keeps normal Google Gen AI provider code.
ReplayLab setup happens near startup, before the Gemini client is constructed.
import os

import replaylab
from google import genai
from replaylab import CapturePayloadPolicy

MODEL = os.environ.get("REPLAYLAB_TUTORIAL_GEMINI_MODEL", "gemini-2.5-flash")


def call_model() -> str:
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=MODEL,
        contents="Explain why deterministic replay helps agent regression tests.",
    )
    return response.text or ""


def call_model_streaming() -> str:
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    chunks = client.models.generate_content_stream(
        model=MODEL,
        contents="Explain why deterministic replay helps agent regression tests.",
    )
    return "".join(chunk.text or "" for chunk in chunks)


def main() -> None:
    # Initialize ReplayLab first so the Gemini client is constructed after setup.
    handle = replaylab.init(
        project_name="tutorial-gemini",
        auto_patch_integrations=("gemini",),
        capture_payload_policy=CapturePayloadPolicy.FULL,
    )
    with handle.capture(
        "gemini_generate_content_tutorial",
        labels=("tutorial", "gemini"),
        runtime_metadata={"gemini.model": MODEL},
    ) as capture:
        print(call_model())

    if capture.capsule is not None:
        print(f"ReplayLab capsule: {capture.capsule.capsule_path}")


if __name__ == "__main__":
    main()
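The streaming helper in the app joins `chunk.text or ""` rather than `chunk.text` directly. The sketch below shows why that guard is useful, using a hypothetical `FakeChunk` class in place of the SDK's chunk objects and assuming, as the app code itself does, that a chunk's text may be empty or `None`.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in for one streamed response chunk."""

    text: Optional[str]


def join_chunks(chunks: Iterable[FakeChunk]) -> str:
    """Concatenate chunk text, treating a missing text value as an empty string."""
    return "".join(chunk.text or "" for chunk in chunks)


def fake_stream() -> Iterator[FakeChunk]:
    yield FakeChunk("Deterministic ")
    yield FakeChunk(None)  # a chunk without text must not break the join
    yield FakeChunk("replay.")


joined = join_chunks(fake_stream())
```

Without the `or ""` fallback, `"".join(...)` would raise a `TypeError` on the chunk whose text is `None`.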
Capture Scope
Capture uses your normal app command. There is no ReplayLab wrapper process in this production-style path.
uv run python tutorial_gemini_app.py
What good looks like:
<one response from Gemini>
ReplayLab capsule: .replaylab/capsules/<capsule_id>
Replay
Replay is local regression tooling.
It runs the same application command under replaylab replay; when the request matches the capsule,
ReplayLab serves the recorded Gemini response instead of calling the live provider.
uv run replaylab replay <capsule_id> \
  --local-store-root .replaylab \
  --auto-patch-integrations gemini \
  --report-id replay_tutorial_gemini \
  -- python tutorial_gemini_app.py
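The matching step described above can be pictured as a lookup keyed on a canonical form of the request. The following is a simplified sketch of that idea only; ReplayLab's actual matching rules and capsule format are not documented here, and `request_key` and `RecordedResponses` are hypothetical names.

```python
import hashlib
import json
from typing import Any


def request_key(request: dict[str, Any]) -> str:
    """Hash a canonical JSON form so key order does not affect matching."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RecordedResponses:
    """Hypothetical in-memory store mapping request hashes to recorded responses."""

    def __init__(self) -> None:
        self._by_key: dict[str, Any] = {}

    def record(self, request: dict[str, Any], response: Any) -> None:
        self._by_key[request_key(request)] = response

    def replay(self, request: dict[str, Any]) -> Any:
        key = request_key(request)
        if key not in self._by_key:
            # A mismatch is surfaced instead of silently calling the live provider.
            raise LookupError("request does not match any recorded call")
        return self._by_key[key]


store = RecordedResponses()
store.record({"model": "demo-model", "contents": "hi"}, "recorded reply")
served = store.replay({"contents": "hi", "model": "demo-model"})
```

In this sketch a request that differs in any field raises `LookupError`, which mirrors the idea that only matching requests are served from the recording.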
Compare
uv run replaylab report compare \
  <capsule_id> \
  .replaylab/replays/replay_tutorial_gemini/report.json \
  --local-store-root .replaylab
What good looks like:
Status: succeeded
Expected boundaries: 1
Replayed: 1
Problems: 0
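If you want to check this summary in a script, one option is to parse the printed lines. The sketch below assumes the output consists of `Key: value` lines exactly as shown above; `parse_summary` and `is_clean` are hypothetical helpers, and the real CLI output may contain more fields.

```python
from typing import Dict


def parse_summary(text: str) -> Dict[str, str]:
    """Turn 'Key: value' lines into a dictionary, skipping lines without a colon."""
    summary: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            summary[key.strip()] = value.strip()
    return summary


def is_clean(summary: Dict[str, str]) -> bool:
    """True when the run succeeded, every expected boundary replayed, and no problems."""
    expected = summary.get("Expected boundaries")
    replayed = summary.get("Replayed")
    return (
        summary.get("Status") == "succeeded"
        and expected is not None
        and expected == replayed
        and summary.get("Problems") == "0"
    )


SAMPLE = """Status: succeeded
Expected boundaries: 1
Replayed: 1
Problems: 0"""

clean = is_clean(parse_summary(SAMPLE))
```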
Generate-Test
uv run replaylab generate-test <capsule_id> \
  --output tests/regression/test_tutorial_gemini_replay.py \
  --fixture-root tests/fixtures/replaylab/capsules \
  --app-root . \
  --auto-patch-integrations gemini \
  -- python tutorial_gemini_app.py
Run the generated test:
uv run pytest tests/regression/test_tutorial_gemini_replay.py
The generated test uses replaylab replay, asserts the replay report, and avoids a live Gemini
call.
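The contents of the generated test are not reproduced in this tutorial. As a rough illustration of the kind of command such a test would need to run, the sketch below assembles the replay invocation from the flags used in the Replay section. `build_replay_command` is a hypothetical helper, not something ReplayLab generates.

```python
import shlex
from typing import List, Sequence


def build_replay_command(
    capsule_id: str,
    report_id: str,
    app_command: Sequence[str],
    store_root: str = ".replaylab",
    integrations: str = "gemini",
) -> List[str]:
    """Assemble the replay argv using only flags shown earlier in this tutorial."""
    return [
        "uv", "run", "replaylab", "replay", capsule_id,
        "--local-store-root", store_root,
        "--auto-patch-integrations", integrations,
        "--report-id", report_id,
        "--", *app_command,
    ]


command = build_replay_command(
    "example_capsule",  # placeholder; use the capsule id printed by your capture run
    "replay_tutorial_gemini",
    ["python", "tutorial_gemini_app.py"],
)
printable = shlex.join(command)
```

Passing the list form to `subprocess.run` avoids shell quoting issues, while `printable` is convenient for logs.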
Local App
Start the app after capture or replay:
uv run replaylab app --local-store-root .replaylab
Open the captured run. The trace should show one Gemini LLM call, a provider chip labeled
gemini, formatted input/output previews, a streaming chip when the call used a stream, attached
regression replay evidence after replay, and generated guard evidence after generation.
Maintainer Loopback Scenario
Maintainers can validate the same loop without a Gemini API key:
python scripts/run_scenario.py run gemini-local --keep-workspace
python scripts/run_scenario.py run gemini-streaming-local --keep-workspace
Each scenario uses the real google-genai SDK against a local fake Gemini generateContent server,
then captures, replays, compares, exports, generates a guard, runs pytest, and checks the local app
trace shape. The streaming variant fully consumes deterministic Gemini chunks and replays the
recorded chunk sequence.
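Recording a stream while it is consumed, then replaying the same sequence, can be sketched with two small generators. This is a generic illustration using plain strings as chunks; `record_stream` and `replay_stream` are hypothetical names and do not describe how the scenario scripts or ReplayLab store chunks.

```python
from typing import Iterable, Iterator, List


def record_stream(chunks: Iterable[str], recorded: List[str]) -> Iterator[str]:
    """Yield each chunk to the caller while appending it to the recording."""
    for chunk in chunks:
        recorded.append(chunk)
        yield chunk


def replay_stream(recorded: List[str]) -> Iterator[str]:
    """Yield the recorded chunks in their original order."""
    yield from recorded


tape: List[str] = []
# The recording is only complete once the stream has been fully consumed.
live = list(record_stream(iter(["Deter", "ministic ", "replay"]), tape))
replayed = list(replay_stream(tape))
```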