OpenAI Agents SDK Compatibility
This tutorial validates ReplayLab against standard OpenAI Agents SDK function-tool dispatch.
ReplayLab captures provider calls through the OpenAI Responses adapter and records
execution-tool evidence through the Agents SDK dispatch path, so the tool function does not
need to call replaylab.trace_tool(...).
The integration is evidence-only. It does not enforce tool policy, mock tool results, sandbox execution, or change OpenAI hosted-tool behavior.
Why This Matters
OpenAI Agents SDK applications usually register local Python tools with @function_tool or
FunctionTool, then let the framework dispatch those tools after a provider tool call. ReplayLab
needs to prove that the local callable actually ran without asking users to wrap every tool.
The openai_agents auto-patch label records that dispatch boundary while preserving the original
return value and exception behavior.
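As a rough illustration of what "recording the dispatch boundary while preserving the return value and exception behavior" can mean, the sketch below wraps a plain callable with standard-library tools only. It is not ReplayLab's implementation: record_dispatch and EVIDENCE are hypothetical names introduced here, and the real openai_agents hook attaches inside the framework rather than around the function.

```python
import functools
import time

# Hypothetical in-memory sink, used only for this illustration.
EVIDENCE = []


def record_dispatch(tool_name, func):
    """Wrap func so every call appends an evidence record without changing its outcome."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        status = "failure"
        try:
            result = func(*args, **kwargs)
            status = "success"
            return result  # the original return value passes through untouched
        finally:
            # Runs on both paths; an exception raised by func keeps propagating.
            EVIDENCE.append(
                {
                    "tool": tool_name,
                    "status": status,
                    "duration_s": time.perf_counter() - started,
                }
            )

    return wrapper


def lookup_customer(customer_id: str) -> str:
    """Same deterministic tool body as the scenario app."""
    return f"customer={customer_id};tier=standard"


traced_lookup = record_dispatch("lookup_customer", lookup_customer)
```

Calling traced_lookup("cus_123") returns the same string as the unwrapped function and appends one record to EVIDENCE; if the wrapped callable raises, the record carries status "failure" and the caller still sees the original exception.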
Run The Scenario
Run:
python scripts/run_scenario.py run openai-agents-tool-local --keep-workspace
Expected ending:
ReplayLab scenario passed.
Scenario: openai-agents-tool-local
Tier: loopback
Boundaries: 3
Providers: openai, execution_tool
ReplayLab creates a clean temporary virtual environment, installs the current checkout plus
openai-agents, openai, and pytest, starts a deterministic OpenAI Responses-compatible
loopback provider for capture, runs a normal Agents SDK tool loop, stops the endpoint before replay,
exports the React viewer, generates a pytest provider replay guard, and runs that generated test.
App Shape
The generated scenario app initializes ReplayLab once, keeps normal Agents SDK tool registration, and uses the framework runner as usual:
import asyncio

import replaylab
from agents import Agent, RunConfig, Runner, function_tool, set_default_openai_client
from openai import AsyncOpenAI
from replaylab import CapturePayloadPolicy

# Initialize ReplayLab before any provider client is constructed.
handle = replaylab.init(
    project_name="openai-agents-tool-local",
    auto_patch_integrations=("openai", "openai_agents"),
    capture_payload_policy=CapturePayloadPolicy.FULL,
)


# Plain Agents SDK tool registration; no ReplayLab decorator on the function.
@function_tool
def lookup_customer(customer_id: str) -> str:
    """Return deterministic customer context for the supplied customer ID."""
    return f"customer={customer_id};tier=standard"


# Point the SDK at the loopback provider started by the scenario.
set_default_openai_client(
    AsyncOpenAI(base_url="http://127.0.0.1:.../v1", api_key="scenario-key")
)

agent = Agent(
    name="Support triage",
    instructions="Look up the requested customer and return a terse triage label.",
    tools=[lookup_customer],
)


async def main() -> None:
    # Provider and execution-tool boundaries are recorded inside this capture block.
    with handle.capture("openai_agents_tool_agent"):
        result = await Runner.run(
            starting_agent=agent,
            input="Look up customer cus_123 and classify priority.",
            run_config=RunConfig(model="gpt-5-mini", tracing_disabled=True),
        )


asyncio.run(main())
Provider clients should still be constructed after replaylab.init(...) so OpenAI provider replay
wrappers can be installed. The tool function itself does not use ReplayLab decorators; the
framework dispatch hook supplies execution-tool evidence.
What ReplayLab Captures
The scenario expects two OpenAI Responses boundaries and one execution-tool boundary:
1. provider=openai resource=openai.responses
2. provider=execution_tool resource=lookup_customer source=openai_agents_framework
3. provider=openai resource=openai.responses
The execution-tool evidence includes the tool name, callable module and qualified name, safe app-relative source path and line when available, timestamps, duration, success/failure status, and argument names only. It does not record argument values, return values, locals, source text, raw schemas, provider payload bodies, headers, environment values, or absolute paths.
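To make the "names only, no values" rule concrete, here is a minimal sketch of how such a record could be assembled from a callable with the standard library. It assumes nothing about ReplayLab internals: build_evidence, its field names, and the app_root parameter are hypothetical and chosen for illustration.

```python
import inspect
from pathlib import Path


def build_evidence(func, app_root):
    """Describe a callable using metadata only: no argument values, no source text."""
    code = getattr(func, "__code__", None)
    source_path = None
    line = None
    if code is not None:
        line = code.co_firstlineno
        candidate = Path(code.co_filename)
        if candidate.is_file():
            try:
                # Keep the path only when it sits under the app root, so an
                # absolute path never ends up in the record.
                relative = candidate.resolve().relative_to(Path(app_root).resolve())
                source_path = relative.as_posix()
            except ValueError:
                source_path = None
    return {
        "tool": func.__name__,
        "module": func.__module__,
        "qualname": func.__qualname__,
        "source_path": source_path,
        "line": line if source_path is not None else None,
        # Parameter names come from the signature, never from a call's values.
        "argument_names": list(inspect.signature(func).parameters),
    }


def lookup_customer(customer_id: str) -> str:
    """Same deterministic tool body as the scenario app."""
    return f"customer={customer_id};tier=standard"


evidence = build_evidence(lookup_customer, app_root=".")
```

Because the record is built from the function object rather than from a call, a value such as cus_123 has no way to reach it; the path is dropped entirely whenever it cannot be expressed relative to the app root.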
Replay mode verifies the execution-tool boundary and still runs the callable normally. ReplayLab does not serve fake tool results.
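The verify-but-still-execute behavior can be pictured as a sequence comparison: the tool runs for real, and only the observed boundary order is checked against the recording. The sketch below uses the three boundaries listed earlier; verify_boundaries and the tuple layout are hypothetical and do not reflect ReplayLab's actual replay format.

```python
from itertools import zip_longest

# Boundary order taken from the scenario description, as (provider, resource) pairs.
RECORDED = [
    ("openai", "openai.responses"),
    ("execution_tool", "lookup_customer"),
    ("openai", "openai.responses"),
]


def verify_boundaries(recorded, observed):
    """Return a list of mismatch descriptions; an empty list means the replay matched."""
    mismatches = []
    for index, (want, got) in enumerate(zip_longest(recorded, observed), start=1):
        if want != got:
            mismatches.append(f"boundary {index}: expected {want}, observed {got}")
    return mismatches


def lookup_customer(customer_id: str) -> str:
    """Same deterministic tool body as the scenario app."""
    return f"customer={customer_id};tier=standard"


observed = [("openai", "openai.responses")]
# The callable really executes during replay; its result is not served from a recording.
tool_output = lookup_customer("cus_123")
observed.append(("execution_tool", "lookup_customer"))
observed.append(("openai", "openai.responses"))

mismatches = verify_boundaries(RECORDED, observed)
```

In this sketch, a run that skipped the tool call would leave observed one entry short and produce mismatches from the second boundary onward, which is the kind of drift a replay check is meant to surface.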
What Is Not Yet Supported
- OpenAI hosted tools as local Python execution evidence.
- Agent-as-tool delegation as local callable evidence unless nested local function-tool evidence is observed.
- Agents SDK streaming or non-Responses provider paths beyond the captured provider boundary.
- Tool enforcement, tool result mocking, or framework-native semantic graph replay.
Use replaylab.trace_tool(...) as the explicit fallback for unsupported frameworks or for tool
loops written directly against a provider SDK.