CrewAI Compatibility

This tutorial validates that ReplayLab works with standard CrewAI custom-tool dispatch. ReplayLab captures provider calls through OpenAI SDK instrumentation and records execution-tool evidence through the CrewAI tool dispatch path, without requiring replaylab.trace_tool(...) in the tool function.

The integration is evidence-only. It does not enforce tool policy, mock tool results, sandbox execution, replay CrewAI planning semantics, or control external systems used by prebuilt crewai_tools integrations.

Why This Matters

CrewAI applications commonly register custom Python tools with @tool(...) or BaseTool subclasses and let an Agent / Task / Crew decide when to call them. ReplayLab needs to prove that the local callable actually ran without asking users to wrap every tool.

The crewai auto-patch label (the "crewai" entry in auto_patch_integrations) records that dispatch boundary while preserving the tool's original return value and exception behavior.
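To illustrate what recording a dispatch boundary without changing tool behavior means, here is a minimal sketch. It is not ReplayLab's implementation: record_dispatch and the evidence list are hypothetical names, and the only point is that the wrapper returns the original value and lets the original exception propagate.

```python
import functools
import time


def record_dispatch(tool_name, sink):
    """Hypothetical decorator: append one evidence entry per call, never alter behavior."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            status = "failure"
            try:
                result = func(*args, **kwargs)
                status = "success"
                return result  # original return value, unchanged
            finally:
                # Runs on both paths; an exception keeps propagating afterwards.
                sink.append({
                    "tool": tool_name,
                    "status": status,
                    "duration_s": time.perf_counter() - started,
                })
        return wrapper
    return decorate


evidence = []  # hypothetical in-memory sink


@record_dispatch("lookup_customer", evidence)
def lookup_customer(customer_id: str) -> str:
    return f"customer={customer_id};tier=standard"


@record_dispatch("always_fails", evidence)
def always_fails() -> str:
    raise ValueError("boom")


print(lookup_customer("cus_123"))
```

In this sketch a failing tool still raises its own exception to the caller; the wrapper only notes that a failure occurred.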

Run The Scenario

Run:

python scripts/run_scenario.py run crewai-tool-local --keep-workspace

Expected ending:

ReplayLab scenario passed.
Scenario: crewai-tool-local
Tier: loopback
Boundaries: 3
Providers: openai, execution_tool

ReplayLab creates a clean temporary virtual environment and installs the current checkout plus crewai, openai, and pytest. It then starts a deterministic OpenAI Chat Completions-compatible loopback provider for capture, runs a normal CrewAI Agent / Task / Crew tool loop, and stops the endpoint before replay. Finally, it exports the React viewer, generates a pytest provider replay guard, and runs that generated test.

App Shape

The generated scenario app initializes ReplayLab once before importing and constructing CrewAI objects, keeps normal CrewAI tool registration, and lets CrewAI call the tool:

import os
from pathlib import Path

os.environ.setdefault("CREWAI_TRACING_ENABLED", "false")
os.environ.setdefault("OTEL_SDK_DISABLED", "true")

import replaylab
from replaylab import CapturePayloadPolicy

handle = replaylab.init(
    project_name="crewai-tool-local",
    auto_patch_integrations=("openai", "httpx", "crewai"),
    capture_payload_policy=CapturePayloadPolicy.FULL,
)

from crewai import Agent, Crew, LLM, Task
from crewai.tools import tool


@tool("lookup_customer")
def lookup_customer(customer_id: str) -> str:
    """Return deterministic customer context for the supplied customer ID."""
    return f"customer={customer_id};tier=standard"


def run_crew(provider_base_url: str) -> str:
    """Run one normal CrewAI tool call loop through OpenAI Chat Completions."""
    llm = LLM(
        model="openai/gpt-5-mini",
        api_key=os.environ.get("OPENAI_API_KEY", "scenario-api-key"),
        base_url=f"{provider_base_url}/v1",
        temperature=0,
    )
    agent = Agent(
        role="Support triage",
        goal="Classify customer support priority.",
        backstory="Triage assistant.",
        tools=[lookup_customer],
        llm=llm,
        max_iter=3,
        verbose=False,
    )
    task = Task(
        description="Look up customer cus_123 and classify priority.",
        expected_output="A terse triage label.",
        agent=agent,
    )
    crew = Crew(agents=[agent], tasks=[task], verbose=False)
    return str(crew.kickoff())


with handle.capture("crewai_tool_agent"):
    print(run_crew(Path("openai_url.txt").read_text(encoding="utf-8").strip()))

Provider clients and framework objects should be constructed after replaylab.init(...) so that the OpenAI, HTTPX, and CrewAI dispatch hooks are already installed when those objects are created. The tool function itself does not use ReplayLab decorators; the framework dispatch hook supplies execution-tool evidence.

What ReplayLab Captures

The scenario expects two OpenAI Chat Completions boundaries and one execution-tool boundary:

1. provider=openai resource=openai.chat.completions
2. provider=execution_tool resource=lookup_customer source=crewai_framework
3. provider=openai resource=openai.chat.completions

The execution-tool evidence includes the tool name, callable module and qualified name, safe app-relative source path and line when available, timestamps, duration, success/failure status, and argument names only. It does not record argument values, return values, prompt text, result bodies, CrewAI task text, provider payload bodies, headers, locals, source text, environment values, or absolute paths.
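The "argument names only" rule can be sketched with the standard library. The field names below are illustrative assumptions, not ReplayLab's evidence schema; build_evidence is a hypothetical helper that binds the call to the function signature and keeps only the parameter names.

```python
import inspect
from typing import Any, Callable


def build_evidence(func: Callable[..., Any], *args: Any, **kwargs: Any) -> dict:
    """Describe a call by callable identity and argument names, never values."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    return {
        "tool": func.__name__,
        "module": func.__module__,
        "qualname": func.__qualname__,
        # Keys of bound.arguments are parameter names; the values are dropped here.
        "argument_names": list(bound.arguments),
    }


def lookup_customer(customer_id: str) -> str:
    return f"customer={customer_id};tier=standard"


record = build_evidence(lookup_customer, "cus_123")
print(record["argument_names"])
```

The customer ID passed to the call never appears in the resulting record, which is the property the paragraph above describes.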

Replay mode verifies the execution-tool boundary and still runs the callable normally. ReplayLab does not serve fake tool results.
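One way to picture boundary verification is as a comparison between the recorded boundary sequence and the sequence observed during replay. The sketch below is an assumption about the general shape of such a check, not ReplayLab's replay logic; Boundary and first_mismatch are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Boundary(NamedTuple):
    provider: str
    resource: str


# The three boundaries this scenario expects, in order.
RECORDED = [
    Boundary("openai", "openai.chat.completions"),
    Boundary("execution_tool", "lookup_customer"),
    Boundary("openai", "openai.chat.completions"),
]


def first_mismatch(recorded: List[Boundary], observed: List[Boundary]) -> Optional[int]:
    """Return the index of the first divergence, or None when the sequences match."""
    for index, (expected, actual) in enumerate(zip(recorded, observed)):
        if expected != actual:
            return index
    if len(recorded) != len(observed):
        # One sequence ended early; the divergence is at the shorter length.
        return min(len(recorded), len(observed))
    return None


print(first_mismatch(RECORDED, list(RECORDED)))
```

A check like this only compares which boundaries occurred and in what order; the tool callable itself still executes for real, as described above.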

Prebuilt Tools

Prebuilt crewai_tools integrations often wrap external systems such as browsers, search APIs, file systems, or hosted services. ReplayLab treats those conservatively. If it cannot resolve an app-root callable, it may show limited framework evidence for review, but that evidence does not count as exact local Python callable control for safe workflow readiness.
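The notion of an "app-root callable" can be approximated by checking whether a callable's source file lives under the application root. This is a sketch under that assumption only; app_relative_source is a hypothetical helper, and ReplayLab's actual resolution rules may differ.

```python
import inspect
import json
from pathlib import Path
from typing import Any, Callable, Optional


def app_relative_source(func: Callable[..., Any], app_root: Path) -> Optional[str]:
    """Return the source path relative to app_root, or None if it lives elsewhere."""
    try:
        source = inspect.getsourcefile(func)
    except TypeError:
        # Built-ins and C extensions have no Python source file.
        return None
    if source is None:
        return None
    try:
        relative = Path(source).resolve().relative_to(app_root.resolve())
    except ValueError:
        # The source file is outside the app root (for example, site-packages).
        return None
    return relative.as_posix()


# json.dumps is defined in the json package, so it resolves under that directory.
print(app_relative_source(json.dumps, Path(json.__file__).parent))
# len is a built-in with no source file, so it cannot be resolved.
print(app_relative_source(len, Path.cwd()))
```

Returning a relative path rather than an absolute one matches the evidence constraints listed earlier, which exclude absolute paths.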

Use replaylab.trace_tool(...) as the explicit fallback for unsupported CrewAI paths, unsupported frameworks, or bare provider SDK tool loops that run without a framework.