Support Matrix
This page describes the current support surface for capture, replay, comparison, and generated
pytest provider replay guards. The published 0.1.0a4 package is the current public alpha.
Providers
| Provider path | Capture | Replay | Notes |
|---|---|---|---|
| OpenAI OpenAI().responses.create(...) | yes | yes | Non-streaming sync Responses calls only. |
| OpenAI OpenAI().responses.create(..., stream=True) | yes | yes | Event-preserving sync Responses streams; replayable after full stream consumption. |
| OpenAI OpenAI().responses.parse(...) | yes | yes | Non-streaming sync Responses calls only. |
| OpenAI AsyncOpenAI().responses.create(...) | yes | yes | Non-streaming async Responses calls only. |
| OpenAI AsyncOpenAI().responses.create(..., stream=True) | yes | yes | Event-preserving async Responses streams; replayable after full stream consumption. |
| OpenAI AsyncOpenAI().responses.parse(...) | yes | yes | Non-streaming async Responses calls only. |
| Anthropic Anthropic().messages.create(...) | yes | yes | Non-streaming sync Messages calls only. |
| Anthropic Anthropic().messages.create(..., stream=True) | yes | yes | Event-preserving sync Messages streams; replayable after full stream consumption. |
| Anthropic Anthropic().messages.stream(...) | yes | yes | Event-preserving sync Messages stream helper with text_stream and final-message replay helpers. |
| Anthropic AsyncAnthropic().messages.create(...) | yes | yes | Non-streaming async Messages calls only. |
| Anthropic AsyncAnthropic().messages.create(..., stream=True) | yes | yes | Event-preserving async Messages streams; replayable after full stream consumption. |
| Anthropic AsyncAnthropic().messages.stream(...) | yes | yes | Event-preserving async Messages stream helper with text_stream and final-message replay helpers. |
| Anthropic Anthropic().messages.with_raw_response.create(...).parse() | yes | yes | Non-streaming sync raw-response Messages calls only. |
| Gemini Client().models.generate_content(...) | yes | yes | Non-streaming sync Google Gen AI calls only. |
| Gemini Client().models.generate_content_stream(...) | yes | yes | Event-preserving sync Gemini chunks; replayable after full stream consumption. |
| Gemini Client().aio.models.generate_content(...) | yes | yes | Non-streaming async Google Gen AI calls only. |
| Gemini Client().aio.models.generate_content_stream(...) | yes | yes | Event-preserving async Gemini chunks; replayable after full stream consumption. |
| requests sync calls | yes | yes | Session and common module-level helpers. |
| httpx.Client sync calls | yes | yes | Client and common module-level helpers. |
| httpx.AsyncClient async calls | yes | yes | Async request and common convenience methods. |
HTTP request identity includes provider, method, canonical URL, query params, safe header names, and
supported body content. Query params may be mappings or ordered two-item pair sequences. Body capture
supports one of json=..., text or bytes data=..., or text or bytes content=....
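The identity rules above can be pictured with a small standalone sketch. It is not ReplayLab's implementation: the function name `request_identity`, the hashing scheme, the URL canonicalization, and the choice to treat a mapping and an equivalent pair sequence as the same query are all assumptions made for illustration. It only shows the stated ingredients (provider, method, canonical URL, query params, header names, and at most one body kind) being folded into one key.

```python
import hashlib
import json
from collections.abc import Mapping
from urllib.parse import urlsplit


def request_identity(provider, method, url, params=None, headers=None,
                     *, json_body=None, data=None, content=None):
    """Return a stable hex digest for one HTTP request (illustrative only)."""
    # At most one body kind may be supplied, mirroring the "one of" rule.
    supplied = [name for name, value in
                (("json", json_body), ("data", data), ("content", content))
                if value is not None]
    if len(supplied) > 1:
        raise ValueError(f"expected one body kind, got {supplied}")

    # Assumed canonicalization: lower-case the host and drop the fragment.
    parts = urlsplit(url)
    canonical_url = parts._replace(netloc=parts.netloc.lower(), fragment="").geturl()

    # Mappings and ordered two-item pair sequences normalize to a list of pairs.
    if params is None:
        pairs = []
    elif isinstance(params, Mapping):
        pairs = [[key, value] for key, value in params.items()]
    else:
        pairs = [[key, value] for key, value in params]

    # Only header names take part in identity; header values are never read.
    header_names = sorted(name.lower() for name in (headers or {}))

    if json_body is not None:
        body = ["json", json.dumps(json_body, sort_keys=True)]
    elif data is not None or content is not None:
        kind = "data" if data is not None else "content"
        raw = data if data is not None else content
        raw_bytes = raw.encode("utf-8") if isinstance(raw, str) else bytes(raw)
        body = [kind, hashlib.sha256(raw_bytes).hexdigest()]
    else:
        body = None

    record = {
        "provider": provider,
        "method": method.upper(),
        "url": canonical_url,
        "params": pairs,
        "header_names": header_names,
        "body": body,
    }
    encoded = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Under these assumptions, two requests that differ only in header values produce the same key, while a different body or method produces a different one.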
Capture And Replay Modes
| Capability | Status |
|---|---|
| Same-process startup instrumentation with init(... auto_patch_integrations="auto") | supported |
| Explicit provider patch tuples such as ("openai", "requests") | supported |
| Request/job/session capture scopes with handle.capture(...) | supported |
| ASGI/FastAPI request lifecycle helper with instrument_app(...) | supported |
| Direct ASGI/FastAPI middleware registration with ReplayLabASGIMiddleware | supported |
| Framework-agnostic worker/job decorator with capture_job(...) | supported |
| PydanticAI Agent with OpenAI Responses model path | validated by loopback scenario |
| LangGraph StateGraph nodes with supported provider calls | validated by loopback scenario; local app groups provider boundaries under inferred graph nodes when safe callsite metadata is available |
| External LangGraph project path with supported provider calls | validated by loopback realism scenario; provider-backed path is forced for replayable evidence |
| LangChain ChatOpenAI(..., use_responses_api=True) path | validated by loopback scenario |
| Native Anthropic Messages SDK path | validated by loopback scenarios, including streaming |
| Native Gemini Google Gen AI SDK path | validated by loopback scenarios, including streaming |
| Python child-process auto-patching through replaylab run and replaylab replay | supported for local/CI/generated-test workflows |
| Full-payload replay for succeeded provider calls | required |
| Metadata-only capsule inspection | supported |
| Metadata-only provider response replay | not supported |
| Report-to-report diffing with replaylab report diff | supported |
| Advisory report-diff explanation with replaylab report diff-explain | supported on main; optional BYOK AI |
| Interactive local app workflow execution with replaylab app | supported on main for existing capsules |
| Replay safety preflight and project effect policy review for captured-run and report details | supported on main; provider replay boundaries, OpenAI model-tool protocol evidence, advisory Python implementation candidates, explicit execution-tool wrapper evidence, sanitized HTTP effect stack attribution, advisory tool effect maps, read-only effect policy proposals, saved project review decisions, opt-in HTTP effect control evidence, opt-in local-effect control evidence, opt-in SQLite database-effect control evidence, opt-in raw-socket network-effect control evidence, opt-in queue/pubsub effect control evidence, opt-in unsupported HTTP client control evidence, unsupported-effect scope detection including native/FFI and process-escape signals, and safe workflow readiness |
| Opt-in HTTP effect policy control for requests and httpx | supported on main; observe is the default, enforce checks sanitized HTTP evidence against accepted project policy rules and fails closed for missing, unmatched, unaccepted, or ambiguous rules |
| Opt-in local-effect control for filesystem mutations and subprocess launches | supported on main; observe records only when local_effects hooks are explicitly requested, enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin file mutations or subprocess launches |
| Opt-in SQLite database-effect control for sqlite3 and sync SQLAlchemy SQLite/pysqlite | supported on main; observe records only when database_effects hooks are explicitly requested, enforce installs hooks automatically in child run/replay workflows and fails closed before statements without accepted exact statement-shape policy |
| Opt-in raw-socket network-effect control for direct Python socket escapes | supported on main; observe records only when network_effects hooks are explicitly requested, enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin raw socket connect/send I/O |
| Opt-in queue/pubsub effect control for representative synchronous enqueue/publish APIs | supported on main; observe records only when queue_effects hooks are explicitly requested, enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin Celery, RQ, Dramatiq, Kombu, Pika, Kafka Python, or Confluent Kafka enqueue/publish broker I/O |
| Opt-in unsupported HTTP client control for urllib, urllib3, and aiohttp escapes | supported on main; observe records only when unsupported_http_clients hooks are explicitly requested, enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin unsupported HTTP client network I/O; this does not replay or mock those client responses |
| Opt-in sandboxed regression replay | supported on main with local Docker container backend only; replaylab sandbox build-image prepares the default local runtime image or a bounded recipe-backed image, replaylab sandbox doctor checks Docker/image readiness with structured sanitized diagnostics and hardened runtime flags, and enforce copies the app/store/capsule/ReplayLab sources into a temporary workspace, runs replay as non-root with deny-all network, read-only root filesystem, split read-only/writable mounts, dropped capabilities, no new privileges, resource limits, bounded tmpfs /tmp, and no Docker socket, and records secret-safe sandbox evidence; sandbox-adversarial-local validates bounded escape probes for developer confidence |
| Guided local workflow with replaylab workflow local | supported on main for existing capsules |
| Local React replay viewer launch with replaylab report view report | supported |
| Local React replay diff viewer launch with replaylab report view diff | supported |
| Non-opening React viewer export with replaylab report export-viewer report\|diff | supported |
| Static local HTML replay report export with replaylab report export-html | supported |
| Static local HTML replay report diff export with replaylab report diff-html | supported |
| Opt-in failed-boundary pytest provider replay guard generation for final provider failures | supported |
| Safe workflow regression with controlled execution tools and I/O | supported narrowly for report-derived workflows that pass provider replay, explicit execution-tool wrapper evidence, reviewed/enforced HTTP policy, enforced local-effect hooks, reviewed/enforced SQLite database policy when SQLite statements exist, enforced raw-socket network hooks, enforced queue/pubsub hooks, enforced unsupported HTTP client escape hooks, completed hardened local-container sandbox containment, and unsupported-effect scope checks; non-SQLite DBs, unsupported queue/pubsub SDKs, workflows that depend on unsupported HTTP client responses, linked native/FFI or process-escape evidence, missing/failed/old sandbox evidence, and VM/microVM or managed-hosted sandbox guarantees remain blockers |
| Automatic same-process replay startup for long-lived apps | not implemented |
| Wrapper and child capsule merge | not implemented |
The ASGI middleware is framework-neutral and FastAPI/Starlette-compatible. It captures HTTP request lifecycle metadata and provider boundaries, but it does not capture framework request or response bodies. Scenario coverage now checks ignored paths, provider-free requests, route path and endpoint metadata, configured request IDs, and authorization/cookie omission.
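For readers unfamiliar with pure ASGI middleware, the behavior described above can be approximated with a dependency-free sketch. This is not ReplayLabASGIMiddleware: the class name `LifecycleMiddlewareSketch`, its constructor parameters, and the record fields are hypothetical. It only illustrates recording lifecycle metadata while skipping ignored paths, leaving bodies unrecorded, and omitting the authorization and cookie headers.

```python
import asyncio

# ASGI delivers header names as lower-cased bytes.
SENSITIVE_HEADERS = {b"authorization", b"cookie"}


class LifecycleMiddlewareSketch:
    """Hypothetical middleware: request lifecycle metadata only, never bodies."""

    def __init__(self, app, ignored_paths=("/healthz",), request_id_header=b"x-request-id"):
        self.app = app
        self.ignored_paths = set(ignored_paths)
        self.request_id_header = request_id_header
        self.records = []  # stand-in for a capture store

    async def __call__(self, scope, receive, send):
        # Non-HTTP scopes and ignored paths pass straight through, unrecorded.
        if scope["type"] != "http" or scope["path"] in self.ignored_paths:
            await self.app(scope, receive, send)
            return

        headers = dict(scope.get("headers", []))
        request_id = headers.get(self.request_id_header)
        record = {
            "method": scope["method"],
            "path": scope["path"],
            "request_id": request_id.decode("latin-1") if request_id else None,
            # Header names only, with authorization and cookie left out.
            "header_names": sorted(
                name.decode("latin-1") for name in headers
                if name not in SENSITIVE_HEADERS
            ),
            "status": None,
        }

        async def send_wrapper(message):
            # Only the status code is read; body messages are forwarded untouched.
            if message["type"] == "http.response.start":
                record["status"] = message["status"]
            await send(message)

        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            self.records.append(record)
```

Because the wrapper speaks only the ASGI interface, the same class would sit in front of any ASGI application; framework-specific data such as route metadata is left out of this sketch.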
auto_patch_integrations="auto" means all supported provider patchers in stable order: OpenAI,
Anthropic, Gemini, requests, and httpx. Missing provider imports are no-ops. Use explicit tuples
when teams want to narrow the patch surface.
The worker/job decorator is framework-neutral. It opens one capture scope per decorated function call and records safe job metadata, but it does not capture job args, kwargs, return values, queue payloads, or Celery/RQ/APScheduler internals. Scenario coverage now checks provider-free jobs, sync and async provider jobs, and session-ID extraction from both positional and keyword calls.
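A decorator with these properties might look roughly like the sketch below. It is not the real capture_job: the name `capture_job_sketch`, the `session_id_param` argument, the `CAPTURED_JOBS` list, and the record fields are hypothetical, and only synchronous functions are handled. It shows one scope record per call, session-ID extraction from either positional or keyword arguments, and no storage of args, kwargs, or return values.

```python
import functools
import inspect
import time

CAPTURED_JOBS = []  # stand-in for a capture store


def capture_job_sketch(session_id_param=None):
    """Hypothetical decorator: one capture record per call, safe metadata only."""

    def decorate(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            session_id = None
            if session_id_param is not None:
                try:
                    # bind_partial resolves the parameter for positional and keyword calls alike.
                    bound = signature.bind_partial(*args, **kwargs)
                    value = bound.arguments.get(session_id_param)
                    session_id = None if value is None else str(value)
                except TypeError:
                    session_id = None  # invalid call; the function itself will raise

            # Only the job name, session ID, status, and timing are kept.
            record = {"job_name": func.__name__, "session_id": session_id, "status": "running"}
            started = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                record["status"] = "failed"
                record["error_type"] = type(exc).__name__
                raise
            else:
                record["status"] = "succeeded"
                return result
            finally:
                record["duration_s"] = time.perf_counter() - started
                CAPTURED_JOBS.append(record)

        return wrapper

    return decorate


@capture_job_sketch(session_id_param="session_id")
def send_digest(session_id, payload):
    return payload.upper()


send_digest("s-1", "hello")                  # session ID passed positionally
send_digest(payload="x", session_id="s-2")   # session ID passed by keyword
```

Both calls leave a record carrying the session ID, and neither record contains the payload or the return value.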
PydanticAI, LangGraph, and LangChain coverage is scenario-level compatibility, not native framework
tracing. ReplayLab validates that supported provider calls inside those frameworks can be captured
and replayed when the application initializes ReplayLab before constructing and using provider
clients. The PydanticAI scenario covers OpenAIResponsesModel with
OpenAIProvider(openai_client=...). LangGraph coverage includes both local-source and external
project realism (langgraph-example) with StateGraph node work using requests and OpenAI
Responses. Provider-backed LangGraph traces show inferred graph-node grouping in the local app from
secret-safe callsite metadata. Provider-free LangGraph runs remain allowed as metadata captures, but
the app labels them as non-replayable evidence instead of offering replay or experiment actions. The
LangChain scenario covers ChatOpenAI(..., use_responses_api=True) over OpenAI Responses.
Anthropic and Gemini support is native provider-SDK support: ReplayLab captures and replays
non-streaming Anthropic Messages and Gemini generate_content calls, plus core SDK streaming calls
that are fully consumed. The local app shows formatted request/response previews, a streaming chip,
chunk/event counts, and collapsed raw event payloads before debug JSON.
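The "fully consumed" condition for streaming calls can be illustrated with a small wrapper. This is a conceptual sketch, not how ReplayLab records provider streams: `RecordedStream` and its `replay()` method are hypothetical names. It shows why a stream that was abandoned midway has no complete event sequence to replay.

```python
class RecordedStream:
    """Hypothetical wrapper: records events and allows replay only after exhaustion."""

    def __init__(self, source):
        self._source = iter(source)
        self.events = []       # events seen so far, in order
        self.complete = False  # flips to True only when the source is exhausted

    def __iter__(self):
        return self

    def __next__(self):
        try:
            event = next(self._source)
        except StopIteration:
            self.complete = True
            raise
        self.events.append(event)
        return event

    def replay(self):
        """Return a fresh iterator over the recorded events."""
        if not self.complete:
            raise RuntimeError("stream was not fully consumed; nothing to replay")
        return iter(list(self.events))
```

A caller that breaks out of the loop early leaves `complete` as False, so `replay()` raises instead of returning a truncated event sequence.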
Replay safety preflight is read-only in this slice. It marks captured provider LLM boundaries as
controlled by provider replay, extracts OpenAI model tool declarations plus provider-protocol tool
requests/results, and can statically resolve likely local Python implementation candidates when a
safe app root is available. Candidate rows are advisory code-location evidence only. Applications
can opt into explicit execution-tool control evidence with replaylab.control_tool(...) or
handle.control_tool(...); that proves wrapper-mediated callable execution, not sandboxing or full
workflow safety. For captured requests and httpx calls, the preflight also shows sanitized HTTP
effect stack attribution with the nearest user-code origin when available. When source and stack
evidence match, the preflight can also show an advisory tool effect map linking the model tool,
likely candidate callable, and observed HTTP effect. Execution-control, HTTP attribution, and effect
maps omit source text, locals, argument values, return values, headers, payload bodies, environment
values, and absolute paths. ReplayLab can create read-only effect policy proposal items for future
review and can save project-scoped review decisions under .replaylab/app/effect-policies/.
Saved project rules are enforcement input only when HTTP effect policy mode or SQLite database
effect control mode is explicitly set to enforce. Local-effect control can also record or block
app-origin filesystem mutations and subprocess launches when its hooks are active. Raw-socket
network-effect control can record or fail closed on direct socket connect/send escapes. Queue/pubsub
effect control can record or fail closed on supported enqueue/publish APIs. Unsupported HTTP client
control can fail closed on urllib, urllib3, and aiohttp escapes, but it does not replay or
mock those responses. Sandboxed regression replay can run the replay child in a local Docker
container with deny-all network and a copied workspace, but it requires Docker plus a prepared image
that already contains the runtime dependencies because V1 does not install dependencies inside the
network-denied container. ReplayLab does not infer side-effect class from provider tool names and
does not control non-SQLite databases, unsupported queue/pubsub SDKs, native/FFI escapes,
cross-process escapes, broad OS effects, or VM/microVM isolation guarantees. Safe workflow
regression generation is supported only from report preflights whose readiness gate reaches
ready; all other artifacts continue to generate provider replay guards or diagnostic provider
replay guards.
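The difference between observe and enforce for HTTP effect policy, including the fail-closed cases for missing, unmatched, unaccepted, or ambiguous rules, can be sketched like this. The rule shape (`PolicyRule` with method, host, and path prefix), the `HttpEffect` type, and `check_http_effect` are hypothetical; ReplayLab's saved rules under .replaylab/app/effect-policies/ may be structured quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    method: str
    host: str
    path_prefix: str
    accepted: bool  # True only after a saved project review decision


@dataclass(frozen=True)
class HttpEffect:
    method: str
    host: str
    path: str


class EffectBlocked(RuntimeError):
    """Raised in enforce mode when an effect cannot be positively allowed."""


def check_http_effect(effect, rules, mode="observe"):
    """Return "observed" or "allowed"; raise EffectBlocked when enforcement fails closed."""
    if mode == "observe":
        return "observed"  # record only, never block
    if mode != "enforce":
        raise ValueError(f"unknown mode: {mode!r}")

    rules = list(rules)
    if not rules:
        raise EffectBlocked("missing policy rules")

    matches = [
        rule for rule in rules
        if rule.method == effect.method
        and rule.host == effect.host
        and effect.path.startswith(rule.path_prefix)
    ]
    if not matches:
        raise EffectBlocked("no rule matches this effect")
    if len(matches) > 1:
        raise EffectBlocked("ambiguous: more than one rule matches")
    if not matches[0].accepted:
        raise EffectBlocked("matching rule has not been accepted")
    return "allowed"
```

The key property is that enforce mode allows an effect only when exactly one accepted rule matches; every other outcome raises rather than falling back to allow.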
Not Supported In The Current MVP
- OpenAI Chat Completions.
- OpenAI streaming helper APIs beyond responses.create(..., stream=True).
- Anthropic batch/files APIs, Bedrock/Vertex clients, OpenAI-compatible Anthropic routing, and streaming helper paths beyond messages.create(..., stream=True)/messages.stream(...).
- Gemini multimodal file/image/video flows, Live API, Vertex-specific product breadth, and streaming paths beyond generate_content_stream(...).
- LangChain and framework wrappers that require OpenAI Chat Completions semantics.
- Framework-native PydanticAI or LangGraph streaming semantics beyond their underlying captured provider boundaries.
- Framework-native semantic graph visualization or graph-edge replay semantics beyond provider boundaries and safe inferred grouping.
- HTTP file uploads, multipart bodies, streaming uploads, and streaming downloads.
- Header value capture by default. HTTP matching captures header names only.
- Broad non-HTTP I/O enforcement. Local-effect control is opt-in and limited to app-origin filesystem mutations and subprocess launches; database control is opt-in and limited to SQLite exact statement-shape policy; raw-socket network control is opt-in and fail-closed without allowlists; queue/pubsub control is opt-in and limited to representative synchronous enqueue/publish APIs without broker replay or worker execution; unsupported HTTP client control is opt-in and fail-closed without response replay or mocking; non-SQLite databases, unsupported queue/pubsub SDKs, linked native/FFI and process-escape paths, and other effects are not controlled.
- Managed hosted sandboxing, VM/microVM isolation guarantees, and Daytona-backed execution. Sandboxed replay is local Docker only in V1; the built-in image builder prepares local Docker images only, with bounded recipes for local path dependencies and package-name-only apt installs.
- Cloud hosted runners, issue grouping, account auth, billing, and team collaboration.
- Auto-sync scope customization beyond the current fixed everything_generated scope.
- Hosted report UI beyond tokenized private artifact/share-link pages.
- Safe workflow regression for broad database backends, unsupported queue/pubsub SDKs, workflows that require unsupported HTTP client response replay, linked native/FFI or process-escape paths, missing/failed/old local-container sandbox evidence, or sandbox guarantees beyond local Docker containment.
- Broad failed-flow generation beyond final failed provider boundaries.
- Hard Celery, RQ, or APScheduler integrations.
- Perturbation mode.
See MVP Limitations for the rationale behind these boundaries.