# Support Matrix

This page describes the current support surface for capture, replay, comparison, and generated pytest provider replay guards. The published 0.1.0a4 package is the current public alpha.

## Providers

| Provider | Path | Capture | Replay | Notes |
| --- | --- | --- | --- | --- |
| OpenAI | `OpenAI().responses.create(...)` | yes | yes | Non-streaming sync Responses calls only. |
| OpenAI | `OpenAI().responses.create(..., stream=True)` | yes | yes | Event-preserving sync Responses streams; replayable after full stream consumption. |
| OpenAI | `OpenAI().responses.parse(...)` | yes | yes | Non-streaming sync Responses calls only. |
| OpenAI | `AsyncOpenAI().responses.create(...)` | yes | yes | Non-streaming async Responses calls only. |
| OpenAI | `AsyncOpenAI().responses.create(..., stream=True)` | yes | yes | Event-preserving async Responses streams; replayable after full stream consumption. |
| OpenAI | `AsyncOpenAI().responses.parse(...)` | yes | yes | Non-streaming async Responses calls only. |
| Anthropic | `Anthropic().messages.create(...)` | yes | yes | Non-streaming sync Messages calls only. |
| Anthropic | `Anthropic().messages.create(..., stream=True)` | yes | yes | Event-preserving sync Messages streams; replayable after full stream consumption. |
| Anthropic | `Anthropic().messages.stream(...)` | yes | yes | Event-preserving sync Messages stream helper with `text_stream` and final-message replay helpers. |
| Anthropic | `AsyncAnthropic().messages.create(...)` | yes | yes | Non-streaming async Messages calls only. |
| Anthropic | `AsyncAnthropic().messages.create(..., stream=True)` | yes | yes | Event-preserving async Messages streams; replayable after full stream consumption. |
| Anthropic | `AsyncAnthropic().messages.stream(...)` | yes | yes | Event-preserving async Messages stream helper with `text_stream` and final-message replay helpers. |
| Anthropic | `Anthropic().messages.with_raw_response.create(...).parse()` | yes | yes | Non-streaming sync raw-response Messages calls only. |
| Gemini | `Client().models.generate_content(...)` | yes | yes | Non-streaming sync Google Gen AI calls only. |
| Gemini | `Client().models.generate_content_stream(...)` | yes | yes | Event-preserving sync Gemini chunks; replayable after full stream consumption. |
| Gemini | `Client().aio.models.generate_content(...)` | yes | yes | Non-streaming async Google Gen AI calls only. |
| Gemini | `Client().aio.models.generate_content_stream(...)` | yes | yes | Event-preserving async Gemini chunks; replayable after full stream consumption. |
| requests | sync calls | yes | yes | `Session` and common module-level helpers. |
| `httpx.Client` | sync calls | yes | yes | Client and common module-level helpers. |
| `httpx.AsyncClient` | async calls | yes | yes | Async request and common convenience methods. |

HTTP request identity includes provider, method, canonical URL, query params, safe header names, and supported body content. Query params may be mappings or ordered two-item pair sequences. Body capture supports exactly one of `json=...`, text or bytes `data=...`, or text or bytes `content=...`.
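
As a rough illustration of how such an identity might be derived, here is a hypothetical sketch. The function name, normalization choices, and hashing are assumptions made for this example, not ReplayLab's internals; only the list of identity components comes from this page.

```python
# Hypothetical sketch of the request-identity idea described above.
# Normalization details (sorting, hashing) are illustrative assumptions.
import hashlib
import json


def request_identity(provider, method, url, params=None, header_names=(), body=None):
    """Build a stable identity key from the matched request components.

    params may be a mapping or an ordered sequence of (name, value) pairs;
    body is exactly one of ("json", obj), ("data", ...), or ("content", ...).
    """
    if params is None:
        pairs = []
    elif hasattr(params, "items"):
        pairs = sorted(params.items())  # mappings are order-insensitive
    else:
        pairs = [tuple(p) for p in params]  # pair sequences keep their order
    material = json.dumps(
        {
            "provider": provider,
            "method": method.upper(),
            "url": url,  # assume already canonicalized
            "params": pairs,
            "headers": sorted(header_names),  # header names only, never values
            "body": [body[0], repr(body[1])] if body else None,
        },
        sort_keys=True,
    )
    return hashlib.sha256(material.encode()).hexdigest()
```

Two captures of the same logical request then produce the same key, while any change to method, URL, params, matched header names, or body content produces a different one.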

## Capture And Replay Modes

| Capability | Status |
| --- | --- |
| Same-process startup instrumentation with `init(..., auto_patch_integrations="auto")` | supported |
| Explicit provider patch tuples such as `("openai", "requests")` | supported |
| Request/job/session capture scopes with `handle.capture(...)` | supported |
| ASGI/FastAPI request lifecycle helper with `instrument_app(...)` | supported |
| Direct ASGI/FastAPI middleware registration with `ReplayLabASGIMiddleware` | supported |
| Framework-agnostic worker/job decorator with `capture_job(...)` | supported |
| PydanticAI Agent with OpenAI Responses model path | validated by loopback scenario |
| LangGraph StateGraph nodes with supported provider calls | validated by loopback scenario; local app groups provider boundaries under inferred graph nodes when safe callsite metadata is available |
| External LangGraph project path with supported provider calls | validated by loopback realism scenario; provider-backed path is forced for replayable evidence |
| LangChain `ChatOpenAI(..., use_responses_api=True)` path | validated by loopback scenario |
| Native Anthropic Messages SDK path | validated by loopback scenarios, including streaming |
| Native Gemini Google Gen AI SDK path | validated by loopback scenarios, including streaming |
| Python child-process auto-patching through `replaylab run` and `replaylab replay` | supported for local/CI/generated-test workflows |
| Full-payload replay for succeeded provider calls | required |
| Metadata-only capsule inspection | supported |
| Metadata-only provider response replay | not supported |
| Report-to-report diffing with `replaylab report diff` | supported |
| Advisory report-diff explanation with `replaylab report diff-explain` | supported on main; optional BYOK AI |
| Interactive local app workflow execution with `replaylab app` | supported on main for existing capsules |
| Replay safety preflight and project effect policy review for captured-run and report details | supported on main; covers provider replay boundaries, OpenAI model-tool protocol evidence, advisory Python implementation candidates, explicit execution-tool wrapper evidence, sanitized HTTP effect stack attribution, advisory tool effect maps, read-only effect policy proposals, saved project review decisions, opt-in HTTP effect control evidence, opt-in local-effect control evidence, opt-in SQLite database-effect control evidence, opt-in raw-socket network-effect control evidence, opt-in queue/pubsub effect control evidence, opt-in unsupported HTTP client control evidence, unsupported-effect scope detection including native/FFI and process-escape signals, and safe workflow readiness |
| Opt-in HTTP effect policy control for requests and httpx | supported on main; observe is the default; enforce checks sanitized HTTP evidence against accepted project policy rules and fails closed for missing, unmatched, unaccepted, or ambiguous rules |
| Opt-in local-effect control for filesystem mutations and subprocess launches | supported on main; observe records only when `local_effects` hooks are explicitly requested; enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin file mutations or subprocess launches |
| Opt-in SQLite database-effect control for sqlite3 and sync SQLAlchemy SQLite/pysqlite | supported on main; observe records only when `database_effects` hooks are explicitly requested; enforce installs hooks automatically in child run/replay workflows and fails closed before statements without accepted exact statement-shape policy |
| Opt-in raw-socket network-effect control for direct Python socket escapes | supported on main; observe records only when `network_effects` hooks are explicitly requested; enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin raw socket connect/send I/O |
| Opt-in queue/pubsub effect control for representative synchronous enqueue/publish APIs | supported on main; observe records only when `queue_effects` hooks are explicitly requested; enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin Celery, RQ, Dramatiq, Kombu, Pika, Kafka Python, or Confluent Kafka enqueue/publish broker I/O |
| Opt-in unsupported HTTP client control for urllib, urllib3, and aiohttp escapes | supported on main; observe records only when `unsupported_http_clients` hooks are explicitly requested; enforce installs hooks automatically in child run/replay workflows and fails closed before app-origin unsupported HTTP client network I/O; this does not replay or mock those client responses |
| Opt-in sandboxed regression replay | supported on main with local Docker container backend only; `replaylab sandbox build-image` prepares the default local runtime image or a bounded recipe-backed image; `replaylab sandbox doctor` checks Docker/image readiness with structured sanitized diagnostics and hardened runtime flags; enforce copies the app/store/capsule/ReplayLab sources into a temporary workspace, runs replay as non-root with deny-all network, read-only root filesystem, split read-only/writable mounts, dropped capabilities, no new privileges, resource limits, bounded tmpfs `/tmp`, and no Docker socket, and records secret-safe sandbox evidence; sandbox-adversarial-local validates bounded escape probes for developer confidence |
| Guided local workflow with `replaylab workflow local` | supported on main for existing capsules |
| Local React replay viewer launch with `replaylab report view report` | supported |
| Local React replay diff viewer launch with `replaylab report view diff` | supported |
| Non-opening React viewer export with `replaylab report export-viewer report\|diff` | supported |
| Static local HTML replay report export with `replaylab report export-html` | supported |
| Static local HTML replay report diff export with `replaylab report diff-html` | supported |
| Opt-in failed-boundary pytest provider replay guard generation for final provider failures | supported |
| Safe workflow regression with controlled execution tools and I/O | supported narrowly for report-derived workflows that pass provider replay, explicit execution-tool wrapper evidence, reviewed/enforced HTTP policy, enforced local-effect hooks, reviewed/enforced SQLite database policy when SQLite statements exist, enforced raw-socket network hooks, enforced queue/pubsub hooks, enforced unsupported HTTP client escape hooks, completed hardened local-container sandbox containment, and unsupported-effect scope checks; non-SQLite DBs, unsupported queue/pubsub SDKs, workflows that depend on unsupported HTTP client responses, linked native/FFI or process-escape evidence, missing/failed/old sandbox evidence, and VM/microVM or managed-hosted sandbox guarantees remain blockers |
| Automatic same-process replay startup for long-lived apps | not implemented |
| Wrapper and child capsule merge | not implemented |

The ASGI middleware is framework-neutral and FastAPI/Starlette-compatible. It captures HTTP request lifecycle metadata and provider boundaries, but it does not capture framework request or response bodies. Scenario coverage now checks ignored paths, provider-free requests, route path and endpoint metadata, configured request IDs, and authorization/cookie omission.
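
The lifecycle-only capture described above can be sketched as a minimal ASGI middleware. This is an illustrative stand-in, not the real `ReplayLabASGIMiddleware`; the class name, ignored-path default, and record shape are assumptions. The point is the pattern: record method, path, and status for non-ignored HTTP requests, and never touch request or response bodies.

```python
# Illustrative ASGI middleware sketch; not ReplayLab's implementation.
import asyncio


class LifecycleMetadataMiddleware:
    """Record request lifecycle metadata (method, path, status), never bodies."""

    def __init__(self, app, ignored_paths=frozenset({"/healthz"})):
        self.app = app
        self.ignored_paths = ignored_paths
        self.records = []

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http" or scope["path"] in self.ignored_paths:
            await self.app(scope, receive, send)  # pass through, no record
            return
        record = {"method": scope["method"], "path": scope["path"]}

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                record["status"] = message["status"]
            await send(message)  # body messages pass through unrecorded

        await self.app(scope, receive, send_wrapper)
        self.records.append(record)


async def plain_app(scope, receive, send):
    """A tiny ASGI app used to drive the middleware."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})
```

Driving it with `asyncio.run(...)` against a plain ASGI app shows one record per non-ignored request and none for the ignored path.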

`auto_patch_integrations="auto"` means all supported provider patchers in stable order: OpenAI, Anthropic, Gemini, requests, and httpx. Missing provider imports are no-ops. Use explicit tuples when teams want to narrow the patch surface.
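
The "auto" versus explicit-tuple resolution can be illustrated with a small helper. The function name and error handling are hypothetical; only the patcher names and their stable order come from this page.

```python
# Illustrative resolution of the auto_patch_integrations setting.
# SUPPORTED_PATCHERS mirrors the stable order documented above.
SUPPORTED_PATCHERS = ("openai", "anthropic", "gemini", "requests", "httpx")


def resolve_patch_targets(spec):
    """Return the patchers to apply: 'auto' means everything, in stable order."""
    if spec == "auto":
        return SUPPORTED_PATCHERS
    unknown = [name for name in spec if name not in SUPPORTED_PATCHERS]
    if unknown:
        raise ValueError(f"unsupported patchers: {unknown}")
    return tuple(spec)  # explicit tuples narrow the patch surface
```

An explicit tuple such as `("openai", "requests")` resolves to exactly those two patchers, matching the narrowed-surface row in the table above.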

The worker/job decorator is framework-neutral. It opens one capture scope per decorated function call and records safe job metadata, but it does not capture job args, kwargs, return values, queue payloads, or Celery/RQ/APScheduler internals. Scenario coverage now checks provider-free jobs, sync and async provider jobs, and session-ID extraction from both positional and keyword calls.
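
A minimal sketch of that contract, using hypothetical names (`capture_job_sketch`, `JOB_LOG`) rather than ReplayLab's real API: one metadata record per decorated call, session-ID extraction from positional or keyword arguments, and no capture of args, kwargs, or return values.

```python
# Hypothetical sketch of the framework-neutral job decorator contract.
import functools
import inspect

JOB_LOG = []  # stands in for a capture scope's recorded job metadata


def capture_job_sketch(session_arg=None):
    """Open one record per decorated call; store safe metadata only."""

    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            session_id = None
            if session_arg is not None:
                # Works for both positional and keyword call styles.
                bound = sig.bind_partial(*args, **kwargs)
                session_id = bound.arguments.get(session_arg)
            # Never record args, kwargs, return values, or queue payloads.
            JOB_LOG.append({"job": func.__qualname__, "session_id": session_id})
            return func(*args, **kwargs)

        return wrapper

    return decorate


@capture_job_sketch(session_arg="session_id")
def send_digest(user, session_id=None):
    return f"digest for {user}"
```

Calling `send_digest("ada", "s-1")` and `send_digest("ada", session_id="s-2")` records two metadata rows with the extracted session IDs and nothing else.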

PydanticAI, LangGraph, and LangChain coverage is scenario-level compatibility, not native framework tracing. ReplayLab validates that supported provider calls inside those frameworks can be captured and replayed when the application initializes ReplayLab before constructing and using provider clients. The PydanticAI scenario covers `OpenAIResponsesModel` with `OpenAIProvider(openai_client=...)`. LangGraph coverage includes both local-source and external project realism (langgraph-example) with StateGraph node work using requests and OpenAI Responses. Provider-backed LangGraph traces show inferred graph-node grouping in the local app from secret-safe callsite metadata. Provider-free LangGraph runs remain allowed as metadata captures, but the app labels them as non-replayable evidence instead of offering replay or experiment actions. The LangChain scenario covers `ChatOpenAI(..., use_responses_api=True)` over OpenAI Responses.

Anthropic and Gemini support is native provider-SDK support: ReplayLab captures and replays non-streaming Anthropic Messages and Gemini `generate_content` calls, plus core SDK streaming calls that are fully consumed. The local app shows formatted request/response previews, a streaming chip, chunk/event counts, and collapsed raw event payloads before debug JSON.

Replay safety preflight is read-only in this slice. It marks captured provider LLM boundaries as controlled by provider replay, extracts OpenAI model tool declarations plus provider-protocol tool requests/results, and can statically resolve likely local Python implementation candidates when a safe app root is available. Candidate rows are advisory code-location evidence only. Applications can opt into explicit execution-tool control evidence with `replaylab.control_tool(...)` or `handle.control_tool(...)`; that proves wrapper-mediated callable execution, not sandboxing or full workflow safety. For captured requests and httpx calls, the preflight also shows sanitized HTTP effect stack attribution with the nearest user-code origin when available. When source and stack evidence match, the preflight can also show an advisory tool effect map linking the model tool, likely candidate callable, and observed HTTP effect. Execution-control, HTTP attribution, and effect maps omit source text, locals, argument values, return values, headers, payload bodies, environment values, and absolute paths. ReplayLab can create read-only effect policy proposal items for future review and can save project-scoped review decisions under `.replaylab/app/effect-policies/`. Saved project rules are enforcement input only when HTTP effect policy mode or SQLite database effect control mode is explicitly set to enforce. Local-effect control can also record or block app-origin filesystem mutations and subprocess launches when its hooks are active. Raw-socket network-effect control can record or fail closed on direct socket connect/send escapes. Queue/PubSub effect control can record or fail closed on supported enqueue/publish APIs. Unsupported HTTP client control can fail closed on urllib, urllib3, and aiohttp escapes, but it does not replay or mock those responses.

Sandboxed regression replay can run the replay child in a local Docker container with deny-all network and a copied workspace, but it requires Docker plus a prepared image that already contains the runtime dependencies, because V1 does not install dependencies inside the network-denied container. ReplayLab does not infer side-effect class from provider tool names and does not control non-SQLite databases, unsupported queue/pubsub SDKs, native/FFI escapes, cross-process escapes, broad OS effects, or VM/microVM isolation guarantees. Safe workflow regression generation is supported only from report preflights whose readiness gate reaches ready; all other artifacts continue to generate provider replay guards or diagnostic provider replay guards.
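The wrapper-mediated execution evidence that `control_tool(...)` provides can be sketched as follows. The names and recorded fields here are illustrative assumptions, not ReplayLab's API; the point is that the wrapper proves a callable ran through it while recording no arguments, return values, source text, or paths.

```python
# Illustrative sketch of wrapper-mediated execution-tool evidence.
TOOL_EVIDENCE = []  # stands in for recorded execution-control evidence


def control_tool_sketch(tool_name, func):
    """Wrap a tool callable and record only that it ran through the wrapper."""

    def wrapper(*args, **kwargs):
        # Safe identity metadata only: tool name plus callable qualname.
        TOOL_EVIDENCE.append(
            {"tool": tool_name, "callable": func.__qualname__, "mediated": True}
        )
        return func(*args, **kwargs)

    return wrapper


def lookup_weather(city):
    """A stand-in local tool implementation."""
    return {"city": city, "temp_c": 21}


lookup_weather = control_tool_sketch("get_weather", lookup_weather)
```

As the preflight description above notes, this kind of evidence demonstrates wrapper-mediated execution only; it is not sandboxing and not full workflow safety.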

## Not Supported In The Current MVP

- OpenAI Chat Completions.
- OpenAI streaming helper APIs beyond `responses.create(..., stream=True)`.
- Anthropic batch/files APIs, Bedrock/Vertex clients, OpenAI-compatible Anthropic routing, and streaming helper paths beyond `messages.create(..., stream=True)` / `messages.stream(...)`.
- Gemini multimodal file/image/video flows, Live API, Vertex-specific product breadth, and streaming paths beyond `generate_content_stream(...)`.
- LangChain and framework wrappers that require OpenAI Chat Completions semantics.
- Framework-native PydanticAI or LangGraph streaming semantics beyond their underlying captured provider boundaries.
- Framework-native semantic graph visualization or graph-edge replay semantics beyond provider boundaries and safe inferred grouping.
- HTTP file uploads, multipart bodies, streaming uploads, and streaming downloads.
- Header value capture by default. HTTP matching captures header names only.
- Broad non-HTTP I/O enforcement. Local-effect control is opt-in and limited to app-origin filesystem mutations and subprocess launches; database control is opt-in and limited to SQLite exact statement-shape policy; raw-socket network control is opt-in and fail-closed without allowlists; queue/pubsub control is opt-in and limited to representative synchronous enqueue/publish APIs without broker replay or worker execution; unsupported HTTP client control is opt-in and fail-closed without response replay or mocking; non-SQLite databases, unsupported queue/pubsub SDKs, linked native/FFI and process-escape paths, and other effects are not controlled.
- Managed hosted sandboxing, VM/microVM isolation guarantees, and Daytona-backed execution. Sandboxed replay is local Docker only in V1; the built-in image builder prepares local Docker images only, with bounded recipes for local path dependencies and package-name-only apt installs.
- Cloud hosted runners, issue grouping, account auth, billing, and team collaboration.
- Auto-sync scope customization beyond the current fixed `everything_generated` scope.
- Hosted report UI beyond tokenized private artifact/share-link pages.
- Safe workflow regression for broad database backends, unsupported queue/pubsub SDKs, workflows that require unsupported HTTP client response replay, linked native/FFI or process-escape paths, missing/failed/old local-container sandbox evidence, or sandbox guarantees beyond local Docker containment.
- Broad failed-flow generation beyond final failed provider boundaries.
- Hard Celery, RQ, or APScheduler integrations.
- Perturbation mode.

See MVP Limitations for the rationale behind these boundaries.