The OpenAI Responses adapter is our reference LLMAdapter implementation. It enables MeshAgent agents to use the OpenAI Responses API, handling streaming, tool calls, and model-specific settings.

Key features

  • Model defaults: Reads the model name from the constructor (model=) or the OPENAI_MODEL environment variable. Override per message by passing model in the chat payload; ChatBot forwards it to next().
  • Session context defaults: Creates AgentSessionContext(system_role=None) so system/developer prompts are driven by the caller or wrapper agent.
  • Tool bundling: Converts the supplied toolkits into OpenAI tool definitions (both standard JSON function tools and OpenAI-native tools like computer_use_preview, web_search_preview, image_generation).
  • Streaming support: Consumes the streaming response API, emitting events such as reasoning summaries, partial content, and tool call updates.
  • Parallel tool calls: Optionally enables OpenAI’s parallel_tool_calls setting (disabled automatically for models that do not support it).
  • Structured output: If output_schema is provided to next(), requests JSON schema output and validates the result.
  • Automatic compaction: Uses OpenAI Responses auto-compaction (context_management) by default.
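The model-resolution order described above can be sketched as follows. `resolve_model` is an illustrative helper, not part of the adapter's public API, and the exact precedence between the constructor argument and `OPENAI_MODEL` is an assumption here:

```python
import os

def resolve_model(per_message_model=None, constructor_model=None, default="gpt-5.2"):
    """Illustrative precedence: per-message override > constructor arg > env var > default."""
    if per_message_model:
        return per_message_model
    if constructor_model:
        return constructor_model
    return os.environ.get("OPENAI_MODEL", default)

# A model passed in the chat payload wins over the constructor default.
print(resolve_model(per_message_model="gpt-4.1", constructor_model="gpt-5.2"))
# → gpt-4.1
```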

Constructor parameters

Python
OpenAIResponsesAdapter(
    model: str = "gpt-5.2",
    parallel_tool_calls: Optional[bool] = None,
    client: Optional[AsyncOpenAI] = None,
    response_options: Optional[dict] = None,
    reasoning_effort: Optional[str] = None,
    provider: str = "openai",
    log_requests: bool = False,
    max_output_tokens: Optional[int] = 32000,
    context_management: Literal["auto", "standalone", "none"] = "auto",
    compaction_threshold: int = 200000,
)
  • model – default model name; can be overridden per message.
  • parallel_tool_calls – request parallel tool execution when supported.
  • client – reuse an existing AsyncOpenAI client; otherwise the adapter creates one via meshagent.openai.proxy.get_client.
  • response_options – extra parameters passed to responses.create.
  • reasoning_effort – populates the Responses API reasoning options.
  • provider – label emitted in telemetry and logs.
  • log_requests – when true, logs HTTP requests for debugging.
  • max_output_tokens – cap output tokens per response; also used when deciding whether to compact the context.
  • context_management – controls compaction behavior:
    • auto attaches context_management to each request and lets Responses handle compaction.
    • standalone uses manual responses.compact preflight in the adapter.
    • none disables both auto and manual compaction.
  • compaction_threshold – threshold used for compaction (compact_threshold in Responses context_management entries and manual preflight trigger in standalone mode).

Tool provider integration

The adapter includes toolkit builders and tool classes for OpenAI-native tools. Agents can use them directly or override them with agent-specific wrappers that add persistence (for example, the ChatBot’s thread-aware image generation builder, which saves partial/final images to room storage and updates the thread document).
  • Image generation - ImageGenerationConfig, ImageGenerationToolkitBuilder, ImageGenerationTool
  • Local shell - LocalShellConfig, LocalShellToolkitBuilder, LocalShellTool
  • MCP - MCPConfig, MCPToolkitBuilder, MCPTool
  • Web search preview - WebSearchConfig, WebSearchToolkitBuilder, WebSearchTool
  • File Search - FileSearchTool
  • Code Interpreter - CodeInterpreterTool
  • Reasoning - ReasoningTool

Handling a turn

When next() is called, it:
  1. Bundles tools - Collects the tools from your toolkits and packages them for OpenAI’s API
  2. Calls the model - Sends messages and tools to OpenAI’s API
  3. Handles responses - Processes text, tool calls, or structured output
  4. Executes tools - When the model requests tools, executes them and formats results
  5. Loops - Continues calling the model with tool results until it produces a final answer
  6. Returns Result - Gives you the final output (text or structured data)
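The loop in steps 2–5 can be illustrated with a toy driver. `run_turn` and `fake_model` below are stand-ins for the adapter internals, not the real implementation:

```python
def run_turn(model, tools, messages):
    """Toy version of the next() loop: call the model, execute any requested
    tools, feed results back, and stop when the model returns a final answer."""
    while True:
        response = model(messages)
        if response["type"] == "final":
            return response["content"]
        # The model asked for a tool call: execute it and append the result.
        result = tools[response["tool"]](**response["arguments"])
        messages.append({"role": "tool", "name": response["tool"], "content": result})

# A scripted stand-in for the model: first request a tool, then finish.
def fake_model(messages):
    if not any(m.get("role") == "tool" for m in messages):
        return {"type": "tool_call", "tool": "add", "arguments": {"a": 2, "b": 3}}
    return {"type": "final", "content": f"The sum is {messages[-1]['content']}"}

tools = {"add": lambda a, b: str(a + b)}
print(run_turn(fake_model, tools, [{"role": "user", "content": "What is 2 + 3?"}]))
# → The sum is 5
```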

Context compaction

OpenAIResponsesAdapter defaults to context_management="auto" and sends context_management=[{"type": "compaction", "compact_threshold": 200000}] with each request. You can switch to:
  • context_management="standalone" to use manual responses.compact before a turn when usage crosses the threshold.
  • context_management="none" to disable compaction management.
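As a sketch, the request-side effect of the three modes might look like the following. `build_context_management` is a hypothetical helper for illustration, not an adapter method:

```python
def build_context_management(mode="auto", compaction_threshold=200000):
    """Return the context_management entries to attach to a request, if any.

    - "auto": attach a compaction entry so the Responses API compacts for us.
    - "standalone": return None; the adapter compacts manually before the turn.
    - "none": return None; no compaction happens at all.
    """
    if mode == "auto":
        return [{"type": "compaction", "compact_threshold": compaction_threshold}]
    return None

print(build_context_management("auto"))
# → [{'type': 'compaction', 'compact_threshold': 200000}]
```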