The OpenAI Responses adapter is our reference LLMAdapter implementation. It enables MeshAgent agents to use the OpenAI Responses API, handling streaming, tool calls, and model-specific settings.

Key features

  • Model defaults: Reads the model name from the constructor (model=) or the OPENAI_MODEL environment variable. Override per message by passing model in the chat payload; ChatBot forwards it to next().
  • Chat context defaults: Creates AgentChatContext(system_role=None) so system/developer prompts are driven by the caller or wrapper agent.
  • Tool bundling: Converts the supplied toolkits into OpenAI tool definitions (both standard JSON function tools and OpenAI-native tools like computer_use_preview, web_search_preview, image_generation).
  • Streaming support: Consumes the streaming response API, emitting events such as reasoning summaries, partial content, and tool call updates.
  • Parallel tool calls: Optionally enables OpenAI’s parallel_tool_calls setting (disabled automatically for models that do not support it).
  • Structured output: If output_schema is provided to next(), requests JSON schema output and validates the result.
  • Automatic compaction: Uses the OpenAI Responses compaction API to shrink the context when the next request would exceed the model window.

Constructor parameters

Python
OpenAIResponsesAdapter(
    model: str = "gpt-5.2",
    parallel_tool_calls: Optional[bool] = None,
    client: Optional[AsyncOpenAI] = None,
    response_options: Optional[dict] = None,
    reasoning_effort: Optional[str] = None,
    provider: str = "openai",
    log_requests: bool = False,
    max_output_tokens: Optional[int] = 32000,
)
  • model – default model name; can be overridden per message.
  • parallel_tool_calls – request parallel tool execution when supported.
  • client – reuse an existing AsyncOpenAI client; otherwise the adapter creates one via meshagent.openai.proxy.get_client.
  • response_options – extra parameters passed to responses.create.
  • reasoning_effort – populates the Responses API reasoning options.
  • provider – label emitted in telemetry and logs.
  • log_requests – when true, logs HTTP requests for debugging.
  • max_output_tokens – cap output tokens per response; also used when deciding whether to compact the context.
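
As a quick orientation, here is a construction sketch using the parameters above. The import path is an assumption based on the meshagent.openai package mentioned on this page, and the response_options contents are illustrative:

Python
from meshagent.openai import OpenAIResponsesAdapter  # import path is an assumption

# All arguments below are documented in the signature above; anything
# omitted falls back to its default (per the key features, the model name
# can also come from the OPENAI_MODEL environment variable).
llm = OpenAIResponsesAdapter(
    model="gpt-5.2",
    parallel_tool_calls=True,        # request parallel tool execution when supported
    reasoning_effort="medium",       # forwarded to the Responses API reasoning options
    max_output_tokens=16000,         # also reserved when checking for compaction
    response_options={"truncation": "auto"},  # illustrative extra kwargs for responses.create
)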

Tool provider integration

The adapter includes several builders and tool classes for OpenAI-native tools. Agents can use them directly, or override them with agent-specific wrappers that add persistence (for example, the ChatBot’s thread-aware image generation builder that saves partial/final images to room storage and updates the thread document).
  • Image generation - ImageGenerationConfig, ImageGenerationToolkitBuilder, ImageGenerationTool
  • Local shell - LocalShellConfig, LocalShellToolkitBuilder, LocalShellTool
  • MCP - MCPConfig, MCPToolkitBuilder, MCPTool
  • Web search preview - WebSearchConfig, WebSearchToolkitBuilder, WebSearchTool
  • File Search - FileSearchTool
  • Code Interpreter - CodeInterpreterTool
  • Reasoning - ReasoningTool
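
Configs describe an OpenAI-native tool, and builders turn them into toolkits the adapter can bundle. A minimal sketch, assuming each builder accepts its corresponding config object (the exact constructor shapes are not documented on this page):

Python
# Sketch only: the builder/config call shapes below are assumptions.
from meshagent.openai import (  # import path is an assumption
    WebSearchConfig,
    WebSearchToolkitBuilder,
    ImageGenerationConfig,
    ImageGenerationToolkitBuilder,
)

# Each builder wraps an OpenAI-native tool so the adapter can bundle it
# alongside ordinary JSON function tools when a turn starts.
web_search = WebSearchToolkitBuilder(WebSearchConfig())
image_gen = ImageGenerationToolkitBuilder(ImageGenerationConfig())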

Handling a turn

When next() is called, it:
  1. Bundles tools - Collects the tools from your toolkits and packages them for OpenAI’s API
  2. Calls the model - Sends messages and tools to OpenAI’s API
  3. Handles responses - Processes text, tool calls, or structured output
  4. Executes tools - When the model requests tools, executes them and formats results
  5. Loops - Continues calling the model with tool results until it produces a final answer
  6. Returns the result - Gives you the final output (text or structured data)
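
In pseudocode, the loop looks roughly like the sketch below; next() is the real entry point, but the helper names and signatures here are illustrative assumptions, not the adapter’s actual internals:

Python
# Illustrative sketch of the turn loop next() runs internally; the helper
# functions are hypothetical stand-ins, not the adapter's real API.
async def handle_turn(adapter, context, toolkits, output_schema=None):
    tools = bundle_tools(toolkits)        # 1. package toolkits as OpenAI tool definitions
    while True:
        response = await call_model(context, tools, output_schema)  # 2. send messages + tools
        if response.tool_calls:           # 3./4. model requested tools: execute them
            for call in response.tool_calls:
                result = await execute_tool(toolkits, call)
                context.append_tool_result(call, result)
            continue                      # 5. loop with tool results appended
        return parse_output(response, output_schema)  # 6. final text or validated JSON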

Context compaction

OpenAIResponsesAdapter stores token usage from the last response in the chat context metadata. Before issuing the next request it checks whether the prior input + output usage would overflow the model’s context window (while reserving max_output_tokens for the reply). If so, it calls responses.compact, replaces the context messages with the compacted output, and then proceeds with the new request. This is automatic; you do not need to call compaction manually.
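
The pre-request check can be illustrated as follows; the metadata field names and the responses.compact call shape are assumptions, and only the arithmetic reflects the description above:

Python
# Sketch of the automatic compaction check; field names and the
# responses.compact signature are assumptions.
async def maybe_compact(client, context, context_window: int, max_output_tokens: int):
    usage = context.metadata.get("last_usage")  # token usage stored from the last response
    if usage is None:
        return
    # Would prior input + output, plus the reserved reply budget, overflow the window?
    if usage["input_tokens"] + usage["output_tokens"] + max_output_tokens > context_window:
        compacted = await client.responses.compact(input=context.messages)  # assumed shape
        context.messages = compacted.output  # replace messages with the compacted output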