Text Models
OpenAI Multimodal Responses API
Create responses using the OpenAI Responses API format, with support for multimodal input, tool calling, streaming output, and context compression.
POST
OpenAI Multimodal Responses API
The Responses API is a unified response interface for multimodal workflows, tool calling, and context continuation. Compared with Chat Completions, its input, output, and tool-calling structure is better suited for complex task orchestration.Request Body
Model name.
User input. Can be a string or a structured message array.
Developer or system-level instructions.
The ID of the previous response. Can be used for context continuation when supported by the upstream.
Tool list. Supports function tools, and can also forward upstream-compatible built-in tools.
Tool selection strategy.
Maximum number of output tokens. Explicitly passing
0 will be preserved and forwarded to supported upstreams.Reasoning configuration, for example
{ "effort": "medium", "summary": "auto" }.Text output configuration, commonly used for JSON Schema structured output.
Whether to enable SSE streaming output.
Whether to include token usage in streaming responses.
Controls whether the upstream stores the request and response. This field is allowed to be forwarded by default, and can be disabled by channel settings.
Additional business-side metadata.
Additional fields to include in the request response; the specific values depend on the upstream implementation.
Request Example
Multimodal Input
Tool Calling
Streaming Output
Response Example
Context Compression
/v1/responses, and the commonly used fields are model, input, instructions, and previous_response_id.