Text Models
Gemini Native Format
Use Google Gemini native paths and request bodies to call generateContent, streamGenerateContent, and model queries.
POST
Gemini Native Format
Gemini Native Format preserves the Google Gemini API paths and request bodies. It is suitable for business integrations that already use the Gemini SDK, thecontents/parts structure, or safety settings configuration.
Authentication
Paths
| Method | Path | Description |
|---|---|---|
GET | /v1beta/models | Gemini model list |
POST | /v1beta/models/{model}:generateContent | Non-streaming content generation |
POST | /v1beta/models/{model}:streamGenerateContent | Streaming content generation |
Request Body
List of conversation contents. Each content item contains
role and parts.Content parts. Supports
text, inlineData, fileData, functionCall, functionResponse, and more.System-level instruction (System Prompt). Used to define model behavior, role setting, and response style. Compatible with both
systemInstruction and system_instruction forms.Generation configuration, used to control model output behavior. Supports the following fields:
temperature: Controls output randomness; higher values produce more diverse results.topP: Nucleus Sampling probability threshold.topK: Samples only from the top K most probable tokens.candidateCount: Number of candidate results to return.maxOutputTokens: Maximum number of output tokens.stopSequences: Stops generation when the specified strings are encountered.responseMimeType: Specifies the output format, e.g.text/plain,application/json.responseSchema: JSON Schema structured output constraint.presencePenalty: Reduces repetitive topics and encourages new content generation.frequencyPenalty: Reduces repetitive words or sentences.seed: Fixes the random seed for reproducible results.responseLogprobs: Whether to return token probability information.logprobs: Number of token probabilities to return.audioTimestamp: Whether to return audio timestamps.speechConfig: Speech output configuration (TTS).thinkingConfig: Gemini Thinking model reasoning configuration.
Safety policy configuration. Each item contains
category (risk category) and threshold (risk blocking level).Supported risk categories: HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_HARASSMENT, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_DANGEROUS_CONTENT.Gemini tool declaration list, used to enable function calling, search, code execution, and other capabilities. Supports the following tool types:
functionDeclarations: Function Calling function declarations.googleSearch: Google search capability.codeExecution: Code execution capability.urlContext: URL content parsing capability.retrieval: Retrieval-Augmented Generation (RAG).googleSearchRetrieval: Google retrieval augmentation.
Tool invocation configuration.Commonly used to configure
functionCallingConfig.mode: AUTO (model automatically decides whether to call tools), ANY (force tool calling), NONE (disable tool calling).allowedFunctionNames: Specifies the list of functions allowed to be called.Gemini Cached Content identifier. Used to reuse context cache, reducing token consumption and response latency for long-context requests.
Request Example
Multimodal Input
Streaming Generation
Response Example
Common Safety Settings
| category | Description |
|---|---|
HARM_CATEGORY_HARASSMENT | Harassment content |
HARM_CATEGORY_HATE_SPEECH | Hate speech |
HARM_CATEGORY_SEXUALLY_EXPLICIT | Sexually explicit content |
HARM_CATEGORY_DANGEROUS_CONTENT | Dangerous content |
| threshold | Description |
|---|---|
BLOCK_NONE | Do not block |
BLOCK_ONLY_HIGH | Block high risk only |
BLOCK_MEDIUM_AND_ABOVE | Block medium risk and above |
BLOCK_LOW_AND_ABOVE | Block low risk and above |