Vidu Reference-to-Video

curl --request POST \
  --url https://octopusx.ai/vidu/ent/v2/reference2video \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "subjects": [
    {
      "id": "<string>",
      "images": [
        "<string>"
      ],
      "voice_id": "<string>"
    }
  ],
  "prompt": "<string>",
  "audio": true,
  "voice_id": "<string>",
  "is_rec": true,
  "bgm": true,
  "duration": 123,
  "seed": 123,
  "resolution": "<string>",
  "off_peak": true,
  "watermark": true,
  "wm_position": 123,
  "wm_url": "<string>",
  "payload": "<string>",
  "meta_data": "<string>"
}
'

{
  "task_id": "48038932-0ff5-4251-8b4b-7a76c09fd114",
  "status": "processing",
  "created_at": 1774494511
}

Reference-to-video API in the official Vidu format. Requests are submitted as application/json.

The endpoint is POST /vidu/ent/v2/reference2video.
It supports subject-reference video generation with multiple subjects.
A successful submission returns the task_id and status of the new task. Use Query Task to poll for results.
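
The submit-then-poll flow can be sketched roughly as follows. The Query Task request format is not covered on this page, so this sketch takes a caller-supplied fetch_status function (a hypothetical name) that returns the current status string for a task. The polling interval and timeout are illustrative assumptions, not documented values.

```python
import time
from typing import Callable

# Statuses that end polling, taken from the Response section of this page.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task leaves "processing" or the timeout elapses.

    fetch_status is a hypothetical wrapper around the Query Task API.
    """
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing sleep as a parameter keeps the loop easy to test without real delays.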

Supported Models

viduq3-pro: Efficiently generates high-quality audio and video content, making videos more vivid, realistic, and three-dimensional
viduq2-pro / viduq2-turbo: New models with good results and rich details
viduq2-pro-fast: Lowest price, fast generation speed
viduq1 / viduq1-classic: Clear visuals and stable camera movement
vidu2.0: Fast generation speed

Method and Path

POST /vidu/ent/v2/reference2video

Request Example

curl -X POST https://octopusx.ai/vidu/ent/v2/reference2video \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "viduq3-pro",
    "subjects": [
      {
        "id": "subject_1",
        "images": ["https://example.com/subject.png"],
        "voice_id": "voice_001"
      }
    ],
    "prompt": "Have @subject_1 walk forward and smile",
    "audio": true,
    "is_rec": false,
    "bgm": false,
    "duration": 5,
    "seed": 0,
    "resolution": "720p"
  }'
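
For readers who prefer Python, here is a minimal sketch of a comparable request built with the standard library. It uses the same endpoint, headers, and field values as the curl example, with the prompt written in English. The helper name build_request is an assumption for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://octopusx.ai/vidu/ent/v2/reference2video"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a reference-to-video task."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "model": "viduq3-pro",
    "subjects": [
        {
            "id": "subject_1",
            "images": ["https://example.com/subject.png"],
            "voice_id": "voice_001",
        }
    ],
    "prompt": "Have @subject_1 walk forward and smile",
    "audio": True,
    "is_rec": False,
    "bgm": False,
    "duration": 5,
    "seed": 0,
    "resolution": "720p",
}
request = build_request("YOUR_API_KEY", body)

# To send the request and decode the JSON reply:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```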

Response Example

{
  "task_id": "48038932-0ff5-4251-8b4b-7a76c09fd114",
  "status": "processing",
  "created_at": 1774494511
}

Authentication

Authorization: Bearer YOUR_API_KEY

Body

model

string

required

Model name. Supported: viduq3-pro, viduq2-pro, viduq2-turbo, viduq2-pro-fast, viduq1, viduq1-classic, vidu2.0.

subjects

array

required

Subject array. Multiple subjects are supported. Each subject includes id (subject ID), images (image URL(s) corresponding to the subject, up to 3 images per subject), and voice_id (voice ID, optional).

subjects[].id

string

required

Subject ID. Reference the subject in the prompt by writing @ followed by this ID, for example @subject_1.

subjects[].images

array

required

Subject images, up to 3 per subject. Note 1: Each image is given either as an image URL or as Base64-encoded data. Note 2: Supported image formats: png, jpeg, jpg, webp. Note 3: Image dimensions must not be smaller than 128*128 pixels, the aspect ratio must be less than 1:4 or 4:1, and the file size must not exceed 50M. Note 4: The POST body of the HTTP request must not exceed 20MB. Base64-encoded images must include the content-type prefix, for example: data:image/png;base64,{base64_encode}.
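
The Base64 form described above can be produced with a small helper. This is a sketch under assumptions: the name to_data_uri is hypothetical, and "50M" is interpreted as 50 MiB. It checks only the format and size rules, not image dimensions or aspect ratio.

```python
import base64

MAX_IMAGE_BYTES = 50 * 1024 * 1024  # assumes the "50M" limit means 50 MiB

# Content types for the formats listed in the field description.
MIME_TYPES = {
    "png": "image/png",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "webp": "image/webp",
}


def to_data_uri(image_bytes: bytes, extension: str) -> str:
    """Encode raw image bytes as a data URI with the content-type prefix."""
    ext = extension.lower().lstrip(".")
    if ext not in MIME_TYPES:
        raise ValueError(f"unsupported image format: {extension!r}")
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 50M size limit")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{MIME_TYPES[ext]};base64,{encoded}"
```

Base64 encoding grows data by roughly one third, so inline images need to total well under 20MB before encoding to fit the request body limit. Larger images are probably better passed as URLs.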

subjects[].voice_id

string

Voice ID. Determines the voice timbre in the video. If empty, the system automatically recommends one. For available values, see the New Voice List, or use the Voice Cloning API to clone any voice timbre; voice_id values from both sources are interoperable.
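
One way to assemble a subjects entry and enforce the documented limit of 3 images per subject is sketched below. The Subject class and its to_dict method are hypothetical helpers, and requiring at least one image is an assumption based on images being a required field.

```python
from dataclasses import dataclass
from typing import Optional

MAX_IMAGES_PER_SUBJECT = 3  # documented limit


@dataclass
class Subject:
    id: str
    images: list[str]
    voice_id: Optional[str] = None

    def to_dict(self) -> dict:
        """Return the JSON-ready entry for the subjects array."""
        if not self.id:
            raise ValueError("subject id is required")
        # Assumption: a subject needs at least one image.
        if not 1 <= len(self.images) <= MAX_IMAGES_PER_SUBJECT:
            raise ValueError("each subject needs 1 to 3 images")
        entry = {"id": self.id, "images": list(self.images)}
        if self.voice_id:  # omit the optional field when unset
            entry["voice_id"] = self.voice_id
        return entry
```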

prompt

string

Text prompt. The text description for video generation. Reference a subject by writing @ followed by its subject ID. If is_rec is set to true, the model ignores the prompt given in this parameter and uses the recommended prompt instead.
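
A quick client-side check can catch prompts that mention a subject ID missing from the subjects array. This is only a sketch: the function name unknown_mentions is hypothetical, and it assumes subject IDs consist of ASCII letters, digits, and underscores, which this page does not state.

```python
import re

# Assumption: IDs are ASCII word characters, as in "@subject_1".
MENTION = re.compile(r"@(\w+)", re.ASCII)


def unknown_mentions(prompt: str, subject_ids: list[str]) -> list[str]:
    """Return the @-mentioned IDs in prompt that are not in subject_ids."""
    known = set(subject_ids)
    return [name for name in MENTION.findall(prompt) if name not in known]
```

For example, unknown_mentions("Have @subject_1 wave at @subject_2", ["subject_1"]) returns ["subject_2"].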

audio

boolean

Whether to output audio together with the video. true: output a video with dialogue and background sound; false: output a silent video.

voice_id

string

Voice ID (global). Determines the voice timbre in the video. If empty, the system automatically recommends one. Note: this parameter has no effect on q3 models.

is_rec

boolean

Whether to use the recommended prompt. true: the system automatically recommends a prompt; false: generate video based on the input prompt.

bgm

boolean

Background music. true: the system will automatically select suitable music from the preset BGM library and add it; false: do not add BGM.

duration

integer

Video duration in seconds. For the viduq2 series, the default is 5 seconds and the allowed range is 1-10 seconds.

seed

integer

Random seed. If omitted or set to 0, a random number is used; if set manually, the specified seed is used.

resolution

string

Resolution. The default value depends on the model and video duration. For viduq2 (1-10 seconds), the default is 720p and the allowed values are 540p, 720p, and 1080p.
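
The viduq2 limits for duration and resolution can be checked before submitting. The sketch below covers only the viduq2 series, since limits for other models are not listed on this page. The function name viduq2_video_options is hypothetical.

```python
VIDUQ2_RESOLUTIONS = {"540p", "720p", "1080p"}


def viduq2_video_options(duration: int = 5, resolution: str = "720p") -> dict:
    """Validate duration and resolution against the documented viduq2 limits."""
    if not 1 <= duration <= 10:
        raise ValueError("viduq2 duration must be between 1 and 10 seconds")
    if resolution not in VIDUQ2_RESOLUTIONS:
        raise ValueError(f"unsupported viduq2 resolution: {resolution!r}")
    return {"duration": duration, "resolution": resolution}
```

The defaults in the signature match the documented viduq2 defaults of 5 seconds and 720p.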

off_peak

boolean

Off-peak mode. true: generate video during off-peak periods; false: generate video immediately.

watermark

boolean

Whether to add a watermark. true: add watermark; false: do not add watermark.

wm_position

integer

Watermark position. 1: top-left; 2: top-right; 3: bottom-right; 4: bottom-left.

wm_url

string

Watermark image URL. If omitted, the default watermark "内容由 AI 生成" ("Content generated by AI") is used.
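
The numeric wm_position codes are easy to mix up, so a named enumeration may help. This is an illustrative sketch: WatermarkPosition and watermark_fields are hypothetical names, and only the numeric codes come from this page.

```python
from enum import IntEnum
from typing import Optional


class WatermarkPosition(IntEnum):
    TOP_LEFT = 1
    TOP_RIGHT = 2
    BOTTOM_RIGHT = 3
    BOTTOM_LEFT = 4


def watermark_fields(position: WatermarkPosition, image_url: Optional[str] = None) -> dict:
    """Return the watermark-related body fields for a watermarked video."""
    fields = {"watermark": True, "wm_position": int(position)}
    if image_url:  # without wm_url the default watermark is used
        fields["wm_url"] = image_url
    return fields
```

The returned dictionary can be merged into the request body, for example with body.update(watermark_fields(WatermarkPosition.BOTTOM_RIGHT)).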

payload

string

Pass-through parameter. No processing is performed; data transmission only.

meta_data

string

Metadata identifier. JSON-formatted string, pass-through field.
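
Because meta_data is a JSON-formatted string rather than a nested object, the value has to be serialized before it goes into the body. A minimal sketch, in which the helper name with_passthrough and the metadata keys in the usage note are made up for illustration:

```python
import json


def with_passthrough(body: dict, meta: dict, payload: str = "") -> dict:
    """Return a copy of body with meta_data (and optionally payload) attached."""
    out = dict(body)
    # meta_data must be a string containing JSON, not a JSON object.
    out["meta_data"] = json.dumps(meta, ensure_ascii=False)
    if payload:
        out["payload"] = payload
    return out
```

For example, with_passthrough({"model": "viduq2-pro"}, {"order_id": "A-1"}) sets meta_data to the string {"order_id": "A-1"}.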

Response

task_id

string

Task ID, used to query the task status.

status

string

Task status. Possible values: processing, failed, completed.

created_at

integer

Creation time as a Unix timestamp in seconds.
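
The response fields can be parsed into a typed record as sketched below. TaskSubmission and parse_submission are hypothetical names, and the sketch assumes created_at is expressed in seconds, which matches the magnitude of the example value.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone

KNOWN_STATUSES = {"processing", "failed", "completed"}


@dataclass(frozen=True)
class TaskSubmission:
    task_id: str
    status: str
    created_at: datetime


def parse_submission(raw: str) -> TaskSubmission:
    """Parse the JSON body returned after submitting a task."""
    data = json.loads(raw)
    status = data["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected task status: {status!r}")
    return TaskSubmission(
        task_id=data["task_id"],
        status=status,
        # Assumption: created_at is a Unix timestamp in seconds.
        created_at=datetime.fromtimestamp(data["created_at"], tz=timezone.utc),
    )
```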
