Vidu Reference-to-Video
Use POST /vidu/ent/v2/reference2video to submit a subject-reference video generation task.
Vidu official-format reference-to-video API, submitted as application/json.
- The routing endpoint is POST /vidu/ent/v2/reference2video.
- Requests are currently submitted as application/json.
- Supports subject-reference video generation (multiple subjects).
- After a successful submission, the task's task_id and status are returned. Use Query Task to poll for results.
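As a rough sketch of the points above, the request could be assembled with Python's standard library as shown here. Only the method, path, and content type come from this page; the base URL passed in, the `Authorization: Bearer` header scheme, and the helper name `build_submit_request` are assumptions made for illustration and should be checked against the Authentication section of your deployment.

```python
import json
import urllib.request

# Path documented on this page.
ENDPOINT_PATH = "/vidu/ent/v2/reference2video"


def build_submit_request(base_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Assemble (but do not send) the POST request for a reference-to-video task."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + ENDPOINT_PATH,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication; not confirmed by this page.
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the task; the JSON response is expected to carry the task_id and status mentioned above.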
Supported Models
- viduq3-pro: Efficiently generates high-quality audio and video content, making videos more vivid, realistic, and three-dimensional
- viduq2-pro / viduq2-turbo: New models with good results and rich details
- viduq2-pro-fast: Lowest price, fast generation speed
- viduq1 / viduq1-classic: Clear visuals and stable camera movement
- vidu2.0: Fast generation speed
Method and Path
Request Example
Response Example
Authentication
Body
- Model name. Supported: viduq3-pro, viduq2-pro, viduq2-turbo, viduq2-pro-fast, viduq1, viduq1-classic, vidu2.0.
- Subject array. Multiple subjects are supported. Each subject includes id (subject ID), images (the images for the subject, up to 3 per subject), and voice_id (voice ID, optional).
  - id: Subject ID. Can be referenced later as @subject ID during generation.
  - images: Subject images.
    Note 1: Both Base64-encoded images and image URLs are supported.
    Note 2: Supported image formats: png, jpeg, jpg, webp.
    Note 3: Image dimensions must be at least 128*128, the aspect ratio must be less extreme than 1:4 or 4:1, and the file size must not exceed 50 MB.
    Note 4: The POST body of the HTTP request must not exceed 20 MB, and Base64 data must include the content-type prefix, for example: data:image/png;base64,{base64_encode}.
  - voice_id: Voice ID. Determines the voice timbre in the video. If empty, the system automatically recommends one. For available values, see the New Voice List, or use the Voice Cloning API to clone any voice timbre; voice_id values from either source are interchangeable.
- Text prompt. The text description for video generation. You can reference subjects via @subject ID. If the recommended-prompt parameter is_rec is used, the model ignores the prompt entered in this parameter.
- Audio/video direct output. true: output a video with dialogue and background sound; false: output a silent video.
- Voice ID (global). Determines the voice timbre in the video. If empty, the system automatically recommends one. Note: does not take effect for q3 models.
- Whether to use the recommended prompt (is_rec). true: the system automatically recommends a prompt; false: generate the video from the input prompt.
- Background music. true: the system automatically selects suitable music from the preset BGM library and adds it; false: do not add BGM.
- Video duration (seconds). viduq2 series: default 5 seconds, selectable from 1 to 10 seconds.
- Random seed. If omitted or set to 0, a random number is used; if set manually, the specified seed is used.
- Resolution. The default value depends on the model and video duration. viduq2 (1-10 seconds): default 720p; options are 540p, 720p, 1080p.
- Off-peak mode. true: generate the video during off-peak periods; false: generate the video immediately.
- Whether to add a watermark. true: add a watermark; false: do not add a watermark.
- Watermark position. 1: top-left; 2: top-right; 3: bottom-right; 4: bottom-left.
- Watermark image URL. When not provided, the default watermark "内容由 AI 生成" ("Content generated by AI") is used.
- Pass-through parameter. No processing is performed; data transmission only.
- Metadata identifier. JSON-formatted string, pass-through field.
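The subject and image rules above can be illustrated with a small sketch that encodes an image as a data URI and builds one subject entry. The field names id, images, and voice_id come from this page; the helper names `to_data_uri` and `make_subject`, the mapping of jpg to the `jpeg` MIME subtype, and the requirement of at least one image are assumptions for illustration. Dimension, aspect-ratio, and size checks are left out.

```python
import base64
from typing import List, Optional

SUPPORTED_FORMATS = {"png", "jpeg", "jpg", "webp"}  # formats listed in Note 2
MAX_IMAGES_PER_SUBJECT = 3  # "up to 3 images per subject"


def to_data_uri(image_bytes: bytes, fmt: str) -> str:
    """Encode raw image bytes as data:image/<subtype>;base64,<payload>."""
    fmt = fmt.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported image format: {fmt}")
    # Assumption: "jpg" is sent with the standard MIME subtype "jpeg".
    subtype = "jpeg" if fmt == "jpg" else fmt
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{subtype};base64,{payload}"


def make_subject(subject_id: str, images: List[str], voice_id: Optional[str] = None) -> dict:
    """Build one subject entry with id, images, and an optional voice_id."""
    # Assumption: at least one image is required; the page only states the maximum.
    if not 1 <= len(images) <= MAX_IMAGES_PER_SUBJECT:
        raise ValueError("each subject takes between 1 and 3 images")
    subject = {"id": subject_id, "images": images}
    if voice_id:
        subject["voice_id"] = voice_id
    return subject
```

Each entry in `images` may be either a plain image URL or a string produced by `to_data_uri`; because Base64 inflates payload size by roughly a third, URLs are the safer choice when the request body approaches the 20 MB limit.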
Response
- Task ID (task_id), used to query the task status.
- Task status (status). Possible values: processing (in progress), failed, completed.
- Creation timestamp (Unix timestamp).
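Because a task starts in processing and ends in either completed or failed, a client typically polls until one of the terminal states is reached. The following is a minimal sketch of such a loop. The Query Task endpoint is not described on this page, so the loop takes a hypothetical `fetch_status` callable that returns the current status string; the function name `wait_for_task`, the interval, and the attempt limit are likewise assumptions.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}  # status values listed on this page


def wait_for_task(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    max_attempts: int = 60,
) -> str:
    """Poll until the task leaves 'processing' or the attempts run out."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if status != "processing":
            raise ValueError(f"unexpected status: {status!r}")
        # Sleep between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            time.sleep(interval)
    raise TimeoutError("task still processing after max_attempts polls")
```

In practice `fetch_status` would wrap a call to Query Task using the task_id returned at submission and extract the status field from its response.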