Create Video

curl --request POST \
  --url https://octopusx.ai/v1/video/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "content": [
    {
      "type": "text",
      "text": "<string>"
    },
    {
      "type": "image_url",
      "image_url": { "url": "<string>" },
      "role": "<string>"
    },
    {
      "type": "video_url",
      "video_url": { "url": "<string>" },
      "role": "<string>"
    },
    {
      "type": "audio_url",
      "audio_url": { "url": "<string>" },
      "role": "<string>"
    }
  ],
  "metadata": {
    "duration": 123,
    "resolution": "<string>",
    "ratio": "<string>",
    "seed": 123,
    "camera_fixed": true,
    "watermark": true,
    "generate_audio": true,
    "return_last_frame": true,
    "draft": true,
    "service_tier": "<string>",
    "execution_expires_after": 123,
    "callback_url": "<string>"
  }
}
'

{
  "task_id": "cgt-20260412163502-x8k2m"
}

Create Video

Submit a Seedance 2.0 video generation task. The endpoint supports text-to-video, first-frame and first-and-last-frame image-to-video, reference image/video/audio, video continuation, video editing, and multimodal composition. For assets already in the media library, reference them in content as asset://{assetId} (see Upload Assets).

Method and Path

POST /v1/video/generations

Request Examples

{
  "model": "doubao-seedance-2-0-260128",
  "content": [
    {
      "type": "text",
      "text": "First-person view fruit tea ad: 0-2s picking apples by hand; 2-4s cut to pouring into a shaker cup and shaking; 4-6s close-up of pouring into a clear cup; 6-8s raise the cup toward the camera"
    },
    {
      "type": "image_url",
      "image_url": { "url": "https://example.com/apple.jpg" },
      "role": "reference_image"
    },
    {
      "type": "image_url",
      "image_url": { "url": "https://example.com/cup.jpg" },
      "role": "reference_image"
    },
    {
      "type": "video_url",
      "video_url": { "url": "https://example.com/pov_reference.mp4" },
      "role": "reference_video"
    },
    {
      "type": "audio_url",
      "audio_url": { "url": "https://example.com/bgm.mp3" },
      "role": "reference_audio"
    }
  ],
  "metadata": {
    "duration": 8,
    "resolution": "720p",
    "ratio": "16:9",
    "generate_audio": true,
    "watermark": false
  }
}

Response Examples

{
  "task_id": "cgt-20260412163502-x8k2m"
}

After submission, poll the task status with Query Task, passing the returned task_id.
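As a rough illustration of the submission step, the sketch below prepares the POST request with Python's standard library and reads task_id from the response body. The helper names (build_request, parse_task_id, submit) and the 30-second timeout are assumptions made for this example, not part of the API. Polling is left out because the Query Task endpoint is documented on its own page.

```python
import json
import urllib.request

API_URL = "https://octopusx.ai/v1/video/generations"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_task_id(response_body: bytes) -> str:
    """Extract task_id from the JSON response body."""
    return json.loads(response_body)["task_id"]


def submit(api_key: str, payload: dict, timeout: float = 30.0) -> str:
    """Send the request and return task_id. This performs a network call."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_task_id(response.read())
```

Keeping request construction separate from the network call makes the payload and headers easy to inspect before anything is sent.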

Authentication

Authorization: Bearer YOUR_API_KEY

Body

model

string

required

Model name:

doubao-seedance-2-0-260128: Standard version, optimized for the best visual quality and complex shot planning
doubao-seedance-2-0-fast-260128: Fast version, optimized for low latency and cost-sensitive scenarios

content

array<object>

required

Multimodal input array; the order affects role assignment.

content[].type

string

required

Content type: text, image_url, video_url, audio_url, draft_task.

content[].text

string

Required when type=text; prompt text.

content[].image_url

object

Used when type=image_url; must include url.

content[].image_url.url

string

required

Public image URL or asset reference asset://{assetId}.

content[].video_url

object

Used when type=video_url; must include url.

content[].video_url.url

string

required

Public video URL or asset://{assetId}.

content[].audio_url

object

Used when type=audio_url; must include url.

content[].audio_url.url

string

required

Public audio URL or asset://{assetId}.

content[].draft_task

object

Used when type=draft_task; must include id, and it must be the only element in content.

content[].draft_task.id

string

required

Draft task ID, used to continue generation from a draft.

content[].role

string

Media role:

first_frame: first frame (image)
last_frame: last frame (image)
reference_image: reference image
reference_video: reference/source video (continuation, editing)
reference_audio: reference audio (requires metadata.generate_audio=true)

metadata

object

Video generation parameters; all are optional.

metadata.duration

integer

Video duration in seconds. Valid range [4, 15] or -1 (automatically determined by the model), default 5.

metadata.resolution

string

Resolution: 480p, 720p, 1080p, default 720p.

metadata.ratio

string

Aspect ratio: 16:9, 9:16, 1:1, 4:3, adaptive, default 16:9.

metadata.frames

integer

Total number of video frames. Mutually exclusive with duration: set only one of the two. If both are provided, frames takes precedence over duration.

metadata.seed

integer

Random seed. The same seed plus the same input can produce similar results.

metadata.camera_fixed

boolean

Whether to keep the camera fixed (suppress camera movement), default false.

metadata.watermark

boolean

Whether to add a watermark in the bottom-right corner of the video, default true.

metadata.generate_audio

boolean

Whether to generate or synthesize audio. Must be true when using reference_audio, default false.

metadata.return_last_frame

boolean

Whether to return the final frame image URL for subsequent continuation, default false.

metadata.draft

boolean

Draft mode: faster generation with slightly lower quality, suitable for previews, default false.

metadata.service_tier

string

Service tier; defaults to the value default.

metadata.execution_expires_after

integer

Maximum task execution time in seconds, range [3600, 259200] (1 hour to 3 days), default 172800 (2 days).

metadata.callback_url

string

URL to call back when the task completes.
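The ranges and allowed values listed for metadata can be checked on the client before a request is sent. The following is a minimal sketch under that assumption. The function name check_metadata and the wording of its messages are hypothetical, and the server may enforce additional constraints not covered here.

```python
ALLOWED_RESOLUTIONS = {"480p", "720p", "1080p"}
ALLOWED_RATIOS = {"16:9", "9:16", "1:1", "4:3", "adaptive"}


def check_metadata(metadata: dict) -> list:
    """Return a list of problems; an empty list means none were detected."""
    problems = []

    # duration: -1 (model decides) or an integer within [4, 15]
    duration = metadata.get("duration")
    if duration is not None and duration != -1 and not (4 <= duration <= 15):
        problems.append("duration must be -1 or within [4, 15]")

    # frames and duration are documented as mutually exclusive
    if "frames" in metadata and "duration" in metadata:
        problems.append("set only one of frames and duration")

    resolution = metadata.get("resolution")
    if resolution is not None and resolution not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 480p, 720p, or 1080p")

    ratio = metadata.get("ratio")
    if ratio is not None and ratio not in ALLOWED_RATIOS:
        problems.append("ratio must be 16:9, 9:16, 1:1, 4:3, or adaptive")

    # execution_expires_after: 1 hour to 3 days, in seconds
    expires = metadata.get("execution_expires_after")
    if expires is not None and not (3600 <= expires <= 259200):
        problems.append("execution_expires_after must be within [3600, 259200]")

    return problems
```

For example, check_metadata({"duration": 3}) reports one problem, while the metadata object from the request example passes without any.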

content Mixing Rules

Requests that violate any of the following rules may be rejected with HTTP 400:

reference_image cannot appear together with first_frame / last_frame
audio_url cannot be the only input in content; it must be paired with at least an image or video
draft_task must be the only element in the content array
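The rules above, together with the generate_audio requirement stated for reference_audio, can be expressed as a small pre-flight check. This is a sketch only. The function name check_content is an assumption, and the audio rule is read here as "any audio_url item needs at least one image_url or video_url item alongside it".

```python
from typing import Optional


def check_content(content: list, metadata: Optional[dict] = None) -> list:
    """Return a list of mixing-rule violations found in a content array."""
    problems = []
    types = {item.get("type") for item in content}
    roles = {item.get("role") for item in content}

    # reference_image cannot be combined with first_frame / last_frame
    if "reference_image" in roles and roles & {"first_frame", "last_frame"}:
        problems.append("reference_image cannot appear with first_frame/last_frame")

    # audio must be paired with at least an image or a video
    if "audio_url" in types and not types & {"image_url", "video_url"}:
        problems.append("audio_url must be paired with an image or video")

    # draft_task must be the only element in content
    if "draft_task" in types and len(content) != 1:
        problems.append("draft_task must be the only element in content")

    # reference_audio requires metadata.generate_audio=true
    if "reference_audio" in roles and not (metadata or {}).get("generate_audio", False):
        problems.append("reference_audio requires metadata.generate_audio=true")

    return problems
```

Running a check like this before submission surfaces a likely 400 without spending a request.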

Generation Mode Comparison

Mode	Request Example Label	`content` Key Points
Text to Video	Text to Video	`text` + optional reference image/video/audio
First-Frame Image to Video	First-Frame Image to Video	`text` + `first_frame`
First-and-Last-Frame Image to Video	First-and-Last-Frame Image to Video	`text` + `first_frame` + `last_frame`
Reference Image to Video	Reference Image to Video	`text` + `reference_image`
Video Continuation	Video Continuation	`text` + `reference_video`
Video Editing	Video Editing	`text` + `reference_video` + `reference_image`
Multimodal Composition	Multimodal Composition	`text` + multiple types of references
Reference Assets	Reference Assets	Each URL uses `asset://{assetId}`
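To make the table concrete, the sketch below assembles content arrays for three of the modes. All helper names are hypothetical, as are the asset ID and URLs used in the usage note. Only the field names and role values come from the Body section.

```python
def text_item(prompt: str) -> dict:
    """Prompt text entry."""
    return {"type": "text", "text": prompt}


def image_item(url: str, role: str) -> dict:
    """Image entry; role is first_frame, last_frame, or reference_image."""
    return {"type": "image_url", "image_url": {"url": url}, "role": role}


def video_item(url: str) -> dict:
    """Source or reference video entry."""
    return {"type": "video_url", "video_url": {"url": url}, "role": "reference_video"}


def asset_ref(asset_id: str) -> str:
    """Build an asset://{assetId} reference for a media-library asset."""
    return f"asset://{asset_id}"


def first_and_last_frame(prompt: str, first_url: str, last_url: str) -> list:
    """First-and-Last-Frame Image to Video: text + first_frame + last_frame."""
    return [
        text_item(prompt),
        image_item(first_url, "first_frame"),
        image_item(last_url, "last_frame"),
    ]


def video_continuation(prompt: str, video_url: str) -> list:
    """Video Continuation: text + reference_video."""
    return [text_item(prompt), video_item(video_url)]


def video_editing(prompt: str, video_url: str, image_url: str) -> list:
    """Video Editing: text + reference_video + reference_image."""
    return [
        text_item(prompt),
        video_item(video_url),
        image_item(image_url, "reference_image"),
    ]
```

A call such as video_editing("Replace the cup", asset_ref("my-video-id"), "https://example.com/cup.jpg") mixes a media-library asset with a public URL in the same content array.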

Response

task_id

string

Task ID, used for Query Task.
