Create Video

The unified Grok video endpoint is POST /v1/video/create, and it takes a JSON request body. Unlike Grok video generation in the OpenAI format, this endpoint uses fields such as images, aspect_ratio, and size, and the prompt can reference multiple images through @img1, @img2.
  • The route is POST /v1/video/create.
  • Reference images are passed in the images array as URLs or base64 data URIs; for text-to-video, pass [].
  • A commonly used model is grok-video-3; use whichever model is actually available on your channel.
  • A successful submission returns status together with id, task_id, or both. Use Query Task with that ID to poll for the result.
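
The submit-then-poll flow described above can be sketched as a small loop. This is a minimal illustration, not the service's client: query_task is a hypothetical callable that wraps the Query Task endpoint (which this page does not define) and returns its decoded JSON, and the interval and attempt limits are arbitrary assumptions.

```python
import time
from typing import Any, Callable, Dict

# Status values this page lists as common; other values may exist on some channels.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_video(
    query_task: Callable[[str], Dict[str, Any]],
    query_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reaches a terminal status or the attempts run out.

    query_task is a hypothetical wrapper around the Query Task endpoint;
    it is injected so this sketch makes no assumption about that endpoint's URL.
    """
    for attempt in range(max_attempts):
        task = query_task(query_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task {query_id} not finished after {max_attempts} polls")
```

Passing the query function in as an argument keeps the loop independent of how Query Task is actually called, and makes it easy to exercise with a stub.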

Method and Path

POST /v1/video/create

Request Example

# Text-to-video: images is empty. For first-and-last-frame or multi-image reference, list the images instead.
curl -X POST https://octopusx.ai/v1/video/create \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "model": "grok-video-3",
    "prompt": "cat fish --mode=custom",
    "images": [],
    "aspect_ratio": "3:2",
    "size": "1080P",
    "duration": 10
  }'
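
For callers who prefer Python over curl, the same request might look like the sketch below. It uses only the standard library; the function names and the 30-second timeout are illustrative assumptions, while the URL, headers, and payload are copied from the curl example above.

```python
import json
import urllib.request

API_URL = "https://octopusx.ai/v1/video/create"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for /v1/video/create."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def create_video(api_key: str, payload: dict, timeout_s: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_create_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Same body as the curl example: text-to-video with an empty images array.
payload = {
    "model": "grok-video-3",
    "prompt": "cat fish --mode=custom",
    "images": [],
    "aspect_ratio": "3:2",
    "size": "1080P",
    "duration": 10,
}
```

Calling create_video("YOUR_API_KEY", payload) would submit the task; urllib raises urllib.error.HTTPError on non-2xx responses, so real code would likely wrap the call in error handling.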

Response Example

{
  "id": "grok:48a67431-0708-46d1-9ab9-83cb84700153",
  "status": "processing",
  "status_update_time": 1762780400,
  "task_id": "48038932-0ff5-4251-8b4b-7a76c09fd114",
  "created_at": "2025-11-08T23:07:57.510141923+08:00"
}
The actual response fields may vary slightly by channel. Use the id or task_id from the response as the identifier for subsequent queries.
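
Because a response may carry id, task_id, or both, client code can pick the identifier defensively. The helper below is a hypothetical sketch; preferring id over task_id follows the field descriptions on this page, but check the behavior of your channel.

```python
def pick_query_id(response: dict) -> str:
    """Return id when present and non-empty, otherwise fall back to task_id."""
    for key in ("id", "task_id"):
        value = response.get(key)
        if isinstance(value, str) and value:
            return value
    raise KeyError("response contains neither id nor task_id")
```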

Authentication

Authorization: Bearer YOUR_API_KEY

Body

model (string, required)
Model name, for example grok-video-3.

prompt (string, required)
Prompt text. For multi-image reference, the text can contain placeholders such as @img1 and @img2, which correspond to the entries of the images array in order.

images (array<string>, required)
List of reference images, where each element is a URL or a base64 data URI. For text-to-video, pass []. First-and-last-frame input is usually 2 images, in order. Multi-image reference supports up to 6 images.

aspect_ratio (string, required)
Video aspect ratio. Allowed values are 16:9, 9:16, 2:3, 3:2, and 1:1.

size (string, required)
Output resolution: 720P or 1080P.

duration (integer, optional)
Video duration in seconds. Supported values are 6, 10, and 15; the default is 10.
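
The constraints above can be checked on the client before submitting. The following is a minimal sketch, not the server's validation logic: the function name is hypothetical, and it assumes that @img1 refers to the first element of images, @img2 to the second, and so on.

```python
import re

ASPECT_RATIOS = {"16:9", "9:16", "2:3", "3:2", "1:1"}
SIZES = {"720P", "1080P"}
DURATIONS = {6, 10, 15}
MAX_IMAGES = 6
REQUIRED_FIELDS = ("model", "prompt", "images", "aspect_ratio", "size")


def validate_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in payload:
            problems.append(f"missing required field: {field}")

    images = payload.get("images", [])
    if not isinstance(images, list):
        problems.append("images must be an array")
        image_count = 0
    else:
        image_count = len(images)
        if image_count > MAX_IMAGES:
            problems.append(f"at most {MAX_IMAGES} images are supported")

    if "aspect_ratio" in payload and payload["aspect_ratio"] not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {payload['aspect_ratio']}")
    if "size" in payload and payload["size"] not in SIZES:
        problems.append(f"unsupported size: {payload['size']}")
    # duration is optional; the documented default is 10.
    if payload.get("duration", 10) not in DURATIONS:
        problems.append(f"unsupported duration: {payload['duration']}")

    # Assumption: placeholders are 1-based, so @img1 maps to images[0].
    for match in re.finditer(r"@img(\d+)", str(payload.get("prompt", ""))):
        index = int(match.group(1))
        if index < 1 or index > image_count:
            problems.append(f"@img{index} has no matching entry in images")
    return problems
```

Returning a list of messages rather than raising on the first error lets a caller report every problem in one pass.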

Response

id (string)
Task ID, used as the id parameter when querying. Some responses may return only task_id.

task_id (string)
Upstream task ID. It may be returned alongside id, depending on the actual response.

status (string)
Task status. Common values include processing, completed, and failed.

status_update_time (integer)
Time of the most recent status update (Unix timestamp).

created_at (string)
Creation time. In some responses this is an RFC 3339 string.
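
The two time fields use different encodings, so a client may want to normalize them. The sketch below is illustrative only: the function names are hypothetical, it assumes status_update_time is in seconds (as the 10-digit example suggests), and it trims the nanosecond fraction seen in the created_at example because Python's datetime keeps at most microseconds.

```python
import re
from datetime import datetime, timezone


def parse_created_at(value: str) -> datetime:
    """Parse an RFC 3339 string, dropping fractional digits beyond microseconds."""
    # Keep the first 6 fractional digits and discard the rest (e.g. nanoseconds).
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", value)
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    if trimmed.endswith("Z"):
        trimmed = trimmed[:-1] + "+00:00"
    return datetime.fromisoformat(trimmed)


def parse_status_update_time(value: int) -> datetime:
    """Convert a Unix timestamp (assumed to be in seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(value, tz=timezone.utc)
```

Since the page notes that created_at is an RFC 3339 string only in some responses, real code would probably guard the call with a type check or a try/except around ValueError.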