PixVerse V5.5
PixVerse V5.5 is a director-focused video model for story-driven clips. It supports multi-image fusion for character continuity, multi-shot sequences, and native audio. It delivers smooth motion, refined cinematic control, and precise text-guided video generation for complex scenes.
API Options
Platform-level options for task execution and delivery.
-
taskType
string required value: videoInference -
Identifier for the type of task being performed
-
taskUUID
string required UUID v4 -
UUID v4 identifier for tracking tasks and matching async responses. Must be unique per task.
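As a small illustration of the two required fields above, the sketch below builds a task skeleton with a fresh UUID v4 for each task. The helper name `new_task` is hypothetical and not part of the API.

```python
import uuid


def new_task() -> dict:
    """Return a task skeleton carrying the two required platform fields."""
    return {
        "taskType": "videoInference",   # fixed value, per the description above
        "taskUUID": str(uuid.uuid4()),  # random UUID v4, unique per task
    }


first, second = new_task(), new_task()
```

Generating the UUID at task creation time, rather than reusing one, keeps async responses unambiguous when several tasks are in flight.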
-
outputType
string default: URL -
Video output type.
Allowed values 1 value
-
outputFormat
string default: MP4 -
Specifies the file format of the generated output. The available values depend on the task type and the specific model's capabilities.
- `MP4`: Widely supported video container (H.264), recommended for general use.
- `WEBM`: Optimized for web delivery.
- `MOV`: QuickTime format, common in professional workflows (Apple ecosystem).
Allowed values 3 values
-
outputQuality
integer min: 20 max: 99 default: 95 -
Compression quality of the output. Higher values preserve quality but increase file size.
-
webhookURL
string URI -
Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.
Learn more 1 resource
- Webhooks PLATFORM
-
deliveryMethod
string default: async -
Determines how the API delivers task results.
Allowed values 1 value
- `async`: Returns an immediate acknowledgment with the task UUID. Poll for results using `getResponse`. Required for long-running tasks like video generation.
Learn more 1 resource
- Task Polling PLATFORM
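The polling flow described above could be sketched as follows. This is a minimal illustration under several assumptions: `send` is a hypothetical callable that transmits one task and returns a list of response objects, the `getResponse` request is assumed to carry only `taskType` and `taskUUID`, and a result is treated as complete once it contains a `videoURL` (as in the example responses on this page).

```python
import time
from typing import Callable, Optional


def poll_for_video(
    send: Callable[[dict], list],
    task_uuid: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Optional[dict]:
    """Poll until a result with a videoURL appears, or give up after max_attempts."""
    # Assumed request shape; check the Task Polling documentation for the real one.
    request = {"taskType": "getResponse", "taskUUID": task_uuid}
    for _ in range(max_attempts):
        for item in send(request):
            if item.get("taskUUID") == task_uuid and "videoURL" in item:
                return item
        time.sleep(interval_s)
    return None
```

Passing the transport in as a callable keeps the loop independent of whether requests go over HTTP or another channel, and makes it easy to exercise with a stub.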
-
uploadEndpoint
string URI -
Specifies a URL where the generated content will be automatically uploaded using the HTTP PUT method. The raw binary data of the media file is sent directly as the request body. For secure uploads to cloud storage, use presigned URLs that include temporary authentication credentials.
Common use cases:
- Cloud storage: Upload directly to S3 buckets, Google Cloud Storage, or Azure Blob Storage using presigned URLs.
- CDN integration: Upload to content delivery networks for immediate distribution.
// S3 presigned URL for secure upload
https://your-bucket.s3.amazonaws.com/generated/content.mp4?X-Amz-Signature=abc123&X-Amz-Expires=3600
// Google Cloud Storage presigned URL
https://storage.googleapis.com/your-bucket/content.jpg?X-Goog-Signature=xyz789
// Custom storage endpoint
https://storage.example.com/uploads/generated-image.jpg
The content data will be sent as the request body to the specified URL when generation is complete.
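To check that a storage endpoint accepts the kind of request described above, an equivalent PUT can be built locally. The sketch below uses only Python's standard library. The URL is a placeholder and the `Content-Type` header is an assumption, since this page does not state which headers the platform sends.

```python
import urllib.request


def build_put_upload(url: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a PUT request with raw bytes as the body."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "video/mp4"},  # assumed header, for illustration
    )


req = build_put_upload("https://storage.example.com/uploads/test.mp4", b"\x00\x01")
# urllib.request.urlopen(req) would actually send it; omitted in this sketch.
```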
-
safety
object -
Content safety checking configuration for video generation.
Properties 2 properties
-
safety»checkContent
boolean default: false -
Enable or disable content safety checking. When enabled, defaults to `fast` mode.
-
safety»mode
string default: none -
Safety checking mode for video generation.
Allowed values 3 values
- Disables checking.
- Checks key frames.
- Checks all frames.
-
-
ttl
integer min: 60 -
Time-to-live (TTL) in seconds for generated content. Only applies when `outputType` is `URL`.
-
includeCost
boolean default: false -
Include task cost in the response.
-
numberResults
integer min: 1 max: 4 default: 1 -
Number of results to generate. Each result uses a different seed, producing variations of the same parameters.
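The numeric limits listed in this section can be checked on the client before a task is submitted. The following is a minimal sketch, not the platform's own validation: the function name `validate_api_options` and the message wording are hypothetical, and only the ranges stated above are enforced.

```python
def validate_api_options(options: dict) -> list:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    quality = options.get("outputQuality", 95)  # documented default: 95
    if not 20 <= quality <= 99:
        problems.append("outputQuality must be between 20 and 99")

    results = options.get("numberResults", 1)  # documented default: 1
    if not 1 <= results <= 4:
        problems.append("numberResults must be between 1 and 4")

    ttl = options.get("ttl")
    if ttl is not None:
        if ttl < 60:
            problems.append("ttl must be at least 60 seconds")
        if options.get("outputType", "URL") != "URL":
            problems.append("ttl only applies when outputType is URL")

    return problems
```

For example, `validate_api_options({"outputQuality": 10})` reports the quality range violation, while an empty options object passes because every checked field falls back to its documented default.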
Inputs
Input resources for the task (images, audio, etc.). These must be nested inside the `inputs` object.
-
inputs»frameImages
array of strings or objects min items: 1 max items: 2 -
An array of frame-specific image inputs to guide video generation. Each item can be either a plain image input (UUID, URL, Data URI, or Base64) or an object that pairs an image with a target frame position.
The `frameImages` parameter allows you to constrain specific frames within the video sequence, ensuring that particular visual content appears at designated points. This is different from `referenceImages`, which provide overall visual guidance without constraining specific timeline positions.
When the `frame` parameter is omitted, automatic distribution rules apply:
- 1 image: Used as the first frame.
- 2 images: First and last frames.
Examples 3 examples
Shorthand format: When you don't need to specify a frame position, you can pass a plain image input directly.
"frameImages": [ "aac49721-1964-481a-ae78-8a4e29b91402" ]
Object format: When you need to specify a frame position, use an object with `image` and `frame`.
"frameImages": [ { "image": "aac49721-1964-481a-ae78-8a4e29b91402", "frame": "first" } ]
First and last frames: With two images, they automatically become the first and last frames of the video sequence. You can mix shorthand and object formats.
"frameImages": [ "aac49721-1964-481a-ae78-8a4e29b91402", { "image": "3ad204c3-a9de-4963-8a1a-c3911e3afafe", "frame": "last" } ]
Format 1: string[]
-
Image input (UUID, URL, Data URI, or Base64).
Format 2: object[] 2 properties
-
inputs»frameImages»image
string required -
Image input (UUID, URL, Data URI, or Base64).
-
inputs»frameImages»frame
object -
Target frame position for the image. Supports first and last frame.
Allowed values 4 values
- `first`: First frame of the video.
- `last`: Last frame of the video.
- `0`: Frame index 0 (first frame).
- `-1`: Frame index -1 (last frame).
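The distribution rules and the two item formats described for `frameImages` can be mirrored on the client to preview where each image will land. This is a sketch of that logic as documented here, not the server's implementation. The function name `resolve_frames` is hypothetical, and falling back to the positional default for an object that omits `frame` is an assumption.

```python
def resolve_frames(frame_images: list) -> list:
    """Return (image, position) pairs, where position is 'first' or 'last'."""
    if not 1 <= len(frame_images) <= 2:
        raise ValueError("frameImages accepts 1 or 2 items")

    defaults = ["first", "last"]  # automatic distribution by list position
    aliases = {"first": "first", "last": "last", 0: "first", -1: "last"}

    resolved = []
    for index, item in enumerate(frame_images):
        if isinstance(item, str):
            # Shorthand format: plain image input, position assigned automatically.
            resolved.append((item, defaults[index]))
        else:
            # Object format: explicit frame if given, else the positional default.
            frame = item.get("frame")
            position = defaults[index] if frame is None else aliases[frame]
            resolved.append((item["image"], position))
    return resolved
```

For instance, `resolve_frames(["a", {"image": "b", "frame": -1}])` yields `[("a", "first"), ("b", "last")]`, matching the mixed-format example above.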
Generation Parameters
Core parameters for controlling the generated content.
-
model
string required value: pixverse:1@6 -
Identifier of the model to use for generation.
Learn more 3 resources
-
positivePrompt
string required min: 2 max: 2048 -
Text prompt describing elements to include in the generated output.
Learn more 2 resources
-
negativePrompt
string min: 2 max: 2048 -
Prompt to guide what to exclude from generation. Ignored when guidance is disabled (CFGScale ≤ 1).
Learn more 1 resource
-
width
Width of the generated media in pixels.
Learn more 2 resources
-
height
Height of the generated media in pixels.
Learn more 2 resources
-
resolution
string default: 720p -
Resolution preset for the output. When used with input media, automatically matches the aspect ratio from the input.
Allowed values 4 values
-
duration
float -
Length of the generated video in seconds. The total number of frames produced is determined by duration multiplied by the model's frame rate (fps).
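The relationship between duration and frame count can be worked through in a few lines. The frame rate used below (24 fps) is purely an assumed value for illustration. This page does not state the model's actual fps.

```python
def frame_count(duration_s: float, fps: float) -> int:
    """Total frames = duration in seconds multiplied by frames per second."""
    return round(duration_s * fps)


frames = frame_count(8, 24)  # 8 s at an assumed 24 fps gives 192 frames
```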
-
seed
integer min: 0 max: 2147483647 -
Random seed for reproducible generation. When not provided, a random seed is generated in the unsigned 32-bit range.
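To make a run reproducible from the start, a seed can be chosen on the client and stored alongside the request, rather than left to the platform. The sketch below draws one within the documented `min`/`max` bounds. The helper name `pick_seed` is hypothetical.

```python
import random
from typing import Optional

MAX_SEED = 2147483647  # documented maximum for the seed parameter


def pick_seed(rng: Optional[random.Random] = None) -> int:
    """Draw a seed in the accepted range [0, MAX_SEED]."""
    if rng is None:
        rng = random.Random()
    return rng.randint(0, MAX_SEED)


seed = pick_seed()
```

Resubmitting the same parameters with the stored seed is what makes a later regeneration comparable to the original.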
Provider Settings
Parameters specific to this model provider. These must be nested inside the `providerSettings.pixverse` object.
-
providerSettings»pixverse»audio
boolean default: false -
Enable audio generation.
-
providerSettings»pixverse»multiClip
boolean default: false -
Enable multi-shot generation with varying camera angles.
-
providerSettings»pixverse»style
string -
Artistic style aesthetic for video generation.
Allowed values 5 values
- Japanese animation aesthetic.
- Three-dimensional animated style with depth.
- Stop-motion clay animation appearance.
- Comic book or graphic novel visual style.
- Futuristic, neon-lit dystopian aesthetic.
-
providerSettings»pixverse»thinking
string default: auto -
Enhanced reasoning mode.
Allowed values 3 values
- Max understanding.
- Faster generation.
- Automatic.
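Because these settings must sit under `providerSettings.pixverse`, a small builder can keep the nesting consistent across requests. This is a sketch with a hypothetical helper name, `pixverse_settings`. It does not validate `style` or `thinking` values, and the values used in the example call (`cyberpunk`, `enabled`) are taken from the example requests on this page.

```python
from typing import Optional


def pixverse_settings(
    audio: bool = False,
    multi_clip: bool = False,
    style: Optional[str] = None,
    thinking: Optional[str] = None,
) -> dict:
    """Build the nested providerSettings object, omitting unset optional fields."""
    settings = {"audio": audio, "multiClip": multi_clip}
    if style is not None:
        settings["style"] = style
    if thinking is not None:
        settings["thinking"] = thinking
    return {"providerSettings": {"pixverse": settings}}


fragment = pixverse_settings(audio=True, multi_clip=True, style="cyberpunk", thinking="enabled")
```

The returned fragment can be merged into a task with `task.update(fragment)`.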
Neon Monsoon Rooftop Chase
{
"taskType": "videoInference",
"taskUUID": "86513e43-4646-43df-99d3-415bded9b0b5",
"model": "pixverse:1@6",
"positivePrompt": "A cinematic cyberpunk rooftop pursuit during a midnight monsoon in a vast neon megacity. A lone courier in a reflective crimson coat sprints across slick rooftops carrying a glowing data capsule while surveillance drones weave through steam and rain. Multi-shot sequence: wide establishing shot of towering holographic billboards and storm clouds, low tracking shot of boots splashing through puddles, side angle as the courier vaults a narrow alley gap, close-up of rain on determined eyes, dramatic over-shoulder shot revealing drones closing in, final heroic pause on the rooftop edge with the skyline blazing behind. Strong contrast, electric blue and magenta reflections, dense atmosphere, smooth character motion, realistic rain interaction, cinematic lighting, dramatic pacing, highly polished story-driven composition. Native audio: heavy rainfall, distant thunder, humming neon signs, drone rotors, wet footsteps, coat flapping in the wind, rising synth pulse, tense futuristic ambience.",
"negativePrompt": "blurry, low detail, jittery motion, warped anatomy, extra limbs, duplicate character, distorted face, flat lighting, washed out colors, text, watermark, logo, subtitles, frame glitches, static camera",
"width": 1280,
"height": 720,
"duration": 8,
"seed": 15605,
"providerSettings": {
"pixverse": {
"style": "cyberpunk",
"audio": true,
"multiClip": true,
"thinking": "enabled"
}
}
}
{
"taskType": "videoInference",
"taskUUID": "86513e43-4646-43df-99d3-415bded9b0b5",
"videoUUID": "0b3114d5-397c-45fe-88cc-0b7270e07d7a",
"videoURL": "https://vm.runware.ai/video/os/a15d18/ws/5/vi/0b3114d5-397c-45fe-88cc-0b7270e07d7a.mp4",
"seed": 15605,
"cost": 0.4715
}
Neon Bazaar Chase Sequence
{
"taskType": "videoInference",
"taskUUID": "a9cd0a39-0e7d-4fca-9f8a-77ba79268056",
"model": "pixverse:1@6",
"positivePrompt": "A story-driven cyberpunk night market chase in the rain, multi-shot cinematic sequence. Shot 1: wide establishing view of a dense neon bazaar packed with holographic signs, steam, umbrellas, food stalls, reflective pavement, and a masked courier clutching a glowing glass capsule. Shot 2: medium tracking shot as the courier weaves through crowds while two sleek drones pursue overhead, sparks and mist drifting through saturated magenta and teal light. Shot 3: low-angle close shot of boots splashing through puddles, dropped fruit rolling across the ground, stalls rattling, dramatic parallax and fast motion. Shot 4: dynamic side view as the courier leaps across narrow vendor counters beneath hanging lantern cables. Shot 5: final rooftop reveal at dawn-blue horizon, the courier turns back breathless, city skyline blazing behind, capsule illuminating their face. Highly cinematic, smooth motion, strong continuity, realistic rain physics, atmospheric depth, expressive lighting, polished blockbuster pacing, immersive environmental sound and chase audio.",
"negativePrompt": "low detail, blurry subjects, broken anatomy, extra limbs, duplicated people, warped faces, jittery motion, flicker, smeared objects, flat lighting, empty background, text overlays, subtitles, logos, watermark",
"width": 1280,
"height": 720,
"duration": 8,
"seed": 47883,
"providerSettings": {
"pixverse": {
"style": "cyberpunk",
"audio": true,
"multiClip": true,
"thinking": "enabled"
}
}
}
{
"taskType": "videoInference",
"taskUUID": "a9cd0a39-0e7d-4fca-9f8a-77ba79268056",
"videoUUID": "cd92586a-1525-42f4-a14d-8a28cbe71892",
"videoURL": "https://vm.runware.ai/video/os/a16d07/ws/5/vi/cd92586a-1525-42f4-a14d-8a28cbe71892.mp4",
"seed": 47883,
"cost": 0.4715
}
Storm-Bound Lighthouse Passage
{
"taskType": "videoInference",
"taskUUID": "1f4eeb96-ee66-4b34-b094-34511cd1b3d3",
"model": "pixverse:1@6",
"positivePrompt": "A story-driven cinematic sequence beginning on jagged seaside rocks during a violent twilight storm and ending inside a lighthouse at dawn. Use the first frame image as the opening shot and the last frame image as the closing shot. Show the same woman in a yellow raincoat making her way along a narrow cliff path, waves crashing below, close-ups of boots splashing through puddles, hands gripping a rusted railing, cut to a low-angle shot of the lighthouse beam slicing through rain, interior spiral staircase lit by swinging emergency lamps, breathless ascent, then a quiet final reveal in the lantern room as the storm breaks and warm sunrise light floods in. Smooth camera motion, strong visual continuity, rich atmosphere, realistic cinematic lighting, dramatic soundscape with wind, thunder, ocean surf, footsteps, metal rattling, and a subtle emotional orchestral swell near the end.",
"negativePrompt": "extra characters, distorted anatomy, duplicate subject, low detail, blurry frames, flicker, jitter, broken continuity, text, watermark, logo, oversaturated colors, cartoonish faces, abrupt scene changes",
"width": 1280,
"height": 720,
"duration": 8,
"seed": 96626,
"providerSettings": {
"pixverse": {
"style": "3d_animation",
"audio": true,
"multiClip": true,
"thinking": "enabled"
}
},
"inputs": {
"frameImages": [
{
"image": "https://assets.runware.ai/assets/inputs/b856e6a2-8fa4-49b7-9731-6b157d94e9b9.jpg",
"frame": "first"
},
{
"image": "https://assets.runware.ai/assets/inputs/729c38a5-b922-4c5a-af4b-c386d9c4350b.jpg",
"frame": "last"
}
]
}
}
{
"taskType": "videoInference",
"taskUUID": "1f4eeb96-ee66-4b34-b094-34511cd1b3d3",
"videoUUID": "6a5ade1c-56cd-41da-a5bb-49f8ab00b403",
"videoURL": "https://vm.runware.ai/video/os/a23d05/ws/5/vi/6a5ade1c-56cd-41da-a5bb-49f8ab00b403.mp4",
"seed": 96626,
"cost": 0.4715
}
Lantern-Festival Rooftop Farewell
{
"taskType": "videoInference",
"taskUUID": "179fefbf-5352-479b-b8f4-d570a27dd3cb",
"model": "pixverse:1@6",
"positivePrompt": "Using the provided first frame as the opening shot, create a story-driven cinematic video of a quiet rooftop farewell during a lantern festival in an ancient river city. The woman stands still at first, then slowly steps toward the roof edge as hundreds of glowing lanterns rise into the night sky. Her coat and hair move naturally in the wind. The camera begins with an intimate close framing that matches the reference image, then expands into elegant multi-shot coverage: over-the-shoulder view of lanterns drifting upward, low angle with moonlit clouds, wide aerial reveal of the city reflecting gold light on the river, and a final lingering profile shot filled with emotion. Subtle acting, refined facial detail, smooth motion, atmospheric depth, cinematic lighting, volumetric haze, realistic fabric physics, poetic mood, delicate ambient festival audio with wind, distant bells, soft crowd murmur, and fluttering paper lantern flames.",
"negativePrompt": "blurry, low detail, distorted anatomy, extra fingers, duplicate person, flickering face, bad hands, warped architecture, text, watermark, logo, harsh cuts, jittery motion, oversaturated neon, horror elements, modern cars, daytime",
"width": 1280,
"height": 720,
"duration": 8,
"seed": 80065,
"providerSettings": {
"pixverse": {
"audio": true,
"multiClip": true,
"thinking": "enabled"
}
},
"inputs": {
"frameImages": [
{
"image": "https://assets.runware.ai/assets/inputs/3f9a713d-33c2-4776-90ad-2e293f3c59b2.jpg",
"frame": "first"
}
]
}
}
{
"taskType": "videoInference",
"taskUUID": "179fefbf-5352-479b-b8f4-d570a27dd3cb",
"videoUUID": "25d658cf-0719-4805-92b0-c904a258637a",
"videoURL": "https://vm.runware.ai/video/os/a20d05/ws/5/vi/25d658cf-0719-4805-92b0-c904a258637a.mp4",
"seed": 80065,
"cost": 0.4715
}
Moonlit Clockwork Carnival Procession
{
"taskType": "videoInference",
"taskUUID": "94759f56-09e3-42a2-a89d-3d288a8d2055",
"model": "pixverse:1@6",
"positivePrompt": "A surreal midnight carnival crossing a foggy old stone bridge above a black river, led by a towering clockwork elephant with glowing amber eyes and polished brass joints, followed by masked performers carrying lanterns shaped like moons, silk banners fluttering in the wind, sparks drifting from mechanical calliope wagons, reflections rippling in the water below. Multi-shot cinematic sequence: wide establishing shot of the moonlit bridge and ruined gothic city in the background, low angle tracking shot beside the elephant’s moving gears and stamping feet, medium shot of a young ringmaster in a deep teal coat lifting a silver baton as lantern light flickers across their face, overhead shot of the procession winding through blue fog, final close shot of a carousel horse automaton turning its head toward camera as distant fireworks bloom behind cathedral spires. Rich atmosphere, elegant camera movement, dramatic depth, volumetric moonlight, highly detailed textures, whimsical yet haunting mood, refined cinematic composition, immersive environmental sound with soft calliope music, mechanical creaks, footsteps on stone, river ambience, and faraway fireworks.",
"negativePrompt": "blurry, low detail, distorted anatomy, extra limbs, duplicate subjects, warped faces, unreadable objects, noisy image, jittery motion, oversaturated colors, flat lighting, text overlays, watermarks, logos",
"width": 1280,
"height": 720,
"duration": 8,
"seed": 44372,
"providerSettings": {
"pixverse": {
"style": "3d_animation",
"audio": true,
"multiClip": true,
"thinking": "enabled"
}
}
}
{
"taskType": "videoInference",
"taskUUID": "94759f56-09e3-42a2-a89d-3d288a8d2055",
"videoUUID": "8ea9fe34-b50a-4bc4-806a-5a38b8b0841d",
"videoURL": "https://vm.runware.ai/video/os/a15d18/ws/5/vi/8ea9fe34-b50a-4bc4-806a-5a38b8b0841d.mp4",
"seed": 44372,
"cost": 0.4715
}
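Responses shaped like the examples above can be collected into a simple summary of video URLs and total cost. The sketch below assumes each response is already parsed into a dictionary. The function name `summarize` is hypothetical, and `cost` is treated as optional because it is only included when `includeCost` is enabled.

```python
def summarize(responses: list) -> dict:
    """Map each taskUUID to its videoURL and add up any reported costs."""
    videos = {r["taskUUID"]: r["videoURL"] for r in responses if "videoURL" in r}
    total_cost = sum(r.get("cost", 0.0) for r in responses)
    return {"videos": videos, "totalCost": total_cost}


# Two of the example responses from this page, copied verbatim.
responses = [
    {
        "taskType": "videoInference",
        "taskUUID": "179fefbf-5352-479b-b8f4-d570a27dd3cb",
        "videoUUID": "25d658cf-0719-4805-92b0-c904a258637a",
        "videoURL": "https://vm.runware.ai/video/os/a20d05/ws/5/vi/25d658cf-0719-4805-92b0-c904a258637a.mp4",
        "seed": 80065,
        "cost": 0.4715,
    },
    {
        "taskType": "videoInference",
        "taskUUID": "94759f56-09e3-42a2-a89d-3d288a8d2055",
        "videoUUID": "8ea9fe34-b50a-4bc4-806a-5a38b8b0841d",
        "videoURL": "https://vm.runware.ai/video/os/a15d18/ws/5/vi/8ea9fe34-b50a-4bc4-806a-5a38b8b0841d.mp4",
        "seed": 44372,
        "cost": 0.4715,
    },
]

summary = summarize(responses)
```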