Wan2.5-Preview
Wan2.5-Preview is Alibaba’s multimodal video model, currently in research preview. It supports text-to-video and image-to-video generation with native audio for clips of around 10 seconds. It offers strong prompt adherence, smooth motion, and multilingual audio for narrative scenes.
API Options
Platform-level options for task execution and delivery.
-
taskType
string required value: videoInference -
Identifier for the type of task being performed
-
taskUUID
string required UUID v4 -
UUID v4 identifier for tracking tasks and matching async responses. Must be unique per task.
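As a minimal sketch of the uniqueness requirement, the Python standard library can mint a fresh UUID v4 for every task. The helper name `new_task` is illustrative and not part of the API.

```python
import uuid


def new_task(task_type: str = "videoInference") -> dict:
    """Return a task skeleton carrying a freshly generated UUID v4."""
    return {"taskType": task_type, "taskUUID": str(uuid.uuid4())}


# Two tasks built back to back get distinct identifiers.
task_a = new_task()
task_b = new_task()
print(task_a["taskUUID"], task_b["taskUUID"])
```

Generating the identifier at the moment the task is built, rather than reusing a stored value, is one simple way to keep it unique per task.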
-
outputType
string default: URL -
Video output type.
Allowed values: `URL`.
-
outputFormat
string default: MP4 -
Specifies the file format of the generated output. The available values depend on the task type and the specific model's capabilities.
- `MP4`: Widely supported video container (H.264), recommended for general use.
- `WEBM`: Optimized for web delivery.
- `MOV`: QuickTime format, common in professional workflows (Apple ecosystem).
Allowed values: `MP4`, `WEBM`, `MOV`.
-
outputQuality
integer min: 20 max: 99 default: 95 -
Compression quality of the output. Higher values preserve quality but increase file size.
-
webhookURL
string URI -
Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.
Learn more:
- Webhooks (Platform)
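A receiving endpoint might process each POST body along these lines. This is a sketch only: it assumes the body is a single JSON object shaped like the response examples at the end of this page (`taskUUID`, `videoURL`), which is not confirmed here, and `collect_webhook_result` is a hypothetical helper name.

```python
import json
from collections import defaultdict

# Results gathered so far, keyed by the taskUUID of the originating request.
results_by_task = defaultdict(list)


def collect_webhook_result(body: bytes) -> str:
    """Parse one webhook POST body and file its video URL under its task.

    Assumes (not confirmed by the docs above) that the body is one JSON
    object containing at least "taskUUID" and "videoURL".
    """
    payload = json.loads(body)
    task_uuid = payload["taskUUID"]
    results_by_task[task_uuid].append(payload["videoURL"])
    return task_uuid


# With several results per request, each finished item arrives in its own call.
collect_webhook_result(b'{"taskUUID": "t-1", "videoURL": "https://example.com/a.mp4"}')
collect_webhook_result(b'{"taskUUID": "t-1", "videoURL": "https://example.com/b.mp4"}')
```

Accumulating by `taskUUID` reflects the point that batch results are delivered one call at a time rather than as a single combined payload.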
-
deliveryMethod
string default: async -
Determines how the API delivers task results.
Allowed values:
- `async`: Returns an immediate acknowledgment with the task UUID. Poll for results using `getResponse`. Required for long-running tasks like video generation.
Learn more:
- Task Polling (Platform)
-
uploadEndpoint
string URI -
Specifies a URL where the generated content will be automatically uploaded using the HTTP PUT method. The raw binary data of the media file is sent directly as the request body. For secure uploads to cloud storage, use presigned URLs that include temporary authentication credentials.
Common use cases:
- Cloud storage: Upload directly to S3 buckets, Google Cloud Storage, or Azure Blob Storage using presigned URLs.
- CDN integration: Upload to content delivery networks for immediate distribution.
// S3 presigned URL for secure upload
https://your-bucket.s3.amazonaws.com/generated/content.mp4?X-Amz-Signature=abc123&X-Amz-Expires=3600
// Google Cloud Storage presigned URL
https://storage.googleapis.com/your-bucket/content.jpg?X-Goog-Signature=xyz789
// Custom storage endpoint
https://storage.example.com/uploads/generated-image.jpg
The content data will be sent as the request body to the specified URL when generation is complete.
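To check that an endpoint accepts this kind of upload before handing it to the API, an equivalent request can be built by hand. The sketch below only constructs the request and does not send it. The URL and the placeholder bytes are made up, and whether the platform sets any extra headers is not stated above.

```python
import urllib.request


def build_put_upload(url: str, media: bytes) -> urllib.request.Request:
    """Build an HTTP PUT whose body is the raw media bytes, nothing else."""
    return urllib.request.Request(url, data=media, method="PUT")


# Placeholder bytes stand in for a real video file.
req = build_put_upload(
    "https://storage.example.com/uploads/generated-video.mp4",
    b"\x00\x00\x00\x18ftypmp42",
)
print(req.get_method(), req.full_url, len(req.data))
# urllib.request.urlopen(req) would actually perform the upload.
```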
-
safety
object -
Content safety checking configuration for video generation.
Properties 2 properties
-
safety»checkContent
boolean default: false -
Enable or disable content safety checking. When enabled, defaults to `fast` mode.
-
safety»mode
string default: none -
Safety checking mode for video generation.
Allowed values 3 values
- Disables checking.
- Checks key frames.
- Checks all frames.
-
-
ttl
integer min: 60 -
Time-to-live (TTL) in seconds for generated content. Only applies when `outputType` is `URL`.
-
includeCost
boolean default: false -
Include task cost in the response.
-
numberResults
integer min: 1 max: 4 default: 1 -
Number of results to generate. Each result uses a different seed, producing variations of the same parameters.
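The documented ranges and defaults can be checked on the client before a request is sent. The sketch below covers only the numeric limits and format values listed above. `check_options` is an illustrative helper, not an API function, and the server remains the authority on what is accepted.

```python
ALLOWED_OUTPUT_FORMATS = {"MP4", "WEBM", "MOV"}


def check_options(options: dict) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    fmt = options.get("outputFormat", "MP4")
    if fmt not in ALLOWED_OUTPUT_FORMATS:
        problems.append(f"outputFormat must be one of {sorted(ALLOWED_OUTPUT_FORMATS)}")
    quality = options.get("outputQuality", 95)
    if not 20 <= quality <= 99:
        problems.append("outputQuality must be between 20 and 99")
    if "ttl" in options and options["ttl"] < 60:
        problems.append("ttl must be at least 60 seconds")
    count = options.get("numberResults", 1)
    if not 1 <= count <= 4:
        problems.append("numberResults must be between 1 and 4")
    return problems


ok = check_options({"outputFormat": "WEBM", "outputQuality": 90, "ttl": 3600, "numberResults": 2})
bad = check_options({"outputQuality": 100, "ttl": 30, "numberResults": 5})
print(ok, bad)
```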
Inputs
Input resources for the task (images, audio, etc.). These must be nested inside the `inputs` object.
-
inputs»frameImages
array of strings or objects items: 1 -
An array of frame-specific image inputs to guide video generation. Each item can be either a plain image input (UUID, URL, Data URI, or Base64) or an object that pairs an image with a target frame position.
The `frameImages` parameter allows you to constrain specific frames within the video sequence, ensuring that particular visual content appears at designated points. This is different from `referenceImages`, which provide overall visual guidance without constraining specific timeline positions.
When the `frame` parameter is omitted, automatic distribution rules apply:
- 1 image: Used as the first frame.
Examples
Shorthand format: When you don't need to specify a frame position, you can pass a plain image input directly.
"frameImages": [ "aac49721-1964-481a-ae78-8a4e29b91402" ]
Object format: When you need to specify a frame position, use an object with `image` and `frame`.
"frameImages": [ { "image": "aac49721-1964-481a-ae78-8a4e29b91402", "frame": "first" } ]
Format 1: string[]
-
Image input (UUID, URL, Data URI, or Base64).
Format 2: object[] 2 properties
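The two formats can be reconciled on the client by expanding the shorthand into the object form. The sketch below reads `items: 1` as "exactly one entry", which is an interpretation rather than something stated outright. It applies the single-image rule above (an omitted `frame` means the first frame), and `normalize_frame_images` is a hypothetical helper name.

```python
def normalize_frame_images(frame_images: list) -> list:
    """Expand shorthand strings into {"image", "frame"} objects."""
    if len(frame_images) != 1:
        # Assumption: "items: 1" means this model takes exactly one frame image.
        raise ValueError("expected exactly one frame image")
    item = frame_images[0]
    if isinstance(item, str):
        # Shorthand: a plain image input (UUID, URL, Data URI, or Base64).
        return [{"image": item, "frame": "first"}]
    # Object format: keep the given frame, defaulting to the first frame.
    return [{"image": item["image"], "frame": item.get("frame", "first")}]


shorthand = normalize_frame_images(["aac49721-1964-481a-ae78-8a4e29b91402"])
explicit = normalize_frame_images(
    [{"image": "aac49721-1964-481a-ae78-8a4e29b91402", "frame": "first"}]
)
print(shorthand == explicit)
```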
-
inputs»audio
string -
Audio input (UUID or URL).
Generation Parameters
Core parameters for controlling the generated content.
-
model
string required value: runware:201@1 -
Identifier of the model to use for generation.
-
positivePrompt
string required min: 1 max: 2000 -
Text prompt describing elements to include in the generated output.
-
negativePrompt
string min: 1 max: 500 -
Prompt to guide what to exclude from generation. Ignored when guidance is disabled (CFGScale ≤ 1).
-
width
integer paired with height -
Width of the generated media in pixels.
-
height
integer paired with width -
Height of the generated media in pixels.
-
resolution
string -
Resolution preset for the output. When used with input media, automatically matches the aspect ratio from the input.
Allowed values 3 values
-
duration
float default: 5 -
Length of the generated video in seconds. The total number of frames produced is determined by duration multiplied by the model's frame rate (fps).
Allowed values 2 values
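The frame-count relationship is simple arithmetic. The model's frame rate is not given on this page, so the fps values below are purely illustrative inputs, not properties of Wan2.5-Preview.

```python
def frame_count(duration_seconds: float, fps: float) -> int:
    """Total frames = duration multiplied by frame rate, rounded to a whole frame."""
    return round(duration_seconds * fps)


# Illustrative frame rates only; substitute the model's actual fps.
print(frame_count(5, 24))
print(frame_count(10, 30))
```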
-
seed
integer min: 0 max: 2147483647 -
Random seed for reproducible generation. When not provided, a random seed is generated in the unsigned 32-bit range.
Provider Settings
Parameters specific to this model provider. These must be nested inside the `providerSettings.alibaba` object.
-
providerSettings»alibaba»audio
boolean default: true -
Generate native audio aligned with visual content.
-
providerSettings»alibaba»promptExtend
boolean default: true -
Enable LLM-based prompt rewriting to expand and clarify inputs. Affects reproducibility.
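Putting the nesting rules together, a request body could be assembled as in the sketch below. The helper `build_request` and its keyword names are illustrative. The 1280x720 size is borrowed from the examples on this page rather than being a documented default, and the limits checked are the ones listed above.

```python
import json
import uuid
from typing import Optional


def build_request(
    prompt: str,
    *,
    width: int = 1280,   # taken from the examples, not a documented default
    height: int = 720,
    duration: float = 5,
    seed: Optional[int] = None,
    first_frame: Optional[str] = None,
    audio: bool = True,
    prompt_extend: bool = True,
) -> dict:
    """Assemble a videoInference task with provider settings nested correctly."""
    if not 1 <= len(prompt) <= 2000:
        raise ValueError("positivePrompt must be 1 to 2000 characters")
    request = {
        "taskType": "videoInference",
        "taskUUID": str(uuid.uuid4()),
        "model": "runware:201@1",
        "positivePrompt": prompt,
        "width": width,
        "height": height,
        "duration": duration,
        # Provider-specific flags live under providerSettings.alibaba.
        "providerSettings": {"alibaba": {"audio": audio, "promptExtend": prompt_extend}},
    }
    if seed is not None:
        if not 0 <= seed <= 2147483647:
            raise ValueError("seed must be between 0 and 2147483647")
        request["seed"] = seed
    if first_frame is not None:
        # Image inputs live under the inputs object.
        request["inputs"] = {"frameImages": [{"image": first_frame, "frame": "first"}]}
    return request


req = build_request(
    "A lantern-lit alley in the rain",
    duration=10,
    seed=81567,
    first_frame="https://example.com/first-frame.jpg",
)
print(json.dumps(req, indent=2))
```

Since `promptExtend` affects reproducibility, passing `prompt_extend=False` together with a fixed `seed` is the more conservative choice when repeatable output matters.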
Lantern Market Rainy Alley
{
"taskType": "videoInference",
"taskUUID": "ba5cf08d-a52b-44ca-bb54-da1f3d835108",
"model": "runware:201@1",
"positivePrompt": "Animate this first-frame image into a cinematic rainy night market sequence. The camera slowly pushes forward down the lantern-lit alley as rain ripples through puddles and reflections shimmer across wet cobblestones. Steam curls from the tea stall, hanging lanterns sway gently, fabric awnings flutter, and distant shoppers pass through the background with natural parallax. The woman with the transparent umbrella subtly shifts her posture and turns her head slightly as if noticing something ahead. A scooter light flickers softly, neon sign glow blooms in the mist, and the scene feels intimate, nostalgic, and alive. Realistic motion, strong depth, smooth transitions, detailed environmental movement, immersive street ambience with rain, soft chatter, porcelain cups, scooter hum, and market sounds. No subtitles, no text overlays, no logo.",
"negativePrompt": "blurry motion, distorted anatomy, extra limbs, flicker, abrupt camera shake, duplicated people, warped umbrellas, melting objects, low detail, overexposed highlights, text, watermark, subtitles",
"width": 1280,
"height": 720,
"duration": 10,
"seed": 81567,
"providerSettings": {
"alibaba": {
"promptExtend": true,
"audio": true
}
},
"inputs": {
"frameImages": [
{
"image": "https://assets.runware.ai/assets/inputs/77c1bf52-2488-4511-b486-60431656fda3.jpg",
"frame": "first"
}
]
}
}
{
"taskType": "videoInference",
"taskUUID": "ba5cf08d-a52b-44ca-bb54-da1f3d835108",
"videoUUID": "9899db5a-2d3a-4f70-86c2-7b002eeaa54e",
"videoURL": "https://vm.runware.ai/video/os/a23d05/ws/5/vi/9899db5a-2d3a-4f70-86c2-7b002eeaa54e.mp4",
"seed": 81567,
"cost": 1.3573
}
Moonlit Desert Observatory Ritual
{
"taskType": "videoInference",
"taskUUID": "be8fd83d-7116-4984-b37e-391df5ee434e",
"model": "runware:201@1",
"positivePrompt": "A cinematic wide shot of an ancient stone observatory in a silver-blue desert at midnight, ringed by softly glowing brass instruments and drifting lantern kites. A young astronomer in indigo robes stands on the central platform, tracing constellations in the air with a luminous astrolabe wand. As she moves, thin threads of light connect stars above to engraved maps below, and a circular mechanism slowly rotates under her feet. Fine desert dust lifts in gentle spirals, fabric ripples in the wind, candle flames flicker, and the camera performs a slow graceful dolly-in with subtle parallax. The sky is packed with bright stars, a large crescent moon, and a faint comet crossing the horizon. Native audio: soft desert wind, distant metal chimes, subtle stone mechanism hum, quiet footstep movement, and a hushed female voice in Arabic delivering a brief poetic line about reading the sky. Highly detailed, magical realism, elegant lighting, smooth natural motion, premium cinematic color grading.",
"negativePrompt": "low detail, jittery motion, distorted anatomy, extra limbs, duplicated objects, blurry face, text overlays, watermark, logo, harsh cuts, overexposed highlights, noisy audio",
"width": 1280,
"height": 720,
"duration": 10,
"seed": 43365,
"providerSettings": {
"alibaba": {
"promptExtend": true,
"audio": true
}
}
}
{
"taskType": "videoInference",
"taskUUID": "be8fd83d-7116-4984-b37e-391df5ee434e",
"videoUUID": "d2d10db4-4419-4b8b-ad9c-5179d091651a",
"videoURL": "https://vm.runware.ai/video/os/a20d05/ws/5/vi/d2d10db4-4419-4b8b-ad9c-5179d091651a.mp4",
"seed": 43365,
"cost": 0.9049
}
Lantern Market Monsoon Night
{
"taskType": "videoInference",
"taskUUID": "abc21c3b-dd34-4269-82cc-df02c9760d99",
"model": "runware:201@1",
"positivePrompt": "A cinematic night market during a warm monsoon rain in a historic riverside alley, glowing red and gold paper lanterns reflected in puddles, steam rising from street-food stalls, silk banners fluttering, shoppers moving under clear umbrellas, a tea seller in the foreground pouring from a long-spouted kettle in slow elegant arcs, camera begins with a low close-up on raindrops splashing into a puddle then gently dollies forward through the crowd toward a small covered stage where a young woman tells a short story in Mandarin with calm expressive narration, natural lip sync, surrounding ambience of rainfall, footsteps, distant laughter, sizzling noodles, clinking bowls, soft market chatter, occasional bicycle bell, richly textured cinematic lighting, realistic human motion, shallow depth of field, subtle handheld feel, high detail, immersive atmosphere, coherent action from start to finish",
"negativePrompt": "blurry faces, distorted hands, frozen crowd, extra limbs, text overlays, subtitles, watermark, logo, flicker, stutter, low detail, overexposed highlights, unrealistic rain, duplicated people, broken anatomy, camera shake",
"width": 1280,
"height": 720,
"duration": 10,
"seed": 63746,
"providerSettings": {
"alibaba": {
"promptExtend": true,
"audio": true
}
}
}
{
"taskType": "videoInference",
"taskUUID": "abc21c3b-dd34-4269-82cc-df02c9760d99",
"videoUUID": "e38b8c72-98b7-4904-ac66-d4198b5a008c",
"videoURL": "https://vm.runware.ai/video/os/a07d11/ws/5/vi/e38b8c72-98b7-4904-ac66-d4198b5a008c.mp4",
"seed": 63746,
"cost": 0.9049
}
Lantern Market Rainy Midnight
{
"taskType": "videoInference",
"taskUUID": "0d1a7cb5-8bfc-4049-97fe-8d2f73d41474",
"model": "runware:201@1",
"positivePrompt": "A cinematic rainy midnight street market in an old riverside city, glowing red and amber paper lanterns strung overhead, wet cobblestones reflecting neon calligraphy signs, steam rising from noodle stalls, fabric awnings fluttering in the wind, a young woman in a jade-green coat walking slowly through the crowd holding a transparent umbrella, vendors handing bowls across counters, a child reaching for a sugar sculpture, bicycles passing in the background, a riverboat drifting beyond the alley opening. Camera begins with a low puddle reflection shot, then gently tracks beside the woman at eye level, ending on a wider reveal of the lantern-filled market. Realistic motion, atmospheric depth, subtle rack focus, natural crowd behavior, rain droplets on lens, richly textured clothing, high-detail reflections. Native ambient audio: soft rainfall, market chatter in Mandarin, sizzling woks, distant boat horn, footsteps splashing in puddles, no music.",
"negativePrompt": "blurry faces, frozen motion, extra limbs, flickering lanterns, distorted hands, oversaturated colors, low detail, text artifacts, subtitles, logo, watermark, abrupt cuts",
"width": 1280,
"height": 720,
"duration": 10,
"seed": 38581,
"providerSettings": {
"alibaba": {
"promptExtend": true,
"audio": true
}
}
}
{
"taskType": "videoInference",
"taskUUID": "0d1a7cb5-8bfc-4049-97fe-8d2f73d41474",
"videoUUID": "e249930b-a5c1-42ae-8feb-e9db65264650",
"videoURL": "https://vm.runware.ai/video/os/a08d21/ws/5/vi/e249930b-a5c1-42ae-8feb-e9db65264650.mp4",
"seed": 38581,
"cost": 0.9049
}