Kling VIDEO 3.0 Pro
Kling VIDEO 3.0 Pro is a unified multimodal video model that generates high-quality video with synchronized audio from text or images. It supports reference-guided generation, prompt-based editing, fine control over motion and pacing, and stable temporal coherence for cinematic and narrative clips. Native audio output includes dialogue, ambient sound, and effects aligned to the visuals.
API Options
Platform-level options for task execution and delivery.
-
taskType
string required value: videoInference -
Identifier for the type of task being performed.
-
taskUUID
string required UUID v4 -
UUID v4 identifier for tracking tasks and matching async responses. Must be unique per task.
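As a rough illustration, the required fields can be assembled like this in Python. The helper name `build_task` is hypothetical, `model` and `positivePrompt` are taken from the Generation Parameters section of this page, and transport and authentication are left out on purpose.

```python
import json
import uuid


def build_task(prompt: str, duration: int = 5) -> dict:
    """Assemble a videoInference task with a fresh UUID v4 (hypothetical helper)."""
    return {
        "taskType": "videoInference",
        "taskUUID": str(uuid.uuid4()),  # must be unique per task
        "model": "klingai:kling-video@3-pro",
        "positivePrompt": prompt,
        "duration": duration,
    }


if __name__ == "__main__":
    print(json.dumps(build_task("A violinist on a volcanic shoreline"), indent=2))
```

Generating the UUID inside the helper makes it hard to reuse one accidentally across tasks.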
-
outputType
string default: URL -
Video output type.
Allowed values 1 value
-
outputFormat
string default: MP4 -
Specifies the file format of the generated output. The available values depend on the task type and the specific model's capabilities.
- `MP4`: Widely supported video container (H.264), recommended for general use.
- `WEBM`: Optimized for web delivery.
- `MOV`: QuickTime format, common in professional workflows (Apple ecosystem).
Allowed values 3 values
-
outputQuality
integer min: 20 max: 99 default: 95 -
Compression quality of the output. Higher values preserve quality but increase file size.
-
webhookURL
string URI -
Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.
Learn more 1 resource
- Webhooks PLATFORM
-
deliveryMethod
string default: async -
Determines how the API delivers task results.
Allowed values 1 value
- `async`: Returns an immediate acknowledgment with the task UUID. Poll for results using `getResponse`. Required for long-running tasks like video generation.
Learn more 1 resource
- Task Polling PLATFORM
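The polling loop described above might look roughly like the following sketch. Several things here are assumptions rather than documented behavior: the `getResponse` request is assumed to carry only `taskType` and `taskUUID`, `send` is a hypothetical transport function that returns a list of result objects, and a result is treated as finished once it contains `videoURL`.

```python
import time
from typing import Callable, Optional


def poll_for_video(
    send: Callable[[dict], list],
    task_uuid: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until a result carrying a videoURL shows up for task_uuid.

    `send` is a hypothetical transport: it takes one task dict and returns a
    list of result dicts (empty while the video is still being generated).
    """
    # Assumed request shape; check the Task Polling documentation.
    request = {"taskType": "getResponse", "taskUUID": task_uuid}
    for _ in range(max_attempts):
        for item in send(request):
            if item.get("taskUUID") == task_uuid and "videoURL" in item:
                return item
        sleep(interval_s)
    return None  # gave up; the task may still finish later
```

Passing `send` and `sleep` in as arguments keeps the loop independent of any particular HTTP or WebSocket client and makes it easy to exercise without network access.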
-
uploadEndpoint
string URI -
Specifies a URL where the generated content will be automatically uploaded using the HTTP PUT method. The raw binary data of the media file is sent directly as the request body. For secure uploads to cloud storage, use presigned URLs that include temporary authentication credentials.
Common use cases:
- Cloud storage: Upload directly to S3 buckets, Google Cloud Storage, or Azure Blob Storage using presigned URLs.
- CDN integration: Upload to content delivery networks for immediate distribution.
// S3 presigned URL for secure upload
https://your-bucket.s3.amazonaws.com/generated/content.mp4?X-Amz-Signature=abc123&X-Amz-Expires=3600
// Google Cloud Storage presigned URL
https://storage.googleapis.com/your-bucket/content.jpg?X-Goog-Signature=xyz789
// Custom storage endpoint
https://storage.example.com/uploads/generated-image.jpg
The content data will be sent as the request body to the specified URL when generation is complete.
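To check that a presigned URL accepts this kind of upload before using it as `uploadEndpoint`, one option is to emulate the request locally. The sketch below only builds an HTTP PUT with raw bytes as the body. The function name is hypothetical, and the `Content-Type` header is an assumption, since this page does not say which headers the platform sends.

```python
import urllib.request


def build_put_request(upload_url: str, media: bytes) -> urllib.request.Request:
    """Build an HTTP PUT whose body is the raw media bytes (nothing is sent yet)."""
    return urllib.request.Request(
        upload_url,
        data=media,
        method="PUT",
        # Assumed header; the platform's actual headers are not documented here.
        headers={"Content-Type": "application/octet-stream"},
    )
```

Calling `urllib.request.urlopen(build_put_request(url, data))` would perform the upload against a real endpoint.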
-
safety
object -
Content safety checking configuration for video generation.
Properties 2 properties
-
safety»checkContent
boolean default: false -
Enable or disable content safety checking. When enabled, defaults to `fast` mode.
-
safety»mode
string default: none -
Safety checking mode for video generation.
Allowed values 3 values
- Disables checking.
- Checks key frames.
- Checks all frames.
-
-
ttl
integer min: 60 -
Time-to-live (TTL) in seconds for generated content. Only applies when `outputType` is `URL`.
-
includeCost
boolean default: false -
Include task cost in the response.
-
numberResults
integer min: 1 max: 20 default: 1 -
Number of results to generate. Each result uses a different seed, producing variations of the same parameters.
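The numeric limits listed in this section can be checked client-side before a task is submitted. The following is a minimal sketch with a hypothetical helper name. It covers only the ranges and defaults stated above and does not try to reproduce the API's own validation.

```python
def check_platform_options(task: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not 20 <= task.get("outputQuality", 95) <= 99:
        problems.append("outputQuality must be between 20 and 99")
    if not 1 <= task.get("numberResults", 1) <= 20:
        problems.append("numberResults must be between 1 and 20")
    if task.get("outputFormat", "MP4") not in {"MP4", "WEBM", "MOV"}:
        problems.append("outputFormat must be MP4, WEBM, or MOV")
    if "ttl" in task:
        if task["ttl"] < 60:
            problems.append("ttl must be at least 60 seconds")
        if task.get("outputType", "URL") != "URL":
            problems.append("ttl only applies when outputType is URL")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.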
Inputs
Input resources for the task (images, audio, etc.). These must be nested inside the `inputs` object.
-
inputs»referenceImages
array of strings items: 1 -
List of reference images (UUID, URL, Data URI, or Base64).
-
inputs»frameImages
array of objects min items: 1 max items: 2 -
An array of objects that define key frames to guide video generation. Each object specifies an input image and optionally its position within the video timeline.
The `frameImages` parameter allows you to constrain specific frames within the video sequence, ensuring that particular visual content appears at designated points. This is different from `referenceImages`, which provide overall visual guidance without constraining specific timeline positions.
When the `frame` parameter is omitted from objects, automatic distribution rules apply:
- 1 image: Used as the first frame.
- 2 images: First and last frames.
Examples 2 examples
Single frame (automatic positioning): When only one image is provided, it automatically becomes the first frame of the video.
"frameImages": [
  { "image": "aac49721-1964-481a-ae78-8a4e29b91402" }
]
First and last frames: With two images, they automatically become the first and last frames of the video sequence.
"frameImages": [
  { "image": "aac49721-1964-481a-ae78-8a4e29b91402", "frame": "first" },
  { "image": "3ad204c3-a9de-4963-8a1a-c3911e3afafe", "frame": "last" }
]
Properties 2 properties
-
inputs»frameImages»image
string required -
Image input (UUID, URL, Data URI, or Base64).
-
inputs»frameImages»frame
object -
Target frame position for the image. Supports first and last frame.
Allowed values 4 values
- `first`: First frame of the video.
- `last`: Last frame of the video.
- `0`: Frame index 0 (first frame).
- `-1`: Frame index -1 (last frame).
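The `frameImages` rules (one or two images, first and last positions) can be made explicit when building the array instead of relying on automatic distribution. This is a small sketch with a hypothetical helper name. It uses the `first` and `last` values that appear in the examples on this page.

```python
def build_frame_images(images: list) -> list:
    """Build a frameImages array with explicit positions for 1 or 2 images."""
    if not 1 <= len(images) <= 2:
        raise ValueError("frameImages accepts 1 or 2 images")
    positions = ["first", "last"]
    # zip stops at the shorter list, so a single image only gets "first".
    return [{"image": img, "frame": pos} for img, pos in zip(images, positions)]
```

The result is meant to be placed under `inputs.frameImages` in the request.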
-
inputs»referenceVideos
array of strings items: 1 -
List of reference videos (UUID, URL).
-
inputs»elements
array of objects min items: 1 max items: 3 -
Elements allow you to include reusable assets (images, videos, or voices) in your video generation. Each element is identified by an `id` and can be referenced in the prompt using `<<<element_1>>>`, `<<<element_2>>>`, etc., in order of appearance.
An element can contain:
- Images via `frontalImage` and optionally `images` (up to 3 additional angles).
- Videos via `videos` (cannot be combined with images).
- Voices via `voices` (can only be combined with images, not videos).
Examples 2 examples
Create a new element with an image:
"positivePrompt": "A video of <<<element_1>>> walking through a futuristic city",
"inputs": {
  "elements": [
    {
      "id": "my-character-id",
      "description": "A young woman with red hair",
      "frontalImage": "c64351d5-4c59-42f7-95e1-eace013eddab",
      "tags": ["Character"]
    }
  ]
}
Reuse a previously created element by ID:
"positivePrompt": "A video of <<<element_1>>> sitting in a coffee shop, reading a book",
"inputs": {
  "elements": [
    { "id": "my-character-id" }
  ]
}
Properties 7 properties
-
inputs»elements»id
string required -
Unique identifier for this element. Use to create a new element or reference a previously created one.
-
inputs»elements»description
string -
Description of the element.
-
inputs»elements»frontalImage
string -
Frontal reference image for the element. Required when using image-based elements.
-
inputs»elements»images
array of strings min items: 1 max items: 3 -
Reference images for the element. Up to 3 images. Requires `frontalImage`.
-
inputs»elements»videos
array of strings items: 1 -
Reference video for the element. Cannot be combined with images or voices.
-
inputs»elements»voices
array of strings items: 1 -
Voice audio for the element. Can only be combined with images, not videos.
-
inputs»elements»tags
array of strings min items: 1 -
Classification tags for the element.
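The combination rules for elements (images need `frontalImage`, videos stand alone, voices pair only with images) lend themselves to a quick pre-flight check. The sketch below uses hypothetical helper names and encodes only the constraints stated on this page. The second function lists which `<<<element_N>>>` placeholders a prompt references.

```python
import re


def check_element(element: dict) -> list:
    """Return rule violations for one element; an empty list means none found."""
    problems = []
    has_images = "frontalImage" in element or "images" in element
    if "id" not in element:
        problems.append("id is required")
    if "images" in element and "frontalImage" not in element:
        problems.append("images requires frontalImage")
    if len(element.get("images", [])) > 3:
        problems.append("images accepts at most 3 entries")
    if "videos" in element and (has_images or "voices" in element):
        problems.append("videos cannot be combined with images or voices")
    return problems


def element_placeholders(prompt: str) -> list:
    """Return the element indices referenced in a prompt, in order of appearance."""
    return [int(n) for n in re.findall(r"<<<element_(\d+)>>>", prompt)]
```

A caller could compare the highest placeholder index with the length of `inputs.elements` to catch prompts that reference an element that was never supplied.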
Generation Parameters
Core parameters for controlling the generated content.
-
model
string required value: klingai:kling-video@3-pro -
Identifier of the model to use for generation.
-
positivePrompt
string required min: 2 max: 2500 -
Text prompt describing elements to include in the generated output.
-
negativePrompt
string min: 2 max: 2500 -
Prompt to guide what to exclude from generation. Ignored when guidance is disabled (CFGScale ≤ 1).
-
width
Width of the generated media in pixels.
-
height
Height of the generated media in pixels.
-
duration
integer min: 3 max: 15 step: 1 default: 5 -
Duration of the generation in seconds. Total frames = duration × fps.
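The frame-count relation is simple arithmetic, sketched below together with the documented duration range. The frame rate is not specified on this page, so `fps` is an input the caller must supply, and the value 24 in the usage note is purely illustrative.

```python
def total_frames(duration: int, fps: int) -> int:
    """Return duration x fps after checking the documented 3-15 second range."""
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if fps <= 0:
        raise ValueError("fps must be positive")
    return duration * fps
```

For example, `total_frames(10, 24)` gives 240 frames under an assumed 24 fps.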
Provider Settings
Parameters specific to this model provider. These must be nested inside the `providerSettings.klingai` object.
-
providerSettings»klingai»characterOrientation
string -
Source for character orientation reference.
Allowed values 2 values
- Match orientation from the reference image.
- Match orientation from the reference video.
-
providerSettings»klingai»keepOriginalSound
boolean default: false -
Maintain the original sound from the reference video.
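Because these settings live two levels deep, a small helper can keep the nesting consistent. This is a sketch with a hypothetical function name. It only arranges keys under `providerSettings.klingai` and does not validate the setting values themselves.

```python
def with_kling_settings(task: dict, **settings) -> dict:
    """Return a copy of task with settings nested under providerSettings.klingai."""
    updated = dict(task)
    provider = dict(updated.get("providerSettings", {}))
    # Merge with any klingai settings already present instead of replacing them.
    provider["klingai"] = {**provider.get("klingai", {}), **settings}
    updated["providerSettings"] = provider
    return updated
```

For instance, `with_kling_settings(task, keepOriginalSound=True)` adds `"providerSettings": {"klingai": {"keepOriginalSound": true}}` to the serialized request while leaving the original `task` dict unchanged.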
Volcanic Glass Violin Recital
{
"taskType": "videoInference",
"taskUUID": "08885c1c-3794-4bb7-adab-b8df47f93c80",
"model": "klingai:kling-video@3-pro",
"positivePrompt": "A cinematic wide shot on a black volcanic shoreline at blue hour: a solitary violinist in a tailored copper-and-charcoal coat performs on a circular platform of dark glass while shallow waves slide over reflective obsidian sand. In the distance, slow lava veins glow through cracked rock formations, sending faint orange light into sea mist. The camera begins with a low tracking move around the performer, then eases into a gentle push-in as the bowing becomes more intense. Hair, coat hems, and drifting steam respond naturally to ocean gusts. Rich synchronized audio: expressive solo violin melody, soft surf, distant seabird cries, occasional hiss of hot stone meeting water, subtle foot movement on wet glass. Realistic body mechanics, detailed hands and bow contact, nuanced facial focus, high dynamic range lighting, elegant lens bloom on highlights, immersive atmosphere, polished cinematic color grading, stable motion, coherent reflections, no abrupt cuts.",
"negativePrompt": "low detail, blurry hands, extra limbs, warped violin, duplicated person, jittery motion, flicker, broken reflections, overexposed highlights, text, watermark, logo, subtitle, frame artifacts, camera shake, cartoonish anatomy",
"width": 1920,
"height": 1080,
"duration": 10
}
{
"taskType": "videoInference",
"taskUUID": "08885c1c-3794-4bb7-adab-b8df47f93c80",
"videoUUID": "baf38118-1b77-4a2b-9ec1-78a697bd135f",
"videoURL": "https://vm.runware.ai/video/os/a18d05/ws/5/vi/baf38118-1b77-4a2b-9ec1-78a697bd135f.mp4",
"cost": 1.12
}
Amber Observatory Ice Plain
{
"taskType": "videoInference",
"taskUUID": "70b56006-db71-49e5-a2a1-acf3b771a505",
"model": "klingai:kling-video@3-pro",
"positivePrompt": "A cinematic wide shot of a remote polar observatory built on a vast cracked ice plain under a copper-gold sky. Massive parabolic antenna dishes slowly rotate while tiny maintenance drones skim over the surface leaving faint blue guide lights. In the foreground, a lone researcher in a reflective thermal suit walks toward the main dome, dragging a compact sled loaded with instruments. The camera begins low near textured ice, then glides forward and rises into a gentle sweeping reveal of the station and horizon. Far away, curtains of charged particles ripple across the sky in unusual geometric bands. Snow dust spirals lightly around metal structures, warning beacons pulse softly, and the observatory emits layered mechanical ambience. Native audio: distant antenna motors, crisp footsteps on ice, sled runners scraping, radio static bursts, low electrical hum, occasional wind gusts, and one short calm spoken line over headset: \"Signal lock confirmed.\" Ultra-detailed, realistic lighting, cinematic pacing, atmospheric depth, stable motion, coherent subject continuity, high-end science fiction grounded in physical realism.",
"negativePrompt": "cartoon, low resolution, blurry, jittery motion, flicker, duplicated objects, warped anatomy, extra limbs, distorted face, unstable camera, oversaturated colors, text, watermark, logo, frame glitches, abrupt cuts, chaotic action",
"width": 1920,
"height": 1080,
"duration": 10
}
{
"taskType": "videoInference",
"taskUUID": "70b56006-db71-49e5-a2a1-acf3b771a505",
"videoUUID": "673e3e9c-26a2-4259-8fc3-03b1a783fb72",
"videoURL": "https://vm.runware.ai/video/os/a17d13/ws/5/vi/673e3e9c-26a2-4259-8fc3-03b1a783fb72.mp4",
"cost": 1.12
}
Lantern Regatta at Daybreak
{
"taskType": "videoInference",
"taskUUID": "b47e5dd4-5c7d-4c61-9341-071e00f3b06a",
"model": "klingai:kling-video@3-pro",
"positivePrompt": "Using the supplied first-frame image as the opening shot, create a cinematic video of a dawn river regatta. The camera begins with a calm wide composition, then slowly glides forward over the water as the lantern boats drift apart in elegant patterns. The violinist at the dock lifts the bow and begins to play softly. Nearby lanterns bob and rotate, tiny reflections trembling across the river surface. A few birds cross the brightening sky, reeds sway gently at the shoreline, and thin morning mist gradually thins as sunlight warms the scene. Maintain the character design and overall composition from the reference image while adding subtle, believable motion and rich environmental detail. Include synchronized natural audio: quiet river water, wood creaks from the dock, distant birdsong, soft fabric rustle, and a delicate solo violin melody that feels live and intimate.",
"negativePrompt": "low quality, flicker, warped anatomy, duplicate people, extra limbs, distorted hands, abrupt camera shake, oversaturated colors, text, watermark, logo, heavy motion blur, noisy audio, robotic music, harsh cuts",
"duration": 8,
"inputs": {
"frameImages": [
{
"image": "https://assets.runware.ai/assets/inputs/5e3433ef-b14e-4c65-8d25-54ecf737c589.jpg",
"frame": "first"
}
]
}
}
{
"taskType": "videoInference",
"taskUUID": "b47e5dd4-5c7d-4c61-9341-071e00f3b06a",
"videoUUID": "b35955ce-800a-4a73-885c-a25bdff89cd4",
"videoURL": "https://vm.runware.ai/video/os/a04d20/ws/5/vi/b35955ce-800a-4a73-885c-a25bdff89cd4.mp4",
"cost": 0.896
}
Copper Aviary Dawn Tableau
{
"taskType": "videoInference",
"taskUUID": "b8f22ea1-2020-4969-8d7c-caca97c55373",
"model": "klingai:kling-video@3-pro",
"positivePrompt": "Create a cinematic video that begins from the first guided frame and evolves naturally toward the last guided frame. The scene takes place in a grand glass aviary at sunrise, with elegant handcrafted mechanical birds gradually waking, tilting their heads, fluttering open articulated wings, hopping from perch to perch, then lifting into coordinated spirals above a calm caretaker in a teal coat. Preserve the architecture, lighting direction, and subject continuity between the guided frames. Use a slow opening with subtle ambient movement, then build into graceful layered flight with rich depth, drifting dust, soft lens bloom, realistic metal reflections, gentle cloth movement, and synchronized naturalistic sound design: creaking perches, light wing whirs, faint gear clicks, echoing flutter, glass resonance, and warm morning air. The pacing should feel lyrical and immersive, with smooth camera drift and strong temporal coherence.",
"negativePrompt": "low detail, broken anatomy, extra limbs, warped birds, duplicated subjects, flicker, abrupt cuts, inconsistent architecture, oversaturated colors, text, watermark, logo, blurry caretaker, chaotic camera shake, horror tone, modern electronics, urban skyline",
"duration": 8,
"inputs": {
"frameImages": [
{
"image": "https://assets.runware.ai/assets/inputs/5ce13305-faae-42ef-9a34-40150a2e3ae8.jpg",
"frame": "first"
},
{
"image": "https://assets.runware.ai/assets/inputs/49f83de5-ff24-442e-8e70-85ba7ed09e5a.jpg",
"frame": "last"
}
]
}
}
{
"taskType": "videoInference",
"taskUUID": "b8f22ea1-2020-4969-8d7c-caca97c55373",
"videoUUID": "06452bee-c048-46c5-9be9-8ca67f161069",
"videoURL": "https://vm.runware.ai/video/os/a04d20/ws/5/vi/06452bee-c048-46c5-9be9-8ca67f161069.mp4",
"cost": 0.896
}
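Each sample response above reports a `cost` next to a request with a known `duration`, so a per-second figure can be derived from them. The snippet below is plain arithmetic on those sample values with a hypothetical helper name. It should not be read as a published price or a guarantee for other resolutions or inputs.

```python
def cost_per_second(cost: float, duration: int) -> float:
    """Divide a reported cost by the requested duration, rounded to 4 decimals."""
    return round(cost / duration, 4)


# (cost, duration) pairs copied from the sample requests and responses above.
samples = [(1.12, 10), (1.12, 10), (0.896, 8), (0.896, 8)]
rates = [cost_per_second(cost, duration) for cost, duration in samples]
```

For these four samples the derived rate is the same, 0.112 per second of video.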