---
title: Directing motion in image-to-video prompts — Runway Gen-4.5 | Runware Docs
url: https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion
description: How to write Gen-4.5 image-to-video prompts that direct motion instead of redescribing the scene. Covers the camera and subject channels, naming common camera moves, and layering atmospheric motion on top.
---
### [Introduction](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#introduction)

Visual consistency is the hardest part of text-to-video. Every roll picks a different region of the latent space and produces a different scene: a different car on a different cliff each time you regenerate. **Locking the visual identity** of a clip means starting each generation from a fixed still image and letting the prompt direct only the motion.

Gen-4.5 works exactly this way. You pass an image and a prompt. The image fixes the subject, the composition, the color palette, the lighting. The prompt's only job is to describe **how the frame should evolve** over the next few seconds.

[Watch video](https://runware.ai/docs/assets/output-hero.CIRIqJK4.mp4)

> **Prompt**: Slow cinematic push-in toward the figure at the lake's edge. Mist drifts continuously across the water from left to right. The golden sunrise light shifts subtly across the distant snow-tipped peaks. The figure holds steady, a silhouette against the still water. Otherwise the composition holds.

That clip started from a still photograph of a figure at a foggy alpine lake at sunrise. The prompt described five seconds of subtle motion: a forward push, drifting mist, light shifting across the distant peaks, the figure holding steady. The model produced exactly that **without inventing a new scene around the figure**.

This guide covers the request shape, the **two channels** every prompt directs (the subject and the camera), the cinematic vocabulary the model understands, and how to layer atmospheric motion on top.

> [!NOTE]
> The techniques in this guide apply to the entire Runway Gen-4 family on Runware. [Gen-4 Turbo](https://runware.ai/docs/models/runway-gen-4-turbo) is the cheaper, faster option for iteration and previsualization. The [Gen-4 image models](https://runware.ai/docs/models/runway-gen-4-image) cover the still-image step before either video model.

### [Request shape](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#request-shape)

A Gen-4.5 request takes one source image, one motion prompt, and a small set of dimensional parameters:

**Request**:

```json
[
  {
    "taskType": "videoInference",
    "taskUUID": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "model": "runway:1@2",
    "inputs": {
      "frameImages": [
        { "image": "https://example.com/still.jpg", "frame": "first" }
      ]
    },
    "positivePrompt": "Slow push-in toward the subject. Steam rises continuously from the wok.",
    "width": 1280,
    "height": 720,
    "duration": 5
  }
]
```

**Response**:

```json
{
  "data": [
    {
      "taskType": "videoInference",
      "taskUUID": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "videoUUID": "f1e2d3c4-b5a6-7890-1234-567890abcdef",
      "videoURL": "https://vm.runware.ai/video/os/a14d18/ws/2/vi/f1e2d3c4-b5a6-7890-1234-567890abcdef.mp4"
    }
  ]
}
```
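A minimal sketch of reading that response in Python, assuming the body arrives as a JSON string shaped like the example above. The helper name `video_url_for` is illustrative and not part of any Runware SDK.

```python
import json


def video_url_for(response_text: str, task_uuid: str) -> str:
    """Return the videoURL of the videoInference result matching task_uuid."""
    body = json.loads(response_text)
    for item in body.get("data", []):
        # Match on taskUUID so results from other tasks are never mixed up.
        if item.get("taskType") == "videoInference" and item.get("taskUUID") == task_uuid:
            return item["videoURL"]
    raise LookupError(f"no videoInference result for task {task_uuid}")
```

Matching on `taskUUID` rather than taking the first entry keeps the lookup correct if the `data` array ever carries more than one result.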

The required fields:

- `inputs.frameImages` takes **exactly one image**, marked as the `first` frame of the output. It accepts a public URL, base64 string, data URI, or a UUID from a previous generation or the [Image Upload API](https://runware.ai/docs/platform/image-upload). The image fixes the scene.
- `positivePrompt` describes the motion (1 to 1000 characters). It is **not** a scene description. See [Writing for motion](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#writing-for-motion-not-the-scene) below.
- `width` and `height` must be one of the model's allowed pairs: `1280 × 720`, `720 × 1280`, `1104 × 832`, `832 × 1104`, or `960 × 960`. Output dimensions are independent from the source image. Pick the pair that matches your source's aspect ratio.
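One way to pick the pair that matches a source image is sketched below, assuming you already know the source's pixel dimensions. `closest_allowed_size` is a hypothetical helper, and comparing aspect ratios on a log scale is a design choice made here, not something the API prescribes.

```python
import math

# Allowed (width, height) pairs from the list above.
ALLOWED_SIZES = [(1280, 720), (720, 1280), (1104, 832), (832, 1104), (960, 960)]


def closest_allowed_size(source_width: int, source_height: int) -> tuple:
    """Return the allowed pair whose aspect ratio is nearest the source's."""
    if source_width <= 0 or source_height <= 0:
        raise ValueError("source dimensions must be positive")
    # Log scale treats "twice as wide" and "twice as tall" symmetrically.
    source_ratio = math.log(source_width / source_height)
    return min(
        ALLOWED_SIZES,
        key=lambda size: abs(math.log(size[0] / size[1]) - source_ratio),
    )
```

A 1920 × 1080 source maps to `1280 × 720`, and a 4000 × 3000 source maps to `1104 × 832`.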

The optional fields:

- `duration` must be `5`, `8`, or `10` seconds. Defaults to `10`. Longer clips cost proportionally more.
- `seed` for reproducible output.
- `providerSettings.runway.contentModeration` for tuning safety thresholds.
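The constraints above can be checked before a request is sent. The sketch below builds the task array from the request example and validates it against the limits listed in this section. `build_request` is a hypothetical helper, and placing `seed` at the top level of the task is an assumption based on how the other fields are listed.

```python
import uuid
from typing import Optional

ALLOWED_SIZES = {(1280, 720), (720, 1280), (1104, 832), (832, 1104), (960, 960)}
ALLOWED_DURATIONS = {5, 8, 10}


def build_request(image: str, prompt: str, width: int, height: int,
                  duration: int = 10, seed: Optional[int] = None) -> list:
    """Build a one-task videoInference payload, validating the documented limits."""
    if not 1 <= len(prompt) <= 1000:
        raise ValueError("positivePrompt must be 1 to 1000 characters")
    if (width, height) not in ALLOWED_SIZES:
        raise ValueError(f"{width}x{height} is not an allowed size pair")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 5, 8, or 10")
    task = {
        "taskType": "videoInference",
        "taskUUID": str(uuid.uuid4()),
        "model": "runway:1@2",
        "inputs": {"frameImages": [{"image": image, "frame": "first"}]},
        "positivePrompt": prompt,
        "width": width,
        "height": height,
        "duration": duration,
    }
    if seed is not None:
        task["seed"] = seed  # assumed top-level placement
    return [task]
```

Failing fast on an invalid size or duration avoids spending a round trip on a request that cannot succeed.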

### [Writing for motion, not the scene](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#writing-for-motion-not-the-scene)

The instinct from text-to-image and text-to-video is to **describe what's in the frame**. With image-to-video that instinct backfires: the frame is already locked, and re-describing the scene gives the model nothing to do. The prompt's only useful job is to direct how the locked scene **evolves over time**.

Both clips below start from the same image of a campfire in a forest clearing. The difference is the prompt.

![Bright orange campfire burning in a stone fire ring at the centre of a forest clearing, deep blue dusk sky and pine trees in the background](https://runware.ai/docs/assets/source-campfire.C6PTyIgt_Z1HTLTP.jpg)

*Reference image*

> **Prompt**: A campfire of bright orange flames burning in a stone fire ring at the centre of a forest clearing at twilight, warm firelight catching the surrounding pine trees and the rocks of the ring, deep blue dusk sky above with the first stars appearing, a few split logs stacked to one side, distant mountains visible through gaps in the trees. Cinematic photorealistic outdoor photography, atmospheric composition, fine grain.

[Watch video](https://runware.ai/docs/assets/output-scene-prompt.PbvzZmaD.mp4)

*Scene-describing prompt*

> **Prompt**: A campfire in a stone fire ring in a forest clearing at twilight. Pine trees surround the clearing. A deep blue dusk sky above with the first stars. Split logs stacked to one side. Distant mountains visible through the trees. Cinematic photorealistic outdoor photography.

[Watch video](https://runware.ai/docs/assets/output-motion-prompt.CmD7YjJB.mp4)

*Motion-directing prompt*

> **Prompt**: Steady push-in toward the campfire. The flames leap and dance vigorously, climbing up from the logs. Bright orange sparks fly up into the night sky in continuous streams. Thick smoke curls upward and drifts across the upper half of the frame. The embers at the base pulse with shifting orange light. The pine trees behind the fire sway gently in the warm draft.

The first prompt **redescribes the image**. It tells the model what the scene *is*, which the model already has from the input. The model has no instruction about what should change, so it produces something close to a static loop with mild incidental motion.

The second prompt **names specific motion** in every channel: the camera pushes in, the flames dance up from the logs, the sparks stream into the sky, the smoke curls across the upper frame, the trees sway in the warm draft. The model has clear direction and the result is a cinematic five seconds.

A useful rule: if a sentence in your prompt describes something that's already visible in the source image, **delete it**.
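That rule can be roughly approximated with a keyword check. The sketch below flags sentences that contain no motion vocabulary at all, which makes them likely scene description. It is a crude illustrative heuristic, not a Runware feature: the stem list is a hypothetical starting point, and the check cannot see the image, so treat flagged sentences as candidates for review.

```python
import re

# Hypothetical, deliberately small list of motion word stems; extend as needed.
MOTION_STEMS = ("push", "pull", "pan", "tilt", "dolly", "orbit", "crane",
                "drift", "sway", "rise", "fall", "flicker", "ripple", "curl",
                "leap", "danc", "fly", "flies", "puls", "walk", "mov", "hold")


def static_sentences(prompt: str) -> list:
    """Return the sentences that contain no word starting with a motion stem."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]
    flagged = []
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        if not any(word.startswith(MOTION_STEMS) for word in words):
            flagged.append(sentence)
    return flagged
```

Every sentence of the scene-describing campfire prompt above would be flagged by this check, while none of the sentences in the motion-directing prompt would be.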

### [Two channels: subject and camera](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#two-channels-subject-and-camera)

The subject is what's alive inside the frame: anything that can move on its own. The camera is how the frame itself moves. **The two are independent**, and the model handles them separately.

Same source image, same five seconds, different channels active:

![Close portrait of an elderly Black jazz saxophonist playing a tenor saxophone in deep stage lighting](https://runware.ai/docs/assets/source-musician.CvFqrsW2_Z8iA3v.jpg)

*Reference image*

> **Prompt**: A close editorial portrait of an elderly Black jazz saxophonist with weathered features and a trimmed white beard, eyes half-closed in concentration, holding a tarnished brass tenor saxophone to his lips, a deep purple stage backlight and a single warm key light from above, fine grain, photorealistic high-contrast portrait, dark background, shallow depth of field.

[Watch video](https://runware.ai/docs/assets/output-subject-motion.oH7a5HEW.mp4)

*Subject channel only*

> **Prompt**: The saxophonist gently sways with the music. His chest rises and falls with measured breath. His fingers settle on the keys. A subtle nod of the head. Eyes stay closed in concentration. The camera holds completely still.

[Watch video](https://runware.ai/docs/assets/output-camera-only.yHLKJU5P.mp4)

*Camera channel only*

> **Prompt**: Slow push-in toward the saxophonist. He remains completely still throughout, no breathing, no movement. Only the camera moves forward, tightening the frame around his face and the brass of the saxophone.

Subject channel only: **the subject moves, the camera holds**. The frame composition stays put while the saxophonist breathes and sways.

Camera channel only: **the camera moves, the subject holds**. The frame tightens around the musician while he stays absolutely still.

Both clips are useful in production. The subject-only version is the standard **"alive portrait"** for ad creative and avatars. The camera-only version suggests **weight and importance** without the subject distracting from the move.

**Combining both channels** gives you the most common production shot: a slow push-in on a subject who is alive in the frame.

[Watch video](https://runware.ai/docs/assets/output-combined.CBvFKqSF.mp4)

> **Prompt**: Slow push-in toward the saxophonist while he gently sways with the music. His chest rises and falls with measured breath. His fingers settle on the keys. A subtle nod of the head. Eyes stay closed in concentration. The camera moves forward steadily across the five seconds.

The saxophonist breathes and sways while the camera pushes forward. Each channel reinforces the other: the **breath grounds the subject in time**, the **push-in moves the viewer's attention** toward the face. This is the default shot grammar for portraits in film and advertising.
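Treating the two channels as separate inputs makes it harder to forget one. The sketch below composes a prompt from a camera direction and a subject direction, and writes an explicit hold for whichever channel is left empty. `compose_prompt` and its parameters are hypothetical names, and the hold phrasing is borrowed from the example prompts in this section.

```python
def compose_prompt(camera: str = "", subject: str = "",
                   subject_name: str = "The subject") -> str:
    """Join camera and subject directions, adding an explicit hold for an empty channel."""
    camera_part = camera.strip() or "The camera holds completely still."
    subject_part = subject.strip() or f"{subject_name} remains completely still."
    return f"{camera_part} {subject_part}"
```

Calling `compose_prompt(camera="Slow push-in toward the saxophonist.", subject_name="The saxophonist")` produces a camera-only prompt with the subject hold already stated.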

### [Camera control vocabulary](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#camera-control-vocabulary)

Cinematic move names work on Gen-4.5 the way they work in a film script. Use the **explicit term** and you get the move. Describe it vaguely and the model has to guess which move you meant.

The most reliable camera move names:

- **Push-in** / **push toward:** camera moves forward, framing tightens
- **Pull-back** / **pull out:** camera moves backward, framing widens
- **Pan left** / **pan right:** camera rotates horizontally, sweeping across the scene
- **Tilt up** / **tilt down:** camera rotates vertically
- **Dolly left** / **dolly right:** camera slides sideways while staying parallel to the subject
- **Orbit** / **circle around:** camera rotates around a subject at a fixed distance
- **Crane up** / **crane down:** camera moves vertically while pointing at the same subject

Five of these moves applied to source images:

**Push**:

[Watch video](https://runware.ai/docs/assets/output-camera-push.BAekCOvm.mp4)

*Push-in: slow forward move, framing tightens around the Mustang*

> **Prompt**: Slow continuous push-in toward the Mustang. Framing tightens around the car as the camera moves forward. No other motion in the scene.

**Pan**:

[Watch video](https://runware.ai/docs/assets/output-camera-pan.C6Ds6D1s.mp4)

*Pan right: horizontal sweep along the coastline*

> **Prompt**: Slow horizontal pan from left to right along the coastline. The Mustang stays in roughly the same position within the frame. The cliff edge and distant headlands reveal as the camera sweeps. No subject motion.

**Tilt**:

[Watch video](https://runware.ai/docs/assets/output-camera-tilt.BY9-P9vZ.mp4)

*Tilt up: vertical rotation revealing the sky*

> **Prompt**: Slow vertical tilt upward. The camera starts framed on the Mustang and the cliff road at the bottom of the frame, then rotates its angle upward smoothly to reveal the open sky and the thin streaks of high cloud above. No subject motion.

**Dolly**:

[Watch video](https://runware.ai/docs/assets/output-camera-dolly.RfDpZQ0R.mp4)

*Dolly right: lateral slide past the subject*

> **Prompt**: Slow lateral dolly from left to right. The camera slides sideways at a fixed distance, staying parallel to the Mustang. The car maintains the same orientation within the frame as the cliff and headlands shift beside it. No subject motion.

**Orbit**:

[Watch video](https://runware.ai/docs/assets/output-camera-orbit.CbkNu1f0.mp4)

*Orbit: rotational arc around the parked car*

> **Prompt**: Slow continuous orbit around the Mustang. The camera arcs at a fixed distance from the front-left of the car to the front-right, the car holds completely still in the centre of the frame, and the cliff edge and distant headlands shift behind it as the camera rotates. No subject motion.

**Always state what doesn't move.** Every prompt above includes a phrase like "no subject motion," "no other motion in the scene," or "the car holds completely still." Without that anchor the model often adds incidental subject motion you didn't ask for. **Naming the negative** locks the channel to zero.
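A small builder can make the anchor impossible to forget. The sketch below accepts only the move names from the vocabulary list and always ends with a stillness anchor. `camera_only_prompt` and `RELIABLE_MOVES` are hypothetical names, and the sentence template is one possible phrasing, not a required format.

```python
# Move names taken from the vocabulary list in this section.
RELIABLE_MOVES = {
    "push-in", "pull-back", "pan left", "pan right", "tilt up", "tilt down",
    "dolly left", "dolly right", "orbit", "crane up", "crane down",
}


def _sentence_case(text: str) -> str:
    """Uppercase only the first character, leaving names like 'Mustang' intact."""
    return text[:1].upper() + text[1:]


def camera_only_prompt(move: str, pace: str = "slow",
                       still_subject: str = "the subject") -> str:
    """Build a camera-only prompt that always names what does not move."""
    if move not in RELIABLE_MOVES:
        raise ValueError(f"unknown camera move: {move!r}")
    return (f"{_sentence_case(pace)} {move}. "
            f"{_sentence_case(still_subject)} holds completely still. "
            "No subject motion.")
```

For example, `camera_only_prompt("pan right", still_subject="the Mustang")` returns `"Slow pan right. The Mustang holds completely still. No subject motion."`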

### [Pacing the motion](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#pacing-the-motion)

The same camera move at a different pace lands as a completely different shot. A slow push-in feels deliberate and weighty. A rapid push-in feels urgent or aggressive. Pacing is the **second dial** every camera move has, after the move's name itself.

Same source, same camera move, two pacing words:

[Watch video](https://runware.ai/docs/assets/output-camera-push.BAekCOvm.mp4)

*Slow push-in*

> **Prompt**: Slow continuous push-in toward the Mustang. Framing tightens around the car as the camera moves forward. No other motion in the scene.

[Watch video](https://runware.ai/docs/assets/output-pacing-rapid.DSWNFClJ.mp4)

*Rapid push-in*

> **Prompt**: Rapid aggressive push-in toward the Mustang. The camera accelerates sharply forward and the framing snaps tight around the car within the first two seconds, then holds. No other motion in the scene.

The slow version reads as **romantic or reflective**. The rapid version reads as **tension building toward a moment**. Both are useful in production. The pacing word picks one.

Useful pacing words to know:

- **Slow**, **gradual**, **steady:** measured, cinematic
- **Subtle**, **gentle:** almost imperceptible drift
- **Rapid**, **fast:** urgent, kinetic
- **Sudden**, **dramatic:** startling, accelerated
- **Continuous:** steady throughout, no acceleration or stop

A camera move without a pacing word tends to land at a middling, unremarkable speed. Specifying the pace gives you the **cinematic register** you want.
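To compare paces side by side, it helps to generate prompts that differ only in the pacing word. The sketch below does that for a single move phrase. `pacing_variants` is a hypothetical helper, and the default anchor sentence is copied from the push-in prompts above.

```python
def pacing_variants(move_phrase: str, paces: list,
                    anchor: str = "No other motion in the scene.") -> dict:
    """Return one prompt per pacing word, identical except for the pace."""
    return {
        pace: f"{pace[:1].upper() + pace[1:]} {move_phrase}. {anchor}"
        for pace in paces
    }
```

Calling `pacing_variants("push-in toward the Mustang", ["slow", "rapid"])` yields two prompts to run against the same source image. If you set a `seed`, keeping it constant across the variants leaves the pace as the only thing you changed.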

### [Atmospheric motion](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#atmospheric-motion)

There's a third source of motion the model renders readily when you prompt for it: **the environment**. Think rain, smoke, steam, mist, water reflections, hair, fabric, flame, or flickering neon. None of these depends on the subject channel or the camera channel. They're motion that lives in the scene itself.

![Narrow city street at night with neon signs reflecting in wet pavement and a hooded figure walking away from camera](https://runware.ai/docs/assets/source-rainy.DD6sT-em_24k3cx.jpg)

*Reference image*

> **Prompt**: A narrow city street at night during a light rain, neon signs in pink and teal reflecting on the wet asphalt, a single hooded figure walking away from camera in the middle distance, soft puddles between the cobblestones, hanging electrical cables overhead. Cinematic street photography, photorealistic, moody late-night atmosphere.

[Watch video](https://runware.ai/docs/assets/output-atmosphere.T7RRcsHl.mp4)

> **Prompt**: Light rain falls continuously across the frame. Neon signs flicker softly with a slow rhythm. Reflections on the wet pavement ripple gently as drops land in the puddles. Faint mist drifts past the lower edge of the frame. The hooded figure walks away from camera at a slow even pace. The camera holds still.

The camera holds. The figure walks at a slow even pace. Everything else (the rain, the neon, the reflections, the mist) is atmospheric motion the model **infers from what's already in the image**. Wet pavement implies droplets. Neon implies flicker. Naming each one explicitly makes the model commit rather than guess.

Atmospheric motion is often the **cheapest "alive" effect** in image-to-video. A locked camera and a still subject can still produce a five-second clip that feels cinematic when the atmosphere is doing real work.
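Layering can be sketched as appending atmospheric sentences to the main motion while staying inside the 1000-character limit on `positivePrompt`. `layer_atmosphere` is a hypothetical helper. Stopping at the first detail that does not fit, instead of truncating mid-sentence, is a design choice made for this example.

```python
MAX_PROMPT_CHARS = 1000  # positivePrompt limit stated in the request shape section


def layer_atmosphere(main_motion: str, details: list) -> str:
    """Append atmospheric sentences in order until the next one would exceed the limit."""
    parts = [main_motion.strip()] if main_motion.strip() else []
    for detail in details:
        candidate = " ".join(parts + [detail.strip()])
        if len(candidate) > MAX_PROMPT_CHARS:
            break  # keep whole sentences only
        parts.append(detail.strip())
    return " ".join(parts)
```

Order the details by importance, since the ones at the end of the list are the first to be dropped.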

### [When the prompt fights the image](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#when-the-prompt-fights-the-image)

The model can only animate motion that has **visual evidence in the source**. Ask for something the image can't support and the model either ignores the instruction or produces a degraded result.

The three most common contradictions:

- **Asking for body motion that needs off-frame parts.** A head-and-shoulders portrait can't show the subject standing up and walking away. Nothing below the shoulders is in the source for the model to animate.
- **Adding elements that aren't in the frame.** "A bird flies past the camera" works if the source shows sky and open space. It won't work if the source is a tight interior shot with no visible window.
- **Asking for actions that contradict the image's state.** A car parked on a hill won't believably roll uphill. The model's prior is anchored to physical plausibility.

The musician portrait below is paired with a prompt that asks for full-body motion. The first frame can't support the request:

![Close portrait of an elderly Black jazz saxophonist playing a tenor saxophone in deep stage lighting](https://runware.ai/docs/assets/source-musician.CvFqrsW2_Z8iA3v.jpg)

*Reference image*

> **Prompt**: A close editorial portrait of an elderly Black jazz saxophonist with weathered features and a trimmed white beard, eyes half-closed in concentration, holding a tarnished brass tenor saxophone to his lips, a deep purple stage backlight and a single warm key light from above.

[Watch video](https://runware.ai/docs/assets/output-failure.CrbmywsJ.mp4)

*Output*

> **Prompt**: The saxophonist abruptly stands up from his seat, lowers his saxophone to his side, and walks out of frame to the right. The empty stage remains in shot.

The saxophonist holds his head-and-shoulders position. **No lower body or floor exists** in the source for the model to animate, so the "stands up and walks out" instruction is effectively dropped. The output is closer to a static loop than to the directed motion.

When you need motion that requires off-frame elements, generate or shoot a wider source image first. The motion you can direct is **bounded by what's visible** in the first frame.

### [Tips](https://runware.ai/docs/models/runway-gen-4-5/guides/directing-motion#tips)

1. **Describe what moves, not what's there.** The image already shows the scene. The prompt's only useful content is the temporal evolution.
    
2. **Name camera moves explicitly.** "Slow push-in," "pan right," "orbit around the subject." Cinematic vocabulary is more reliable than abstract direction.
    
3. **Separate the subject channel from the camera channel.** When you want one to hold still while the other moves, state it: "the camera holds completely still," "the subject remains motionless." Without an explicit hold, the model often animates both.
    
4. **Pace your motion.** "Slow," "gradual," "steady," "rapid," "subtle," "dramatic." A push-in without a pacing word defaults to something average. Specifying one gives you the cinematic register you want.
    
5. **Match motion to the image.** Don't direct motion that fights what's in the frame. A locked door won't open believably. A static stone wall won't sway. The model performs best when the prompted motion has visual evidence in the source.
    
6. **Layer atmosphere on top of the main motion.** Even a static composition feels alive with a couple of atmospheric details: mist drifting, neon flickering. They cost nothing in the prompt and add a lot of perceived production value.