
Seedream 4.5
Seedream 4.5: a high-fidelity, multi-reference text-to-image model
Commercial use
How to Use Seedream 4.5
Overview
Seedream 4.5 is a versatile image generation and editing model from ByteDance, designed for workflows that require structured prompting, consistent composition, and controlled visual output. It supports both text-to-image generation and image-guided refinement, making it suitable for creative exploration as well as iterative design and production use.
The model performs particularly well when prompts describe layout, subject relationships, and stylistic intent in a clear, layered way. This makes Seedream 4.5 a strong choice for concept visuals, marketing assets, editorial imagery, and design-oriented content where predictability and control matter.
How it Works
Seedream 4.5 combines language understanding with image synthesis and editing capabilities, allowing it to generate new visuals or refine existing ones based on detailed instructions.
Prompt Interpretation
The model analyses prompts to understand subjects, composition, spatial relationships, style, and constraints. It responds best to prompts that describe scenes in logical layers rather than relying on short or abstract descriptions.
Image Generation
Seedream 4.5 produces images with stable structure, balanced composition, and consistent visual detail. It supports a wide range of styles, from clean illustrations to more realistic scenes, depending on how the prompt is framed.
Image Editing & Refinement
The model supports image-to-image workflows, enabling targeted edits such as adjusting objects, modifying backgrounds, or refining visual details while preserving the rest of the image.
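As a rough sketch of how such an image-to-image request might be assembled client-side (the field names such as `taskType`, `positivePrompt`, and `referenceImages`, and the model identifier, are illustrative placeholders rather than confirmed API values; consult the Runware documentation for the real schema):

```python
import base64

def build_edit_task(prompt: str, image_path: str,
                    model: str = "bytedance:seedream-4.5") -> dict:
    """Assemble a hypothetical image-to-image task payload.

    All field names below are illustrative placeholders; check the
    provider documentation for the actual request schema.
    """
    # Encode the input image so it can travel inside a JSON payload.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "taskType": "imageInference",    # placeholder task identifier
        "model": model,                  # placeholder model identifier
        "positivePrompt": prompt,        # the layered editing instruction
        "referenceImages": [image_b64],  # the image to refine
    }
```

The point of keeping the prompt and the reference image in one structured payload is that the editing instruction can state explicitly what must change and what must stay fixed, which is how the model's targeted-refinement behaviour is driven.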
Key Features
- Structured Prompt Handling: Performs well with prompts that describe layout, hierarchy, and visual intent step by step.
- Image Editing Support: Allows controlled refinement of existing images through image-to-image workflows.
- Consistent Composition: Maintains subject placement and overall structure across variations.
- Style Flexibility: Handles both realistic and stylised outputs within a single model.
- Design-Oriented Output: Well suited for iterative creative workflows where predictability is important.
Technical Specifications
- Model Name: Seedream 4.5
- Model Type: Image generation and image editing
- Input: Text prompt with optional input image
- Editing Capabilities: Image-to-image transformations and targeted refinements
- Prompt Handling: Optimised for layered, descriptive prompts
- Provider: ByteDance
How to Use
- Write a prompt describing the scene, layout, and style in a clear, structured way.
- Optionally include an input image if you want to modify or refine existing visuals.
- Submit the request using the Seedream 4.5 model.
- Review the result and iterate by adjusting individual parts of the prompt.
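The steps above can be sketched as building and submitting a single request payload. The endpoint URL, header names, and JSON fields below are assumptions for illustration only; the authoritative schema is in the Runware documentation linked further down.

```python
import json
import urllib.request

API_URL = "https://api.runware.ai/v1"  # assumed endpoint for illustration
API_KEY = "YOUR_API_KEY"               # hypothetical placeholder

def build_generation_task(prompt: str) -> dict:
    """Assemble a hypothetical text-to-image task for Seedream 4.5.

    Field names are illustrative placeholders, not confirmed API values.
    """
    return {
        "taskType": "imageInference",       # placeholder task type
        "model": "bytedance:seedream-4.5",  # placeholder model identifier
        "positivePrompt": prompt,
        "width": 1024,
        "height": 1024,
    }

def submit(task: dict) -> bytes:
    """POST the task as JSON and return the raw response body."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps([task]).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Iterating then means adjusting one part of the prompt string at a time and resubmitting, rather than rewriting the whole request.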
Example Prompts
Structured scene composition
Create a modern café interior with large windows on the left, a long wooden counter running horizontally across the centre, and seating arranged along the right side. Use soft natural daylight, neutral colours, and a calm, minimal atmosphere. Keep the background uncluttered and the composition balanced.
Product-focused visual
Generate a clean product image of a wireless speaker placed on a simple pedestal. Centre the product, use soft studio lighting, and keep the background minimal with a subtle gradient. Emphasise material texture and clean edges without dramatic shadows.
Illustrative style
Illustrate a small urban park from a slightly elevated viewpoint, with simplified buildings in the background and stylised trees in the foreground. Use flat colours, gentle shading, and a clear visual hierarchy that separates foreground, midground, and background.
Image editing / refinement
Using the provided image, replace the background with a light neutral studio backdrop. Keep the subject unchanged, maintain original proportions, and adjust lighting so the subject remains evenly lit and visually consistent with the new background.
Layout-driven prompt
Design a square-format image with a large central illustration, leaving clear empty space at the top for a headline. Use a restrained colour palette and ensure the main subject does not overlap the text area.
Tips for Better Results
- Break prompts into logical parts: scene, layout, style, and constraints.
- Be explicit about what should stay the same when editing an image.
- Describe spatial relationships rather than relying on abstract style words.
- Add detail gradually to maintain control over the output.
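The first tip, splitting a prompt into scene, layout, style, and constraints, can be mechanised with a small helper so each part can be adjusted independently between iterations. This is a generic sketch, nothing model-specific:

```python
def layered_prompt(scene: str, layout: str, style: str, constraints: str) -> str:
    """Join the four prompt layers into one structured prompt,
    skipping any layer left empty and normalising full stops."""
    parts = [scene, layout, style, constraints]
    return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())

# Each layer can now be tweaked on its own between iterations.
prompt = layered_prompt(
    scene="A modern café interior with large windows on the left",
    layout="A long wooden counter across the centre, seating on the right",
    style="Soft natural daylight, neutral colours, a calm minimal atmosphere",
    constraints="Keep the background uncluttered and the composition balanced",
)
```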
Notes & Limitations
- Highly complex edits may require multiple iterations.
- Very subtle changes benefit from precise, focused instructions.
- Output quality depends on how clearly layout and intent are described.
Documentation
You can find full usage details, parameters, and examples here: https://runware.ai/docs/en/providers/bytedance
More models from ByteDance
Seedance 2.0 is a unified multimodal audio-video generation model from ByteDance that accepts text, image, audio, and video inputs in combination, supporting up to 9 images, 3 video clips, and 3 audio clips as reference. It generates multi-shot videos up to 15 seconds with dual-channel synchronized audio including dialogue, ambient sound, and effects. It features physics-aware motion, improved controllability for video extension and editing, and strong instruction following for complex scene composition.
Seedance 2.0 Fast is a speed-optimized variant of ByteDance's unified multimodal audio-video generation model. It accepts text, image, audio, and video inputs in combination, like Seedance 2.0, but targets shorter wall-clock times and higher throughput for iterative workflows. It produces multi-shot videos with dual-channel synchronized audio including dialogue, ambient sound, and effects, with physics-aware motion and editing controls, while prioritizing responsiveness over the last increment of visual refinement so teams can preview and ship ideas faster.
Seedream 5.0 Lite is an advanced image generation model from ByteDance that produces high-quality still images from text prompts while providing flexibility for editing workflows. It is designed to combine expressive creativity with precise control over layout, composition, styles, and details, interpreting nuanced instructions faithfully. Users can incorporate a single reference image to guide generation or editing. Integrated search and reasoning features let the model visualize real-time trends and domain information in the output.
Seedance 1.5 Pro is a next-generation AI video model from BytePlus that generates cinematic videos with native synchronized audio directly from text or image inputs. It offers precise audio-visual timing, strong motion coherence, expressive camera control, and advanced narrative prompt handling for short video creation.
ByteDance Video Upscaler boosts video resolution to 1080p, 2K, or 4K with advanced denoising and motion enhancement. It restores color, reduces compression artifacts, and improves clarity for legacy films, UGC clips, and short narrative content through a simple API.
Seedance 1.0 Pro Fast accelerates the core Seedance pipeline for expressive dance and performance clips. It turns text prompts or reference images into smooth, cinematic motion with strong temporal consistency. Ideal for rapid iteration in creative tools and production workflows.
Seedream 4.0 is ByteDance's multimodal image model for fast 2K to 4K generation. It supports text prompts, image editing with natural language, and multi-image reference. It maintains style consistency across batches and handles bilingual Chinese and English workflows.
OmniHuman-1.5 generates high-fidelity avatar video from a single image with audio and optional text prompts. It fuses multimodal reasoning with diffusion motion to keep identity stable, lip sync accurate, and gestures context-aware for long, multi-subject clips.
Seedance 1.0 Pro is a ByteDance video model for 5 to 10 second clips at up to 1080p. It supports text prompts and image first frames. It delivers smooth motion with strong temporal consistency. Ideal for multi-shot storytelling, ads, and design previews in real-time pipelines.
OmniHuman-1 is a ByteDance research model for human video generation from a single image and motion signals like audio. It focuses on accurate lip sync, expressive motion, and strong generalization across portraits, full body shots, cartoons, and stylized avatars.