Vidu Q2 Turbo
Faster Vidu Q2 video generation with advanced motion control

Save on average 58% vs the market
Commercial use
More models from Vidu
Vidu Q3 Turbo is a speed-optimized multimodal video generation model that produces short video clips with synchronized audio directly from text or images. It prioritizes fast inference and responsive iteration while preserving stable motion, coherent composition, and reliable audio alignment, making it suitable for rapid prototyping and production workflows where latency is critical.
Vidu Q3 is a multimodal video generation model that creates video with synchronized audio directly from text or images, supports intelligent multi-shot sequencing, and produces complete outputs with stable visuals and embedded subtitles without post-processing.
Vidu Q2 Pro is a high-fidelity video generation model for cinematic storytelling. It supports text prompts, image inputs, and multi-reference control for long-form scenes. It targets developers who need controllable motion, stable characters, and smooth camera work for complex shots.
Vidu Q1 Classic generates 1080p clips of up to 16 seconds from text prompts, source images, or reference shots. It targets controllable motion and stable scenes for fast prototyping. It is ideal for teams that need cinematic tests without complex video pipelines.
Vidu Q1 is a generative video model that preserves visual fidelity from multiple reference images. It supports character, scene, and prop control with smooth transitions and 1080p clips. It is ideal for ads, story sequences, and animation workflows that need tight visual continuity.
Vidu Q1 (image) is a reference-to-image model designed for high visual fidelity. It blends multiple input images while keeping identity and style consistent. Prompts can guide composition and layout without losing coherence. The model supports flexible aspect ratios for ads, social content, storyboards, or animation assets. It produces clean visuals with minimal effort and is useful for rapid creative workflows.
Vidu 2.0 is a generative video model for rapid 1080p clip creation. It targets 4-second and 8-second shots with strong subject consistency and support for batch workflows. Developers can drive cinematic clips from text prompts and templates with improved speed and lower cost.
Vidu 1.5 is a multimodal text-to-video model that focuses on multi-entity consistency across complex scenes. It keeps multiple characters and objects visually stable across frames and shots. Developers can build long-form video workflows that need coherent motion and style control.







