Vidu Q3 Turbo

Low-latency multimodal video generation with native audio

Vidu Q3 Turbo is a speed-optimized multimodal video generation model that produces short video clips with synchronized audio directly from text or images. It prioritizes fast inference and responsive iteration while preserving stable motion, coherent composition, and reliable audio alignment, making it suitable for rapid prototyping and production workflows where latency is critical.

Vidu
Commercial use
Text to Video · Image to Video
Pricing starts at $0.026/s at 540p, $0.039/s at 720p, and $0.052/s at 1080p.

Average savings vs typical market rates

540p · 5s · Save ~35% · $0.13
720p · 5s · Save ~35% · $0.195
1080p · 5s · Save ~35% · $0.26
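
The 5-second prices above follow from the listed per-second rates multiplied by clip length. A minimal sketch of that arithmetic, assuming billing is strictly linear in duration with no minimum charge or rounding (the page does not state this); `RATE_PER_SECOND` and `clip_cost` are hypothetical names for illustration, not part of any Vidu API:

```python
from decimal import Decimal

# Per-second rates in USD, copied from the pricing line on this page.
RATE_PER_SECOND = {
    "540p": Decimal("0.026"),
    "720p": Decimal("0.039"),
    "1080p": Decimal("0.052"),
}


def clip_cost(resolution: str, seconds: int) -> Decimal:
    """Estimate the cost of one clip.

    Assumption: cost = rate * duration, with no minimum fee or rounding.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution}")
    return RATE_PER_SECOND[resolution] * seconds


if __name__ == "__main__":
    # Print the estimated cost of a 5-second clip at each resolution.
    for resolution in RATE_PER_SECOND:
        print(resolution, clip_cost(resolution, 5))
```

Decimal is used instead of float so that small per-second rates multiply without binary rounding noise.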