Wan2.5-Preview
Wan2.5-Preview AI Text to Video with Native Audio

Wan2.5-Preview is Alibaba’s multimodal video model, currently in research preview. It supports text-to-video and image-to-video generation with native audio for clips of around 10 seconds, and offers strong prompt adherence, smooth motion, and multilingual audio for narrative scenes.
Commercial use
Generation is billed per second: $0.0946/s at 720p, or $0.1476/s at 1080p.
720p · 5s: $0.473
1080p · 5s: $0.738
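
The 5-second prices above are the per-second rate multiplied by the clip length. A minimal sketch of that arithmetic follows, assuming billing scales linearly with duration (the page only states the per-second rates and the two 5-second totals). The `estimate_cost` helper and the `RATE_PER_SECOND` table are hypothetical names for illustration, not part of any official API.

```python
from decimal import Decimal

# Per-second rates in USD, copied from the pricing listed on this page.
RATE_PER_SECOND = {
    "720p": Decimal("0.0946"),
    "1080p": Decimal("0.1476"),
}


def estimate_cost(resolution: str, seconds: int) -> Decimal:
    """Estimate the USD cost of one clip (hypothetical helper).

    Assumes cost = rate * seconds, with no rounding or minimum charge.
    """
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return RATE_PER_SECOND[resolution] * seconds


if __name__ == "__main__":
    for res in ("720p", "1080p"):
        print(res, "5s:", estimate_cost(res, 5))
```

`Decimal` is used instead of `float` so the totals match the listed prices exactly rather than picking up binary rounding error.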
text-to-video · image-to-video · audio-to-video