lipsync-2

Zero-shot audio-driven lip synchronization for video

lipsync-2 is a zero-shot lip synchronization model that aligns spoken audio to existing video while preserving the speaker’s identity and natural speaking style. It works across live-action, animation, and AI-generated footage without training or fine-tuning.

Commercial use

$0.0440 per second of audio
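
As a rough illustration of the per-second pricing, the sketch below estimates the cost of a job from its audio duration. It assumes billing scales linearly with audio length, with no minimum charge and no rounding up to whole seconds; the page does not confirm either point. The helper name `estimate_cost` is hypothetical and not part of any official SDK.

```python
from decimal import Decimal

# Listed rate: $0.0440 per second of audio.
RATE_PER_SECOND = Decimal("0.0440")


def estimate_cost(audio_seconds):
    """Estimate the cost in USD for a given audio duration in seconds.

    Hypothetical helper. Assumes strictly linear billing with no minimum
    charge and no rounding of the duration, which is an assumption.
    """
    seconds = Decimal(str(audio_seconds))
    if seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    # Keep four decimal places, matching the precision of the listed rate.
    return (seconds * RATE_PER_SECOND).quantize(Decimal("0.0001"))


print(estimate_cost(60))  # 2.6400
```

Under these assumptions, one minute of audio comes to about $2.64. `Decimal` is used instead of floats so the result stays exact at the rate's four-decimal precision.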
Video To Video, Audio To Video