lipsync-2
Zero-shot audio-driven lip synchronization for video

lipsync-2 is a zero-shot lip synchronization model that aligns spoken audio to existing video while preserving the speaker’s identity and natural speaking style. It works across live-action, animation, and AI-generated footage without training or fine-tuning.
Commercial use
$0.0440 per second of audio
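As a rough illustration of the per-second rate above, the following Python sketch estimates the cost of a clip from its audio duration. It assumes billing is strictly linear in audio length, with no minimum charge, rounding rule, or volume discount; none of those details are stated here, and the function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Rate quoted in this listing: USD per second of audio.
PRICE_PER_SECOND_USD = Decimal("0.0440")


def estimate_cost_usd(audio_seconds):
    """Estimate the cost in USD for a clip with the given audio length.

    Assumption: cost scales linearly with audio duration, with no
    minimum charge or rounding (not confirmed by the listing).
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    # Convert via str() so float inputs such as 12.5 stay exact.
    return PRICE_PER_SECOND_USD * Decimal(str(audio_seconds))


if __name__ == "__main__":
    print(estimate_cost_usd(60))  # one minute of audio -> 2.6400
```

Under these assumptions, one minute of audio comes to about $2.64 and a ten-minute clip to about $26.40.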
video-to-video, audio-to-video, text-to-audio, audio-to-audio