HeyGen

AI avatar video generation with expressive speech and lip sync

HeyGen develops avatar-driven video generation models that produce photorealistic digital presenters with synchronized lip movements, natural facial expressions, and multilingual speech. Its models support text-to-speech with adjustable speed, pitch, and expressiveness across hundreds of voices and languages, as well as prompt-based video creation through the HeyGen Video Agent. As a Runware provider, HeyGen makes its models available for avatar video and agentic video generation through a single inference pipeline alongside other creators.

Models by HeyGen

HeyGen Avatar V

HeyGen Avatar V is an avatar video generation model for talking digital twins and other eligible registered avatar looks. It improves identity preservation, lip sync accuracy, facial expressiveness, and motion coherence across angle changes, scene changes, and long-form videos, making it well suited to presenter, training, and localization workflows where avatar stability matters.

HeyGen Video Agent

HeyGen Video Agent is an AI video production model that generates complete, multi-scene videos from a single text prompt. It automates the full production pipeline (scriptwriting, avatar selection, shot planning, B-roll integration, motion graphics, captions, and editing), producing broadcast-ready videos with consistent branding. The agent supports customizable avatars, voice cloning, and iterative editing without full regeneration, enabling scalable video content creation for marketing, training, and social media.

HeyGen Avatar IV

HeyGen Avatar IV is a photorealistic AI avatar generation model that creates talking videos from a single image and a script or audio input. The model synchronizes voice with facial motion, expressions, and gestures to produce lifelike avatar performances. It supports multilingual speech, realistic lip synchronization, and expressive body language, enabling scalable production of presenter-style videos without cameras, actors, or studio setups.