
Grok Imagine Video 1.5 Preview
Higher-tier Grok image-to-video generation from a single starting frame, with longer durations and stronger output quality
Grok Imagine Video 1.5 Preview Overview
Grok Imagine Video 1.5 Preview is xAI's newer preview image-to-video model. It is positioned above the earlier Grok Imagine Video release, with higher per-second pricing, and supports durations up to 15 seconds. It generates 480p or 720p video from a single still-image starting frame, for cinematic clips, animated visuals, and prompt-guided short-form video creation.
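The limits stated in the overview (one starting image, at most 15 seconds, 480p or 720p output, per-second pricing) can be checked on the client before a job is submitted. The following is a minimal sketch only: the class `ImageToVideoRequest`, its field names, the helper `estimate_cost`, and the requirement that the duration be at least 1 second are all assumptions made for illustration and do not reflect xAI's actual API or price list.

```python
from dataclasses import dataclass

# Limits taken from the overview text.
MAX_DURATION_SECONDS = 15
ALLOWED_RESOLUTIONS = ("480p", "720p")


@dataclass(frozen=True)
class ImageToVideoRequest:
    """Hypothetical request shape; field names are illustrative, not xAI's API."""

    image_url: str          # the single still-image starting frame
    prompt: str             # text guidance for the motion
    duration_seconds: int
    resolution: str = "720p"

    def __post_init__(self) -> None:
        if not self.image_url:
            raise ValueError("a starting frame (image_url) is required")
        # Lower bound of 1 second is an assumption; the source only states the maximum.
        if not 1 <= self.duration_seconds <= MAX_DURATION_SECONDS:
            raise ValueError(
                f"duration_seconds must be between 1 and {MAX_DURATION_SECONDS}"
            )
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")


def estimate_cost(request: ImageToVideoRequest, price_per_second: float) -> float:
    """Per-second pricing: cost scales linearly with clip length.

    price_per_second must be supplied by the caller; no real rate is assumed here.
    """
    return request.duration_seconds * price_per_second
```

A caller would construct the request, let validation reject out-of-range values early, and pass the current published rate to `estimate_cost` to budget a clip before generating it.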
Commercial use
More models from xAI
Grok Imagine Image Quality is xAI's quality-focused image generation and editing model. It is designed for higher realism, stronger multilingual text rendering, tighter prompt following, deeper scene understanding, and more consistent brand-oriented output across both text-to-image and image editing workflows.
Grok 4.3 is xAI's flagship language model for agentic reasoning, strong instruction following, and minimal hallucinations. It supports text and image input, a 1 million token context window, configurable reasoning effort including non-reasoning mode, function calling, and structured outputs for production assistants, coding workflows, and long-context analysis.
xAI Text-to-Speech converts text into natural-sounding spoken audio with a single API call. It offers five expressive voices (Eve, Ara, Leo, Rex, and Sal), inline speech tags for fine-grained control over pauses, laughter, whispers, and emphasis, and supports over 20 auto-detected languages.
Grok Imagine Image is a multimodal generative image model that creates high-quality still images from text prompts or image inputs. It supports flexible visual synthesis across a range of styles, enabling developers to generate creative imagery directly from structured prompts or to expand on existing visuals with coherent, detailed outputs.
Grok Imagine Video is a multimodal generative video model that produces short video clips with native audio from text descriptions or static images. It supports text-to-video and image-to-video generation with synchronized sound effects and dialogue, enabling developers to animate scenes with motion, camera dynamics, and audio in a single API workflow.




