Grok Imagine Image Pro

High-fidelity AI image generation and editing with improved prompt control

Text to Image, Image to Image, Edit

xAI

Grok Imagine Image Pro Overview

Grok Imagine Image Pro is the higher-quality variant of the Grok Imagine image model developed by xAI. It generates detailed images from text prompts and supports iterative editing of existing images through natural-language instructions. Compared with the base model, it provides stronger prompt adherence, improved rendering quality, and more reliable control over composition, style, and aspect ratio. It supports multiple image styles and resolutions up to 2K, enabling workflows for design, illustration, and creative prototyping.

From $0.0700 per image

Image generation: $0.07

Editing: $0.072
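
For budgeting, the listed prices can be turned into a rough per-batch estimate. The sketch below assumes flat pricing at the figures shown above, with editing billed per edited image and no volume discounts, taxes, or resolution-dependent surcharges; the function name `estimate_cost` is a hypothetical helper, not part of any xAI SDK.

```python
from decimal import Decimal

# Prices as listed on this page (USD). Assumed to be flat rates.
PRICE_GENERATION = Decimal("0.07")   # per generated image
PRICE_EDIT = Decimal("0.072")        # per edit (assumed: per edited image)


def estimate_cost(generations: int, edits: int = 0) -> Decimal:
    """Estimate the USD cost of a batch of generations and edits.

    Hypothetical helper for illustration only; actual billing may differ.
    """
    if generations < 0 or edits < 0:
        raise ValueError("counts must be non-negative")
    return generations * PRICE_GENERATION + edits * PRICE_EDIT


if __name__ == "__main__":
    # Example: 100 generated images, each refined with 2 edits.
    total = estimate_cost(generations=100, edits=200)
    print(f"Estimated cost: ${total:.2f}")
```

`Decimal` is used instead of `float` so that fractional-cent prices such as $0.072 accumulate without binary rounding error.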

Commercial use

More models from xAI

Grok Imagine Image Quality

Grok Imagine Image Quality is xAI's quality-focused image generation and editing model. It is designed for higher realism, stronger multilingual text rendering, tighter prompt following, deeper scene understanding, and more consistent brand-oriented output across both text-to-image and image editing workflows.

Coming Soon

Grok 4.3

Grok 4.3 is xAI's flagship language model for agentic reasoning, strong instruction following, and minimal hallucinations. It supports text and image input, a 1 million token context window, configurable reasoning effort including non-reasoning mode, function calling, and structured outputs for production assistants, coding workflows, and long-context analysis.

xAI Text-to-Speech

xAI Text-to-Speech converts text into natural-sounding spoken audio with a single API call. It offers five expressive voices (Eve, Ara, Leo, Rex, and Sal), inline speech tags for fine-grained control over pauses, laughter, whispers, and emphasis, and supports over 20 auto-detected languages.

Grok Imagine Image

Grok Imagine Image is a multimodal generative image model that creates high-quality still images from text prompts or image inputs. It supports flexible visual synthesis across a range of styles, enabling developers to generate creative imagery directly from structured prompts or to expand on existing visuals with coherent, detailed outputs.

Grok Imagine Video

Grok Imagine Video is a multimodal generative video model that produces short video clips with native audio from text descriptions or static images. It supports text-to-video and image-to-video generation with synchronized sound effects and dialogue, enabling developers to animate scenes with motion, camera dynamics, and audio in a single API workflow.