Z.ai

GLM-Image

Hybrid autoregressive + diffusion image model with strong text rendering

Text to Image | Image to Image | Edit

GLM-Image Overview

GLM-Image is an open-source image generation model that combines an autoregressive image-token generator with a diffusion decoder to produce high-fidelity results with strong prompt adherence. It is especially strong at accurate text rendering inside images and knowledge-intensive compositions, and it supports image-to-image generation for instruction-driven edits within a single unified model.

From $0.0225 / image

Save on average 55% vs the market

1024x1024: $0.0225

Commercial use

More models from Z.ai

GLM-5.1

GLM-5.1 is Z.ai’s flagship language model for agentic engineering, coding, reasoning, and tool-driven workflows. It supports a 200K-token context window with up to 128K output tokens, deep thinking, function calling, structured output, and streaming tool calls, and is designed to stay effective over long multi-step sessions rather than only short-horizon tasks.

GLM-4.7

Coming Soon

GLM-4.7 is a 358-billion-parameter Mixture-of-Experts language model from Z.ai optimized for agentic coding, complex reasoning, and long-horizon tasks. It features interleaved thinking, preserved thinking for multi-turn consistency, and turn-level thinking control. It supports tool calling and a 200K-token context window with a 128K-token maximum output, and it achieves 73.8% on SWE-bench Verified.