
GLM-5.1
Flagship agentic coding model with 200K context, deep thinking, and long-horizon task execution
GLM-5.1 Overview
GLM-5.1 is Z.ai’s flagship language model for agentic engineering, coding, reasoning, and tool-driven workflows. It supports a 200K-token context window with up to 128K output tokens, along with deep thinking, function calling, structured output, and streaming tool calls, and it is designed to stay effective over long multi-step sessions rather than only short-horizon tasks.
How to Use GLM-5.1
Overview
GLM-5.1 is a large language model built for coding, reasoning, and long-horizon agent workflows. It is positioned as the flagship text model in the current Z.ai family and is designed for sustained multi-step execution rather than only short interactive turns.
This makes it a strong fit for assistants that need to plan, call tools, revise strategies, and keep making progress across longer sessions.
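Since the model is exposed through a chat-style API, a first call typically looks like the minimal sketch below. It assumes an OpenAI-compatible chat completions endpoint; the base URL, model identifier, and API-key environment variable are placeholders to check against the Z.ai documentation rather than confirmed values.

```python
# Minimal sketch of calling GLM-5.1 through an OpenAI-compatible chat
# completions endpoint. The base URL, model identifier, and environment
# variable name are placeholders; verify them against the provider docs.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["ZAI_API_KEY"],       # placeholder env var name
    base_url="https://api.example.com/v1",   # replace with the provider's base URL
)

response = client.chat.completions.create(
    model="glm-5.1",                          # placeholder model identifier
    messages=[
        {"role": "system", "content": "You are a careful coding assistant."},
        {"role": "user", "content": "Refactor this function to remove the nested loops."},
    ],
)
print(response.choices[0].message.content)
```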
Capabilities
Agentic Coding and Engineering Workflows
GLM-5.1 is designed for code generation, debugging, repository analysis, refactoring support, and broader engineering tasks that require iterative tool use and multi-step execution.
Long-Horizon Task Execution
A core focus of GLM-5.1 is maintaining effectiveness over extended runs. It is intended for workflows where the model needs to continue planning, executing, checking results, and revising its approach across many turns rather than plateauing after an initial pass.
Long Context and Large Outputs
The model supports up to a 200K-token context window and up to 128K output tokens. This makes it suitable for long documents, large codebases, multi-file reasoning, and tasks that need both large inputs and substantial generated outputs.
Deep Thinking
GLM-5.1 supports a deep thinking mode, which is enabled by default. This is useful for complex coding, architecture, and reasoning tasks where extra deliberation improves output quality.
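Because deep thinking is on by default, latency-sensitive calls may want to turn it off per request. The sketch below continues from the client set up in the earlier example; the `thinking` request field and its accepted values are assumptions, not confirmed API details, so check the API reference for the exact parameter.

```python
# Hypothetical sketch of disabling deep thinking for a quick, latency-sensitive
# call. The "thinking" field passed via extra_body is an assumed parameter name
# and shape; the model card only states that deep thinking exists and is on by
# default. Reuses the `client` created in the earlier sketch.
response = client.chat.completions.create(
    model="glm-5.1",                                 # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize this changelog in three bullets."}],
    extra_body={"thinking": {"type": "disabled"}},   # assumed parameter name and values
)
print(response.choices[0].message.content)
```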
Function Calling and Tool Streaming
The model supports function calling and can stream tool call arguments during execution. This is useful for agent systems that need real-time tool invocation, progressive tool parameter construction, and tighter orchestration around external tools.
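The pattern below sketches how an agent loop might consume streamed tool call arguments, again using the OpenAI-compatible client from the earlier sketch. The `get_weather` tool is purely illustrative; the point is accumulating argument deltas as they arrive rather than waiting for the full response.

```python
# Sketch of streaming a tool call and assembling its arguments incrementally.
# The weather tool is illustrative only; the accumulation pattern is the point.
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",                       # illustrative tool
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

stream = client.chat.completions.create(
    model="glm-5.1",                                 # placeholder model identifier
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
    stream=True,
)

tool_name, tool_args = None, ""
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.tool_calls:
        call = delta.tool_calls[0]
        if call.function.name:
            tool_name = call.function.name
        if call.function.arguments:
            tool_args += call.function.arguments     # arguments arrive as partial strings

if tool_name:
    print(tool_name, json.loads(tool_args))
```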
Structured Output
GLM-5.1 supports structured outputs such as JSON, which makes it easier to integrate into application logic, workflows, and downstream automation.
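A minimal sketch of requesting JSON output is shown below, assuming the endpoint accepts an OpenAI-style `response_format` parameter; whether GLM-5.1 also supports stricter JSON-schema constraints should be verified in the API reference.

```python
# Sketch of requesting JSON output, assuming an OpenAI-style response_format
# parameter is honored. Reuses the `client` from the earlier sketch.
import json

response = client.chat.completions.create(
    model="glm-5.1",                                 # placeholder model identifier
    messages=[{
        "role": "user",
        "content": "Extract the package name and version from: 'requests 2.32.3 released'. "
                   "Reply as JSON with keys 'name' and 'version'.",
    }],
    response_format={"type": "json_object"},         # assumed to be supported
)
record = json.loads(response.choices[0].message.content)
print(record["name"], record["version"])
```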
Input and Output
- AIR ID: zai:[email protected]
- Input: text
- Output: text
- Context length: 200K tokens
- Maximum output: 128K tokens
- License: MIT
Typical Use Cases
- Agentic coding assistants
- Long-running tool-using workflows
- Repository and architecture analysis
- Structured generation for automation systems
- Complex reasoning over long inputs
More models from Z.ai
GLM-Image is an open-source image generation model that combines an autoregressive image-token generator with a diffusion decoder to produce high-fidelity results with strong prompt adherence. It is especially strong at accurate text rendering inside images and knowledge-intensive compositions, and it supports image-to-image generation for instruction-driven edits within a single unified model.
GLM-4.7 is a 358 billion parameter Mixture-of-Experts language model from Z.ai optimized for agentic coding, complex reasoning, and long-horizon tasks. It features interleaved thinking, preserved thinking for multi-turn consistency, and turn-level thinking control. It supports a 200K token context window with 128K max output, tool calling, and achieves 73.8% on SWE-bench Verified.

