GLM-5.1
GLM-5.1 is Z.ai’s flagship language model for agentic engineering, coding, reasoning, and tool-driven workflows. It supports a 200K token context window with up to 128K output tokens, deep thinking, function calling, structured output, and streaming tool calls. The model is designed to stay effective over long multi-step sessions rather than only short-horizon tasks.
API Options
Platform-level options for task execution and delivery.
-
taskType
string required value: textInference -
Identifier for the type of task being performed.
-
taskUUID
string required UUID v4 -
UUID v4 identifier for tracking tasks and matching async responses. Must be unique per task.
-
webhookURL
string URI -
Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.
Learn more 1 resource
- Webhooks PLATFORM
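Because each batch item arrives in its own webhook call, a receiver typically correlates deliveries with in-flight requests via taskUUID. A minimal sketch — the payload fields other than taskUUID are assumptions:

```python
# Sketch: routing webhook deliveries back to pending tasks by taskUUID.
# Only taskUUID comes from the reference; other payload fields are illustrative.

pending = {}  # taskUUID -> local context for the request we sent

def register_task(task_uuid, context):
    """Remember a task before submitting it, keyed by its unique UUID v4."""
    pending[task_uuid] = context

def handle_webhook(payload):
    """Match one webhook POST body to its pending task; None if unknown."""
    task_uuid = payload.get("taskUUID")
    if task_uuid not in pending:
        return None  # unknown or already-delivered task
    return {"context": pending.pop(task_uuid), "result": payload}

register_task("a1b2c3d4-0000-4000-8000-000000000001", "summarize ticket")
matched = handle_webhook({
    "taskUUID": "a1b2c3d4-0000-4000-8000-000000000001",
    "text": "(generated text)",
})
```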
-
deliveryMethod
string default: sync -
Determines how the API delivers task results.
Allowed values 3 values
- sync: Returns complete results directly in the API response.
- async: Returns an immediate acknowledgment with the task UUID. Poll for results using getResponse.
- stream: Streams results token-by-token as they are generated.
Learn more 1 resource
- Task Polling PLATFORM
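A client usually fixes deliveryMethod when assembling the task payload. A minimal sketch, assuming the value names sync/async/stream implied by the descriptions above:

```python
import uuid

def build_task(messages, delivery_method="sync"):
    """Assemble a textInference task; taskUUID must be unique per task."""
    assert delivery_method in ("sync", "async", "stream")  # assumed value names
    return {
        "taskType": "textInference",
        "taskUUID": str(uuid.uuid4()),  # fresh UUID v4 per task
        "deliveryMethod": delivery_method,
        "model": "zai:glm@5.1",
        "messages": messages,
    }

task = build_task([{"role": "user", "content": "Hello"}], delivery_method="async")
```

With async delivery the acknowledgment carries only the taskUUID; the client then polls getResponse for the finished result.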
-
includeCost
boolean default: false -
Include task cost in the response.
-
includeUsage
boolean default: false -
Include token usage statistics in the response.
-
numberResults
integer min: 1 max: 4 default: 1 -
Number of results to generate. Each result uses a different seed, producing variations of the same parameters.
Generation Parameters
Core parameters for controlling the generated content.
-
model
string required value: zai:glm@5.1 -
Identifier of the model to use for generation.
-
seed
integer min: 0 max: 9223372036854775807 -
Random seed for reproducible generation. When not provided, a random seed is generated in the unsigned 32-bit range.
-
messages
array of objects required min items: 1 -
Array of chat messages forming the conversation context.
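Putting the required generation parameters together, a minimal request might look as follows (the user/assistant role convention is an assumption here, as is the example UUID):

```python
# Minimal textInference request with the required generation inputs.
messages = [
    {"role": "user", "content": "Explain what a context window is."},
    {"role": "assistant", "content": "It is the span of tokens the model can attend to."},
    {"role": "user", "content": "How large is GLM-5.1's?"},
]

request = {
    "taskType": "textInference",
    "taskUUID": "3f2a9c1e-5b7d-4e8f-9a0b-1c2d3e4f5a6b",  # UUID v4, unique per task
    "model": "zai:glm@5.1",
    "messages": messages,  # min items: 1
}
```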
Settings
Technical parameters to fine-tune the inference process. These must be nested inside the settings object.
settings
object -
settings » systemPrompt
string min: 1 max: 200000 -
System-level instruction that guides the model's behavior and output style across the entire generation.
-
settings » temperature
float min: 0 max: 1 step: 0.01 default: 1 -
Controls randomness in generation. Lower values produce more deterministic outputs, higher values increase variation and creativity.
-
settings » topP
float min: 0.01 max: 1 step: 0.01 -
Nucleus sampling parameter that controls diversity by limiting the probability mass. Lower values make outputs more focused, higher values increase diversity.
-
settings » frequencyPenalty
float min: -2 max: 2 step: 0.01 -
Penalizes tokens based on their frequency in the output so far. A value of 0.0 disables the penalty.
-
settings » maxTokens
integer min: 1 max: 131072 default: 4096 -
Maximum number of tokens to generate in the response.
-
settings » presencePenalty
float min: -2 max: 2 step: 0.01 -
Encourages the model to introduce new topics. A value of 0.0 disables the penalty.
-
settings » stopSequences
array of strings min: 1 max: 50 max items: 5 -
Array of sequences that will cause the model to stop generating further tokens when encountered.
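The sampling fields above nest under the settings object. An illustrative combination — every value is a choice, not a default; ranges follow the reference:

```python
# Illustrative settings object for a single textInference task.
settings = {
    "systemPrompt": "You are a concise technical assistant.",
    "temperature": 0.7,       # 0 to 1; lower is more deterministic
    "topP": 0.9,              # nucleus sampling, 0.01 to 1
    "frequencyPenalty": 0.2,  # -2 to 2; 0.0 disables
    "presencePenalty": 0.0,   # -2 to 2; 0.0 disables
    "maxTokens": 1024,        # up to 131072
    "stopSequences": ["</answer>"],  # at most 5 sequences
}
```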
-
settings » thinkingLevel
string default: none -
Controls the depth of internal reasoning the model performs before generating a response.
Allowed values 4 values
-
settings » toolChoice
object -
Controls how the model selects which tool to call.
Properties 2 properties
-
settings » toolChoice » type
string required -
Tool selection strategy.
Allowed values 4 values
- Model decides whether to call a tool.
- Model must call at least one tool.
- Model must call the specified tool.
- Model must not call any tool.
-
settings » toolChoice » name
string -
Name of the tool to call.
-
-
settings » tools
array of objects -
Function definitions available for the model to call. Each tool is a JSON Schema object describing the function signature.
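A tool entry pairs a function name with a JSON Schema for its arguments. A sketch — the wrapper field names (name, description, parameters) follow common function-calling conventions and are assumptions here, as is the "auto" strategy value:

```python
# Hypothetical tool definition: a weather lookup described with JSON Schema.
get_weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

tool_settings = {
    "tools": [get_weather_tool],
    "toolChoice": {"type": "auto"},  # assumed name for "model decides"
}
```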
-
settings » topK
integer min: 1 max: 100647 -
Top-K sampling parameter that limits the number of highest-probability tokens considered at each step.
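Taken together, a full request combines platform-level options, generation parameters, and the nested settings object. A sketch with illustrative values, assuming a JSON request body:

```python
import json
import uuid

request = {
    # Platform-level options
    "taskType": "textInference",
    "taskUUID": str(uuid.uuid4()),  # unique per task
    "includeCost": True,
    "includeUsage": True,
    # Generation parameters
    "model": "zai:glm@5.1",
    "messages": [{"role": "user", "content": "Write a haiku about compilers."}],
    # Inference settings, nested as the reference requires
    "settings": {
        "temperature": 0.8,
        "topP": 0.95,
        "topK": 40,
        "maxTokens": 256,
        "thinkingLevel": "none",  # the documented default
    },
}

body = json.dumps(request)  # what actually goes over the wire
```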