Qwen3.5-397B

by Alibaba

Model ID: alibaba:qwen@3.5-397b
Status: coming soon

Qwen3.5-397B is a frontier Qwen large language model for reasoning, coding, search, and agent workflows. The underlying open-weight flagship uses a sparse MoE design with 397B total parameters and 17B activated per token, supports a native 262K-token context window extensible to about 1M tokens, and is designed for high-throughput long-context inference.

API Options

Platform-level options for task execution and delivery.

taskType

string required value: textInference

Identifier for the type of task being performed

taskUUID

string required UUID v4

UUID v4 identifier for tracking tasks and matching async responses. Must be unique per task.

webhookURL

string URI

Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.

deliveryMethod

string default: sync

Determines how the API delivers task results.

Allowed values:
sync: Returns complete results directly in the API response.
async: Returns an immediate acknowledgment with the task UUID. Poll for results using getResponse.
stream: Streams results token-by-token as they are generated.
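With async delivery, the acknowledged taskUUID is what ties a later poll to the original task. A sketch of the two task objects involved in Python (the exact getResponse payload shape is an assumption based on the task-type name mentioned above, not something this reference specifies):

```python
import uuid

task_uuid = str(uuid.uuid4())

# 1. Submit the generation task with async delivery; the API
#    responds immediately with an acknowledgment.
submit_task = {
    "taskType": "textInference",
    "taskUUID": task_uuid,
    "deliveryMethod": "async",
}

# 2. Later, poll for the finished result with getResponse, keyed
#    by the same UUID. This payload shape is an assumption.
poll_task = {
    "taskType": "getResponse",
    "taskUUID": task_uuid,
}
```

Because results are matched by taskUUID, the same identifier must be carried from submission to every poll.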

includeCost

boolean default: false

Include task cost in the response.

includeUsage

boolean default: false

Include token usage statistics in the response.

numberResults

integer min: 1 max: 4 default: 1

Number of results to generate. Each result uses a different seed, producing variations of the same parameters.
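Taken together, the options above form the platform-level portion of a task object. A minimal sketch in Python (how the object is wrapped and transmitted is an assumption, not part of this reference):

```python
import uuid

# Platform-level options for a textInference task, using only
# fields documented above. Transport details are assumptions.
task = {
    "taskType": "textInference",      # required, fixed value
    "taskUUID": str(uuid.uuid4()),    # required, unique per task
    "deliveryMethod": "sync",         # default: results in the response
    "includeCost": True,              # attach task cost to the response
    "includeUsage": True,             # attach token usage statistics
    "numberResults": 2,               # 1-4; each result uses a different seed
}
```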

Inputs

Input resources for the task (images, videos, etc.). These must be nested inside the inputs object.

inputs » images

array of strings min items: 1

Array of image inputs (UUID, URL, Data URI, or Base64).

inputs » videos

array of strings min items: 1

Array of video inputs (UUID, URL, or Base64).
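As noted above, input resources sit inside a nested inputs object rather than at the top level of the task. A sketch (the image URL and the placeholder UUID are hypothetical, for illustration only):

```python
task = {
    "taskType": "textInference",
    "taskUUID": "a-valid-uuid-v4",   # placeholder; must be a real UUID v4
    "inputs": {
        # Each entry may be a UUID, URL, Data URI, or Base64 string.
        "images": ["https://example.com/diagram.png"],  # hypothetical URL
    },
}
```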

Generation Parameters

Core parameters for controlling the generated content.

model

string required value: alibaba:qwen@3.5-397b

Identifier of the model to use for generation.

seed

integer min: 0 max: 4294967295

Random seed for reproducible generation. When not provided, a random seed is generated in the unsigned 32-bit range.

messages

array of objects required min items: 1

Array of chat messages forming the conversation context.

Properties (2)
messages » role

string required

The role of the message author.

Allowed values: user, assistant
messages » content

string required min: 1

The text content of the message.
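A conversation is expressed as an ordered array of role/content objects, each requiring a role and non-empty content. A minimal multi-turn sketch (the message text and the placeholder UUID are illustrative):

```python
# The messages array must contain at least one item; each item
# needs a role and non-empty text content.
messages = [
    {"role": "user", "content": "Summarize the trade-offs of MoE models."},
    {"role": "assistant", "content": "Sparse MoE models activate only a subset of parameters per token."},
    {"role": "user", "content": "How does that affect inference cost?"},
]

task = {
    "taskType": "textInference",
    "taskUUID": "a-valid-uuid-v4",   # placeholder; must be a real UUID v4
    "messages": messages,
}
```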

Settings

Technical parameters to fine-tune the inference process. These must be nested inside the settings object.

settings » systemPrompt

string min: 1 max: 262144

System-level instruction that guides the model's behavior and output style across the entire generation.

settings » temperature

float min: 0 max: 2 step: 0.01 default: 0.6

Controls randomness in generation. Lower values produce more deterministic outputs, higher values increase variation and creativity.

settings » topP

float min: 0 max: 1 step: 0.01 default: 0.95

Nucleus sampling parameter that controls diversity by limiting the probability mass. Lower values make outputs more focused, higher values increase diversity.

settings » frequencyPenalty

float min: 0 max: 2 step: 0.01 default: 0

Penalizes tokens based on their frequency in the output so far. A value of 0.0 disables the penalty.

settings » maxTokens

integer min: 1 max: 65536

Maximum number of tokens to generate in the response.

settings » minP

float min: 0 max: 1 step: 0.01 default: 0

Minimum probability threshold. Tokens with probability below this value are excluded from sampling.

settings » presencePenalty

float min: 0 max: 2 step: 0.01 default: 0

Penalizes tokens that have already appeared in the output, regardless of how often, encouraging the model to introduce new topics. A value of 0.0 disables the penalty.

settings » repetitionPenalty

float min: 0 max: 2 step: 0.01 default: 1

Penalizes tokens that have already appeared in the output. A value of 1.0 disables the penalty.

settings » stopSequences

array of strings min items: 1

Array of sequences that will cause the model to stop generating further tokens when encountered.

settings » thinkingLevel

string default: high

Controls the depth of internal reasoning the model performs before generating a response.

Allowed values 2 values
settings » topK

integer min: 1 max: 100 default: 20

Top-K sampling parameter that limits the number of highest-probability tokens considered at each step.
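The parameters above nest under a settings object inside the task. A sketch combining several of them within their documented ranges (the values are illustrative, not recommendations, and the placeholder UUID is hypothetical):

```python
task = {
    "taskType": "textInference",
    "taskUUID": "a-valid-uuid-v4",   # placeholder; must be a real UUID v4
    "messages": [{"role": "user", "content": "Explain nucleus sampling."}],
    "settings": {
        "systemPrompt": "You are a concise technical assistant.",
        "temperature": 0.6,           # default; lower = more deterministic
        "topP": 0.95,                 # default nucleus-sampling mass
        "topK": 20,                   # default; 20 likeliest tokens per step
        "maxTokens": 1024,            # cap on response length
        "repetitionPenalty": 1.0,     # 1.0 disables the penalty
        "stopSequences": ["\n\nUser:"],  # illustrative stop marker
        "thinkingLevel": "high",      # default reasoning depth
    },
}
```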