---
title: Gemma 4 31B | Runware Docs
url: https://runware.ai/docs/models/google-gemma-4-31b
description: Open 31B multimodal reasoning model for coding, long context, and agentic workflows
---
# Gemma 4 31B

Gemma 4 31B is Google's flagship dense open-weights model in the Gemma 4 family. It combines strong reasoning and coding performance, native function calling, multimodal understanding of text, image, and video, and a 256K context window in a 31B-parameter model designed for both local and cloud deployment.

- **ID**: `google-gemma-4-31b`
- **Status**: api-only
- **Creator**: Google
- **Release Date**: April 2, 2026
- **Capabilities**: Text to Text, Image to Text, Video to Text

## Pricing

- **Input tokens / 1M**: `$0.12`
- **Output tokens / 1M**: `$0.37`
- **Cached input / 1M**: `$0.012`
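As a rough illustration of how the per-million rates above combine, here is a minimal sketch of a cost estimate. The function name and the token counts are hypothetical, and it assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the regular input rate; the page does not spell out that rule, so use `includeCost` for authoritative figures.

```python
# Per-million-token prices in USD, copied from the Pricing list above.
INPUT_PER_M = 0.12
OUTPUT_PER_M = 0.37
CACHED_INPUT_PER_M = 0.012


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the part of input_tokens served from cache
    and is billed at the cached rate instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Hypothetical request: 200K input tokens (150K of them cached), 4K output tokens.
cost = estimate_cost(input_tokens=200_000, output_tokens=4_000, cached_tokens=150_000)
print(f"${cost:.6f}")
```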

## Request Parameters

**API Options**

Platform-level options for task execution and delivery.

### [taskType](https://runware.ai/docs/models/google-gemma-4-31b#request-tasktype)

- **Type**: `string`
- **Required**: true
- **Value**: `textInference`

Identifier for the type of task being performed.

### [taskUUID](https://runware.ai/docs/models/google-gemma-4-31b#request-taskuuid)

- **Type**: `string`
- **Required**: true
- **Format**: `UUID v4`

UUID v4 identifier for tracking tasks and matching async responses. Must be unique per task.
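One way to produce a fresh identifier per task is Python's standard `uuid` module. This is only a sketch; the helper name `new_task_uuid` is made up for illustration.

```python
import uuid


def new_task_uuid() -> str:
    """Return a random UUID v4 as a string, suitable for a taskUUID field."""
    return str(uuid.uuid4())


task_uuid = new_task_uuid()
print(task_uuid)
```

Generating a new value for every task, rather than reusing one, keeps async responses unambiguous.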

### [outputFormat](https://runware.ai/docs/models/google-gemma-4-31b#request-outputformat)

- **Type**: `string`
- **Default**: `TEXT`

Specifies the file format of the generated output. The available values depend on the task type and the specific model's capabilities.

**Allowed values**: `TEXT`

### [webhookURL](https://runware.ai/docs/models/google-gemma-4-31b#request-webhookurl)

- **Type**: `string`
- **Format**: `URI`

Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.

**Learn more** (1 resource):

- [Webhooks](https://runware.ai/docs/platform/webhooks) (platform)

### [deliveryMethod](https://runware.ai/docs/models/google-gemma-4-31b#request-deliverymethod)

- **Type**: `string`
- **Default**: `sync`

Determines how the API delivers task results.

**Allowed values**:

- `sync`: Returns complete results directly in the API response.
- `async`: Returns an immediate acknowledgment with the task UUID. Poll for results using getResponse.
- `stream`: Streams results token-by-token as they are generated.

**Learn more** (1 resource):

- [Task Polling](https://runware.ai/docs/platform/task-polling) (platform)
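For the `async` delivery method, the client has to keep asking for the result until it is ready. The sketch below shows only the generic polling loop. `fetch` is a hypothetical callable that would wrap a getResponse call and return `None` while the task is pending; the request shape of getResponse is not described on this page, so it is deliberately left out.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch: Callable[[], Optional[dict]],
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it returns a result or the attempts run out."""
    for _ in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        sleep(interval_s)
    raise TimeoutError(f"no result after {max_attempts} attempts")


# Stand-in for a real getResponse call: pending twice, then a result.
_replies = iter([None, None, {"taskType": "textInference", "text": "hello"}])
result = poll_until_done(lambda: next(_replies), interval_s=0.0)
print(result["text"])
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A production client would likely add backoff and error handling.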

### [includeCost](https://runware.ai/docs/models/google-gemma-4-31b#request-includecost)

- **Type**: `boolean`
- **Default**: `false`

Include task cost in the response.

### [includeUsage](https://runware.ai/docs/models/google-gemma-4-31b#request-includeusage)

- **Type**: `boolean`
- **Default**: `false`

Include token usage statistics in the response.

### [numberResults](https://runware.ai/docs/models/google-gemma-4-31b#request-numberresults)

- **Type**: `integer`
- **Min**: `1`
- **Max**: `4`
- **Default**: `1`

Number of results to generate. Each result uses a different seed, producing variations of the same parameters.

**Inputs**

Input resources for the task (images and videos). These must be nested inside the `inputs` object.

### [images](https://runware.ai/docs/models/google-gemma-4-31b#request-inputs-images)

- **Path**: `inputs.images`
- **Type**: `array of strings`

Array of image inputs (UUID, URL, Data URI, or Base64).

### [videos](https://runware.ai/docs/models/google-gemma-4-31b#request-inputs-videos)

- **Path**: `inputs.videos`
- **Type**: `array of strings`

Array of video inputs (UUID, URL, or Base64).
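The nesting under `inputs` can be sketched as follows. The helper `attach_inputs` and the example URL are hypothetical; only the `inputs.images` and `inputs.videos` paths come from this page.

```python
import json
from typing import Optional, Sequence


def attach_inputs(
    task: dict,
    images: Optional[Sequence[str]] = None,
    videos: Optional[Sequence[str]] = None,
) -> dict:
    """Return a copy of task with media nested under the inputs object."""
    inputs = {}
    if images:
        inputs["images"] = list(images)
    if videos:
        inputs["videos"] = list(videos)
    merged = dict(task)
    if inputs:
        merged["inputs"] = inputs
    return merged


base_task = {"taskType": "textInference", "model": "google-gemma-4-31b"}
# Placeholder URL, for illustration only.
task = attach_inputs(base_task, images=["https://example.com/cat.png"])
print(json.dumps(task, indent=2))
```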

**Core Parameters**

Primary parameters that define the task output.

### [model](https://runware.ai/docs/models/google-gemma-4-31b#request-model)

- **Type**: `string`
- **Required**: true
- **Value**: `google-gemma-4-31b`

Identifier of the model to use for generation.

### [seed](https://runware.ai/docs/models/google-gemma-4-31b#request-seed)

- **Type**: `integer`
- **Min**: `0`
- **Max**: `9223372036854776000`

Random seed for reproducible generation. When not provided, a random seed is generated in the unsigned 32-bit range.

### [messages](https://runware.ai/docs/models/google-gemma-4-31b#request-messages)

- **Path**: `messages`
- **Type**: `array of objects (2 properties)`
- **Required**: true

Array of chat messages forming the conversation context.

#### [role](https://runware.ai/docs/models/google-gemma-4-31b#request-messages-role)

- **Path**: `messages.role`
- **Type**: `string`
- **Required**: true

The role of the message author.

**Allowed values**: `user` `assistant`

#### [content](https://runware.ai/docs/models/google-gemma-4-31b#request-messages-content)

- **Path**: `messages.content`
- **Type**: `string`
- **Required**: true
- **Min**: `1`

The text content of the message.
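A minimal sketch of assembling the required fields (`taskType`, `taskUUID`, `model`, `messages`) into one task object is shown below. The helper names are hypothetical, and how the payload is sent (endpoint, authentication, envelope) is not covered on this page, so the example stops at building the JSON.

```python
import json
import uuid

# Only these roles are listed as allowed values for messages.role.
ALLOWED_ROLES = {"user", "assistant"}


def make_message(role: str, content: str) -> dict:
    """Build one chat message, enforcing the documented role and length rules."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {sorted(ALLOWED_ROLES)}")
    if len(content) < 1:
        raise ValueError("content must not be empty")
    return {"role": role, "content": content}


def build_task(messages: list) -> dict:
    """Combine the required request fields into a single task object."""
    return {
        "taskType": "textInference",
        "taskUUID": str(uuid.uuid4()),
        "model": "google-gemma-4-31b",
        "messages": messages,
    }


task = build_task([
    make_message("user", "Summarize the rules of chess in two sentences."),
])
print(json.dumps(task, indent=2))
```

Because `system` is not among the allowed roles, system-level instructions belong in `settings.systemPrompt` rather than in the `messages` array.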

**Settings**

Technical parameters to fine-tune the inference process. These must be nested inside the `settings` object.

### [systemPrompt](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-systemprompt)

- **Path**: `settings.systemPrompt`
- **Type**: `string`
- **Min**: `1`
- **Max**: `50000`

System-level instruction that guides the model's behavior and output style across the entire generation.

### [temperature](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-temperature)

- **Path**: `settings.temperature`
- **Type**: `float`
- **Min**: `0`
- **Max**: `2`
- **Step**: `0.01`

Controls randomness in generation. Lower values produce more deterministic outputs; higher values increase variation and creativity.

### [topP](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-topp)

- **Path**: `settings.topP`
- **Type**: `float`
- **Min**: `0`
- **Max**: `1`
- **Step**: `0.01`

Nucleus sampling parameter that controls diversity by limiting the probability mass. Lower values make outputs more focused; higher values increase diversity.

### [frequencyPenalty](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-frequencypenalty)

- **Path**: `settings.frequencyPenalty`
- **Type**: `float`
- **Min**: `0`
- **Max**: `2`
- **Step**: `0.01`
- **Default**: `0`

Penalizes tokens based on their frequency in the output so far. A value of 0.0 disables the penalty.

### [maxTokens](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-maxtokens)

- **Path**: `settings.maxTokens`
- **Type**: `integer`
- **Min**: `1`

Maximum number of tokens to generate in the response.

### [minP](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-minp)

- **Path**: `settings.minP`
- **Type**: `float`
- **Min**: `0`
- **Max**: `1`
- **Step**: `0.01`
- **Default**: `0`

Minimum probability threshold. Tokens with probability below this value are excluded from sampling.

### [presencePenalty](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-presencepenalty)

- **Path**: `settings.presencePenalty`
- **Type**: `float`
- **Min**: `-2`
- **Max**: `2`
- **Step**: `0.01`
- **Default**: `0`

Encourages the model to introduce new topics. A value of 0.0 disables the penalty.

### [repetitionPenalty](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-repetitionpenalty)

- **Path**: `settings.repetitionPenalty`
- **Type**: `float`
- **Min**: `0`
- **Max**: `2`
- **Step**: `0.01`
- **Default**: `1`

Penalizes tokens that have already appeared in the output. A value of 1.0 disables the penalty.

### [stopSequences](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-stopsequences)

- **Path**: `settings.stopSequences`
- **Type**: `array of strings`
- **Min**: `1`

Array of sequences that will cause the model to stop generating further tokens when encountered.

### [thinkingLevel](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-thinkinglevel)

- **Path**: `settings.thinkingLevel`
- **Type**: `string`
- **Default**: `high`

Controls the depth of internal reasoning the model performs before generating a response.

**Allowed values**: `off` `high`

### [topK](https://runware.ai/docs/models/google-gemma-4-31b#request-settings-topk)

- **Path**: `settings.topK`
- **Type**: `integer`
- **Min**: `1`
- **Max**: `100`

Top-K sampling parameter that limits the number of highest-probability tokens considered at each step.
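The limits listed for the settings above can be checked client-side before a request is sent. The following is a sketch, not an official validator: `validate_settings` is a hypothetical helper, it assumes the `systemPrompt` limits are measured in characters, and it skips `stopSequences` because the page does not say what its minimum applies to.

```python
# Ranges copied from the Settings parameters above: name -> (min, max).
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "topP": (0, 1),
    "frequencyPenalty": (0, 2),
    "minP": (0, 1),
    "presencePenalty": (-2, 2),
    "repetitionPenalty": (0, 2),
    "topK": (1, 100),
}
THINKING_LEVELS = {"off", "high"}


def validate_settings(settings: dict) -> dict:
    """Raise ValueError on a violated limit; otherwise return settings unchanged."""
    for name, (low, high) in NUMERIC_RANGES.items():
        if name in settings and not low <= settings[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    if "maxTokens" in settings and settings["maxTokens"] < 1:
        raise ValueError("maxTokens must be at least 1")
    if "thinkingLevel" in settings and settings["thinkingLevel"] not in THINKING_LEVELS:
        raise ValueError("thinkingLevel must be 'off' or 'high'")
    # Assumption: the 1..50000 limit on systemPrompt counts characters.
    if "systemPrompt" in settings and not 1 <= len(settings["systemPrompt"]) <= 50000:
        raise ValueError("systemPrompt must be 1 to 50000 characters long")
    return settings


settings = validate_settings({
    "systemPrompt": "You are a concise assistant.",
    "temperature": 0.2,
    "topP": 0.9,
    "maxTokens": 512,
    "thinkingLevel": "off",
})
# Settings are nested under the settings key of the task object.
task_fragment = {"settings": settings}
print(task_fragment)
```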

## Response Parameters

### [taskType](https://runware.ai/docs/models/google-gemma-4-31b#response-tasktype)

- **Type**: `string`
- **Required**: true
- **Value**: `textInference`

Type of the task.

### [taskUUID](https://runware.ai/docs/models/google-gemma-4-31b#response-taskuuid)

- **Type**: `string`
- **Required**: true
- **Format**: `UUID v4`

UUID of the task.

### [text](https://runware.ai/docs/models/google-gemma-4-31b#response-text)

- **Type**: `string`
- **Required**: true

Generated text content.

### [cost](https://runware.ai/docs/models/google-gemma-4-31b#response-cost)

- **Type**: `float`

Task cost in USD. Present when `includeCost` is set to `true` in the request.

### [finishReason](https://runware.ai/docs/models/google-gemma-4-31b#response-finishreason)

- **Type**: `string`
- **Required**: true

The reason why the model stopped generating tokens.

**Possible values**: `stop` `length` `content_filter` `unknown`
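Client code will usually branch on `finishReason`. The sketch below maps each listed value to a short label. The interpretations (for example, that `length` means a token limit was reached) follow common convention and are assumptions, since this page lists the values without defining them; `describe_finish` is a hypothetical helper.

```python
# Labels are assumptions based on common usage, not definitions from this page.
FINISH_LABELS = {
    "stop": "completed",          # assumed: natural end or a stop sequence
    "length": "truncated",        # assumed: a token limit was reached
    "content_filter": "filtered", # assumed: output was blocked by a filter
    "unknown": "unknown",
}


def describe_finish(response: dict) -> str:
    """Translate a response's finishReason into a short label."""
    reason = response["finishReason"]
    if reason not in FINISH_LABELS:
        raise ValueError(f"unexpected finishReason: {reason!r}")
    return FINISH_LABELS[reason]


# Hypothetical response fragment, for illustration only.
sample = {"taskType": "textInference", "text": "Once upon a", "finishReason": "length"}
if describe_finish(sample) == "truncated":
    print("Output may be cut off; consider raising settings.maxTokens.")
```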

### [usage](https://runware.ai/docs/models/google-gemma-4-31b#response-usage)

- **Path**: `usage`
- **Type**: `object (4 properties)`
- **Required**: true

Token usage statistics for the request.

#### [promptTokens](https://runware.ai/docs/models/google-gemma-4-31b#response-usage-prompttokens)

- **Path**: `usage.promptTokens`
- **Type**: `integer`
- **Required**: true
- **Min**: `0`

Number of tokens in the input prompt.

#### [completionTokens](https://runware.ai/docs/models/google-gemma-4-31b#response-usage-completiontokens)

- **Path**: `usage.completionTokens`
- **Type**: `integer`
- **Required**: true
- **Min**: `0`

Number of tokens generated in the response.

#### [totalTokens](https://runware.ai/docs/models/google-gemma-4-31b#response-usage-totaltokens)

- **Path**: `usage.totalTokens`
- **Type**: `integer`
- **Required**: true
- **Min**: `0`

Total number of tokens used (prompt + completion).

#### [thinkingTokens](https://runware.ai/docs/models/google-gemma-4-31b#response-usage-thinkingtokens)

- **Path**: `usage.thinkingTokens`
- **Type**: `integer`
- **Min**: `0`

Number of tokens used for internal reasoning. Billed separately.
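Reading the `usage` object from a response might look like the sketch below. The `Usage` class, the `parse_usage` helper, and the sample numbers are all hypothetical; `thinkingTokens` is treated as optional because it is the only usage field not marked as required, and the page does not say whether it is included in `totalTokens`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    thinking_tokens: Optional[int] = None  # may be absent from the response


def parse_usage(response: dict) -> Optional[Usage]:
    """Extract usage statistics, or return None if the response has none."""
    raw = response.get("usage")
    if raw is None:
        return None
    return Usage(
        prompt_tokens=raw["promptTokens"],
        completion_tokens=raw["completionTokens"],
        total_tokens=raw["totalTokens"],
        thinking_tokens=raw.get("thinkingTokens"),
    )


# Made-up response fragment, for illustration only.
sample = {
    "text": "Hello!",
    "usage": {"promptTokens": 120, "completionTokens": 80, "totalTokens": 200},
}
usage = parse_usage(sample)
print(usage)
```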