OpenAI Compatibility
Use Runware's text inference through any OpenAI-compatible client. Drop-in replacement with the same endpoint format, streaming, and SDK support.
Introduction
Runware exposes a fully OpenAI-compatible chat completions endpoint at https://api.runware.ai/v1/chat/completions. If you already use the OpenAI SDK or any tool that speaks the OpenAI protocol, you can point it at Runware by changing two values: the base URL and the API key.
This is the fastest way to get started with Runware's text inference. There is no Runware-specific parsing required, so any OpenAI-compatible client or framework will work out of the box.
If you need access to Runware-specific features like taskUUID tracking, includeCost, or the async delivery method, use the native API instead. The OpenAI-compatible endpoint focuses on broad compatibility with the standard OpenAI request and response format.
Quick start
Send a standard chat completion request to https://api.runware.ai/v1/chat/completions using your Runware API key and a Runware model ID:
```json
{
  "model": "minimax:m2.7@0",
  "messages": [
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "max_completion_tokens": 256
}
```

The request format is identical to the OpenAI Chat Completions API. The only difference is that `model` uses a Runware AIR ID (e.g., `minimax:m2.7@0`, `google:gemini@3.1-pro`) instead of an OpenAI model name.
```shell
curl -X POST https://api.runware.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNWARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax:m2.7@0",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_completion_tokens": 256
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="your_runware_api_key",
    base_url="https://api.runware.ai/v1",
)

response = client.chat.completions.create(
    model="minimax:m2.7@0",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_completion_tokens=256,
)

print(response.choices[0].message.content)
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your_runware_api_key',
  baseURL: 'https://api.runware.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'minimax:m2.7@0',
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
  max_completion_tokens: 256,
});

console.log(response.choices[0].message.content);
```

Streaming
Set "stream": true to receive tokens as they are generated, exactly like you would with the OpenAI API. To include token counts at the end of the stream, add stream_options:
```json
{
  "model": "minimax:m2.7@0",
  "stream": true,
  "stream_options": { "include_usage": true },
  "messages": [
    { "role": "user", "content": "Tell me a joke" }
  ],
  "max_completion_tokens": 512
}
```

The SSE response format is identical to OpenAI's. Content arrives in `choices[0].delta.content`, and the stream ends with `data: [DONE]`.
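If you are not using an SDK, the stream is straightforward to consume by hand. A minimal sketch (the helper name and sample lines are illustrative) that extracts content tokens from raw SSE `data:` lines:

```python
import json

def iter_stream_content(sse_lines):
    """Yield content tokens from raw SSE 'data:' lines until [DONE]."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content

# Example with captured lines (same shape as real chunks):
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}',
    'data: [DONE]',
]
print("".join(iter_stream_content(sample)))  # Hello!
```

In practice you would feed this from an HTTP response body read line by line; the parsing logic is the same either way.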
Streaming with OpenAI SDKs
The OpenAI Python and TypeScript SDKs handle all SSE parsing for you:
```python
from openai import OpenAI

client = OpenAI(
    api_key="your_runware_api_key",
    base_url="https://api.runware.ai/v1",
)

stream = client.chat.completions.create(
    model="minimax:m2.7@0",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    max_completion_tokens=512,
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your_runware_api_key',
  baseURL: 'https://api.runware.ai/v1',
});

const stream = await client.chat.completions.create({
  model: 'minimax:m2.7@0',
  messages: [{ role: 'user', content: 'Tell me a joke' }],
  max_completion_tokens: 512,
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```

```shell
curl -N -X POST https://api.runware.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNWARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax:m2.7@0",
    "stream": true,
    "stream_options": {"include_usage": true},
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "max_completion_tokens": 512
  }'
```

Reasoning models
Models that support internal reasoning (like MiniMax M2.7) stream reasoning tokens in choices[0].delta.reasoning_content before the final response appears in choices[0].delta.content. This follows the same pattern as OpenAI's reasoning models.
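When consuming these chunks in code, route the two delta fields separately so reasoning tokens don't end up in the final answer. A sketch over parsed chunk dicts (field handling only; fetching the chunks is up to your client):

```python
def split_reasoning(chunks):
    """Separate reasoning tokens from answer tokens in parsed stream chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("reasoning_content"):
                reasoning.append(delta["reasoning_content"])
            if delta.get("content"):
                answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)

# Parsed chunks in the shape this endpoint streams:
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant"}}]},
    {"choices": [{"index": 0, "delta": {"reasoning_content": "2+2 is simple arithmetic."}}]},
    {"choices": [{"index": 0, "delta": {"content": "4"}}]},
    {"choices": [], "usage": {"prompt_tokens": 51, "completion_tokens": 38}},
]
thinking, answer = split_reasoning(chunks)
print(answer)  # 4
```

Note that the OpenAI SDKs may not expose `reasoning_content` as a typed attribute on their delta objects, so when working with SDK chunk objects rather than raw dicts, prefer defensive access such as `getattr(delta, "reasoning_content", None)`.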
```
// First chunk - role assignment
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","model":"minimax:m2.7@0","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

// Reasoning chunks - choices[0].delta.reasoning_content
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"The user asks 2+2. Simple arithmetic."},"finish_reason":null}]}

// Actual response - switches to choices[0].delta.content
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"4"},"finish_reason":null}]}

// Usage chunk (when stream_options.include_usage is set)
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":51,"completion_tokens":38,"total_tokens":89,"cost":0.000134}}

// Final chunk - finish reason
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

Key differences from OpenAI
While the endpoint is fully compatible with the OpenAI protocol, there are a few things to be aware of:
| Aspect | OpenAI | Runware |
|---|---|---|
| Model IDs | gpt-4o, gpt-4o-mini | AIR format: minimax:m2.7@0, google:gemini@3.1-pro |
| Authentication | OpenAI API key | Runware API key (same Authorization: Bearer header) |
| Base URL | https://api.openai.com/v1 | https://api.runware.ai/v1 |
| Available models | OpenAI models only | Multiple providers: MiniMax, Google Gemini, and more |
| Naming convention | snake_case (finish_reason, reasoning_content) | snake_case (same as OpenAI on this endpoint) |
This endpoint uses snake_case field names (finish_reason, max_tokens) to match the OpenAI convention. The native API uses camelCase (finishReason, maxTokens).
Key differences from the native API
If you are choosing between the OpenAI-compatible endpoint and the native Runware API, here is how they compare:
| | Native API | OpenAI-compatible |
|---|---|---|
| Endpoint | https://api.runware.ai/v1 | https://api.runware.ai/v1/chat/completions |
| Request format | Array of task objects with taskType and taskUUID | Single object (standard OpenAI format) |
| Streaming trigger | "deliveryMethod": "stream" | "stream": true |
| Async support | Yes ("deliveryMethod": "async" with webhooks/polling) | No |
| Cost tracking | includeCost: true on the request | Included in the usage chunk when stream_options.include_usage is set |
| Usage tracking | includeUsage: true on the request | stream_options: { "include_usage": true } |
| Task tracking | taskUUID for request/response correlation | Standard id field on response chunks |
| Naming convention | camelCase | snake_case |
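To make the table concrete, here is the same generation sketched in both request shapes. The native `taskType` value and the UUID are illustrative assumptions; check the native API reference for exact task schemas:

```python
# OpenAI-compatible endpoint: a single request object, snake_case fields.
openai_style = {
    "model": "minimax:m2.7@0",
    "messages": [{"role": "user", "content": "Hi"}],
    "max_completion_tokens": 128,
    "stream": True,
    "stream_options": {"include_usage": True},  # usage appears in the final chunk
}

# Native API: an array of task objects, camelCase fields.
# "textInference" is an assumed taskType name for illustration only.
native_style = [
    {
        "taskType": "textInference",
        "taskUUID": "3f6a1c2e-8d4b-4f0a-9c7e-1b2d3e4f5a6b",  # client-chosen correlation ID
        "model": "minimax:m2.7@0",
        "messages": [{"role": "user", "content": "Hi"}],
        "maxTokens": 128,
        "deliveryMethod": "stream",  # native streaming trigger
        "includeCost": True,
        "includeUsage": True,
    }
]
```

The structural differences from the table are all visible here: single object vs. task array, `stream: true` vs. `deliveryMethod`, and snake_case vs. camelCase.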
Use the OpenAI-compatible endpoint when you want the fastest integration path or are already working with the OpenAI SDK.
Use the native API when you want a consistent request format across all Runware modalities (images, video, audio, text, 3D) and access to features like async delivery and task-level tracking.