---
title: OpenAI Compatibility | Runware Docs
url: https://runware.ai/docs/platform/openai
description: Use Runware's text inference through any OpenAI-compatible client. Drop-in replacement with the same endpoint format, streaming, and SDK support.
relatedDocuments:
  - https://runware.ai/docs/platform/streaming
  - https://runware.ai/docs/platform/authentication
  - https://runware.ai/docs/platform/introduction
---
## Introduction

Runware exposes a fully **OpenAI-compatible chat completions endpoint** at `https://api.runware.ai/v1/chat/completions`. If you already use the OpenAI SDK or any tool that speaks the OpenAI protocol, you can point it at Runware by changing two values: the **base URL** and the **API key**.

This is the fastest way to get started with Runware's text inference. There is **no Runware-specific parsing required**, so any OpenAI-compatible client or framework will work out of the box.
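In many cases you do not even need to touch code. The official OpenAI SDKs fall back to the `OPENAI_API_KEY` and `OPENAI_BASE_URL` environment variables when no explicit values are passed, so tools built on them can often be repointed with two exports (assuming the tool does not override these defaults with its own configuration):

```shell
# Repoint OpenAI-SDK-based tools at Runware via SDK default env vars.
# Only works for tools that rely on the SDK defaults; some tools
# read their own config instead.
export OPENAI_BASE_URL="https://api.runware.ai/v1"
export OPENAI_API_KEY="your_runware_api_key"
```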

> [!NOTE]
> If you need access to Runware-specific features like `taskUUID` tracking, `includeCost`, or the `async` delivery method, use the [native API](https://runware.ai/docs/platform/streaming) instead. The OpenAI-compatible endpoint focuses on broad compatibility with the standard OpenAI request and response format.

## Quick start

Send a standard chat completion request to `https://api.runware.ai/v1/chat/completions` using your Runware API key and a Runware model ID:

```json
{
  "model": "minimax:m2.7@0",
  "messages": [
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "max_completion_tokens": 256
}
```

The request format is identical to the [OpenAI Chat Completions API](https://platform.openai.com/docs/api-reference/chat/create). The only difference is that `model` uses a **Runware AIR ID** (e.g., `minimax:m2.7@0`, `google:gemini@3.1-pro`) instead of an OpenAI model name.

**curl**:

```bash
curl -X POST https://api.runware.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNWARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax:m2.7@0",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_completion_tokens": 256
  }'
```

**Python**:

```python
from openai import OpenAI

client = OpenAI(
    api_key="your_runware_api_key",
    base_url="https://api.runware.ai/v1",
)

response = client.chat.completions.create(
    model="minimax:m2.7@0",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_completion_tokens=256,
)

print(response.choices[0].message.content)
```

**TypeScript**:

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your_runware_api_key',
  baseURL: 'https://api.runware.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'minimax:m2.7@0',
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
  max_completion_tokens: 256,
});

console.log(response.choices[0].message.content);
```
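All three calls return the standard OpenAI chat completion object. An illustrative (not captured from a live call) non-streaming response looks like this:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "minimax:m2.7@0",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "The capital of France is Paris." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 14, "completion_tokens": 8, "total_tokens": 22 }
}
```

Token counts and the answer text are examples only; the field layout is what matters.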

## Streaming

Set `"stream": true` to receive tokens as they are generated, exactly like you would with the OpenAI API. To include token counts at the end of the stream, add `stream_options`:

```json
{
  "model": "minimax:m2.7@0",
  "stream": true,
  "stream_options": { "include_usage": true },
  "messages": [
    { "role": "user", "content": "Tell me a joke" }
  ],
  "max_completion_tokens": 512
}
```

The SSE response format is **identical to OpenAI's**. Content arrives in `choices[0].delta.content`, and the stream ends with `data: [DONE]`.
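If you are consuming the stream without an SDK, each SSE line is easy to handle yourself. A minimal sketch (the helper name is ours, not part of any SDK):

```python
import json

def parse_sse_line(line: str):
    """Return the content delta from one SSE line, or None.

    None is returned for non-data lines (comments, keep-alives),
    the "[DONE]" terminator, and chunks without content
    (e.g. the trailing usage-only chunk).
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload.strip() == "[DONE]":
        return None
    chunk = json.loads(payload)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")
```

Feed each line of the response body through this function and concatenate the non-`None` results to reconstruct the full answer.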

### Streaming with OpenAI SDKs

The OpenAI Python and TypeScript SDKs handle all SSE parsing for you:

**Python**:

```python
from openai import OpenAI

client = OpenAI(
    api_key="your_runware_api_key",
    base_url="https://api.runware.ai/v1",
)

stream = client.chat.completions.create(
    model="minimax:m2.7@0",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    max_completion_tokens=512,
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

**TypeScript**:

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your_runware_api_key',
  baseURL: 'https://api.runware.ai/v1',
});

const stream = await client.chat.completions.create({
  model: 'minimax:m2.7@0',
  messages: [{ role: 'user', content: 'Tell me a joke' }],
  max_completion_tokens: 512,
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```

**curl**:

```bash
curl -N -X POST https://api.runware.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNWARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax:m2.7@0",
    "stream": true,
    "stream_options": {"include_usage": true},
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "max_completion_tokens": 512
  }'
```

### Reasoning models

Models that support internal reasoning (like MiniMax M2.7) stream reasoning tokens in `choices[0].delta.reasoning_content` before the final response appears in `choices[0].delta.content`. This follows the same pattern as OpenAI's reasoning models.

```text
// First chunk - role assignment
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","model":"minimax:m2.7@0","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

// Reasoning chunks - choices[0].delta.reasoning_content
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"The user asks 2+2. Simple arithmetic."},"finish_reason":null}]}

// Actual response - switches to choices[0].delta.content
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"4"},"finish_reason":null}]}

// Usage chunk (when stream_options.include_usage is set)
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":51,"completion_tokens":38,"total_tokens":89,"cost":0.000134}}

// Final chunk - finish reason
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

## Key differences from OpenAI

While the endpoint is fully compatible with the OpenAI protocol, there are a few things to be aware of:

| Aspect | OpenAI | Runware |
| --- | --- | --- |
| **Model IDs** | `gpt-4o`, `gpt-4o-mini` | AIR format: `minimax:m2.7@0`, `google:gemini@3.1-pro` |
| **Authentication** | OpenAI API key | Runware API key (same `Authorization: Bearer` header) |
| **Base URL** | `https://api.openai.com/v1` | `https://api.runware.ai/v1` |
| **Available models** | OpenAI models only | Multiple providers: MiniMax, Google Gemini, and more |
| **Naming convention** | snake\_case (`finish_reason`, `reasoning_content`) | snake\_case (same as OpenAI on this endpoint) |

> [!NOTE]
> This endpoint uses **snake\_case** field names (`finish_reason`, `max_tokens`) to match the OpenAI convention. The [native API](https://runware.ai/docs/platform/streaming) uses **camelCase** (`finishReason`, `maxTokens`).

## Key differences from the native API

If you are choosing between the OpenAI-compatible endpoint and the native Runware API, here is how they compare:

|  | Native API | OpenAI-compatible |
| --- | --- | --- |
| **Endpoint** | `https://api.runware.ai/v1` | `https://api.runware.ai/v1/chat/completions` |
| **Request format** | Array of task objects with `taskType` and `taskUUID` | Single object (standard OpenAI format) |
| **Streaming trigger** | `"deliveryMethod": "stream"` | `"stream": true` |
| **Async support** | Yes (`"deliveryMethod": "async"` with webhooks/polling) | No |
| **Cost tracking** | `includeCost: true` on the request | Included in the `usage` chunk when `stream_options.include_usage` is set |
| **Usage tracking** | `includeUsage: true` on the request | `stream_options: { "include_usage": true }` |
| **Task tracking** | `taskUUID` for request/response correlation | Standard `id` field on response chunks |
| **Naming convention** | camelCase | snake\_case |

Use the **OpenAI-compatible endpoint** when you want the fastest integration path or are already working with the OpenAI SDK.

Use the **native API** when you want a consistent request format across all Runware modalities (images, video, audio, text, 3D) and access to features like async delivery and task-level tracking.