OpenAI Compatibility
Use Runware's text inference through any OpenAI-compatible client. Drop-in replacement with the same endpoint format, streaming, and SDK support.
Introduction
Runware exposes a fully OpenAI-compatible chat completions endpoint at https://api.runware.ai/v1/chat/completions. If you already use the OpenAI SDK or any tool that speaks the OpenAI protocol, you can point it at Runware by changing two values: the base URL and the API key.
This is the fastest way to get started with Runware's text inference. There is no Runware-specific parsing required, so any OpenAI-compatible client or framework will work out of the box.
If you need access to Runware-specific features like taskUUID tracking, includeCost, or the async delivery method, use the native API instead. The OpenAI-compatible endpoint focuses on broad compatibility with the standard OpenAI request and response format.
Quick start
Send a standard chat completion request to https://api.runware.ai/v1/chat/completions using your Runware API key and a Runware model ID:
```json
{
  "model": "minimax:m2.7@0",
  "messages": [
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "max_completion_tokens": 256
}
```

The request format is identical to the OpenAI Chat Completions API. The only difference is that `model` uses a Runware AIR ID (e.g., `minimax:m2.7@0`, `google:gemini@3.1-pro`) instead of an OpenAI model name.
```shell
curl -X POST https://api.runware.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNWARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax:m2.7@0",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_completion_tokens": 256
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="your_runware_api_key",
    base_url="https://api.runware.ai/v1",
)

response = client.chat.completions.create(
    model="minimax:m2.7@0",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_completion_tokens=256,
)

print(response.choices[0].message.content)
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your_runware_api_key',
  baseURL: 'https://api.runware.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'minimax:m2.7@0',
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
  max_completion_tokens: 256,
});

console.log(response.choices[0].message.content);
```

Streaming
Set "stream": true to receive tokens as they are generated, exactly like you would with the OpenAI API. To include token counts at the end of the stream, add stream_options:
```json
{
  "model": "minimax:m2.7@0",
  "stream": true,
  "stream_options": { "include_usage": true },
  "messages": [
    { "role": "user", "content": "Tell me a joke" }
  ],
  "max_completion_tokens": 512
}
```

The SSE response format is identical to OpenAI's. Content arrives in `choices[0].delta.content`, and the stream ends with `data: [DONE]`.
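If you are not using an SDK, the stream is straightforward to consume by hand. A minimal sketch (the helper name and sample lines are illustrative) that extracts content tokens from raw SSE `data:` lines:

```python
import json

def iter_stream_content(sse_lines):
    """Yield content tokens from raw SSE 'data:' lines until [DONE]."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content

# Example with captured lines (same shape as real chunks):
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}',
    'data: [DONE]',
]
print("".join(iter_stream_content(sample)))  # Hello!
```

In practice you would feed this from an HTTP response body read line by line; the parsing logic is the same either way.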
Streaming with OpenAI SDKs
The OpenAI Python and TypeScript SDKs handle all SSE parsing for you:
```python
from openai import OpenAI

client = OpenAI(
    api_key="your_runware_api_key",
    base_url="https://api.runware.ai/v1",
)

stream = client.chat.completions.create(
    model="minimax:m2.7@0",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    max_completion_tokens=512,
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your_runware_api_key',
  baseURL: 'https://api.runware.ai/v1',
});

const stream = await client.chat.completions.create({
  model: 'minimax:m2.7@0',
  messages: [{ role: 'user', content: 'Tell me a joke' }],
  max_completion_tokens: 512,
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```

```shell
curl -N -X POST https://api.runware.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNWARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax:m2.7@0",
    "stream": true,
    "stream_options": {"include_usage": true},
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "max_completion_tokens": 512
  }'
```

Reasoning models
Models that support internal reasoning (like MiniMax M2.7) stream reasoning tokens in choices[0].delta.reasoning_content before the final response appears in choices[0].delta.content. This follows the same pattern as OpenAI's reasoning models.
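When consuming these chunks in code, route the two delta fields separately so reasoning tokens don't end up in the final answer. A sketch over parsed chunk dicts (field handling only; fetching the chunks is up to your client):

```python
def split_reasoning(chunks):
    """Separate reasoning tokens from answer tokens in parsed stream chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("reasoning_content"):
                reasoning.append(delta["reasoning_content"])
            if delta.get("content"):
                answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)

# Parsed chunks in the shape this endpoint streams:
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant"}}]},
    {"choices": [{"index": 0, "delta": {"reasoning_content": "2+2 is simple arithmetic."}}]},
    {"choices": [{"index": 0, "delta": {"content": "4"}}]},
    {"choices": [], "usage": {"prompt_tokens": 51, "completion_tokens": 38}},
]
thinking, answer = split_reasoning(chunks)
print(answer)  # 4
```

Note that the OpenAI SDKs may not expose `reasoning_content` as a typed attribute on their delta objects, so when working with SDK chunk objects rather than raw dicts, prefer defensive access such as `getattr(delta, "reasoning_content", None)`.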
```
// First chunk - role assignment
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","model":"minimax:m2.7@0","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

// Reasoning chunks - choices[0].delta.reasoning_content
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"The user asks 2+2. Simple arithmetic."},"finish_reason":null}]}

// Actual response - switches to choices[0].delta.content
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"4"},"finish_reason":null}]}

// Usage chunk (when stream_options.include_usage is set)
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":51,"completion_tokens":38,"total_tokens":89,"cost":0.000134}}

// Final chunk - finish reason
data: {"id":"chatcmpl-e36b09e6-7a1c-4d8f-b5e2-9c4a3f6d8e1b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

Key differences from OpenAI
While the endpoint is fully compatible with the OpenAI protocol, there are a few things to be aware of:
| Aspect | OpenAI | Runware |
|---|---|---|
| Model IDs | gpt-4o, gpt-4o-mini | AIR format: minimax:m2.7@0, google:gemini@3.1-pro |
| Authentication | OpenAI API key | Runware API key (same Authorization: Bearer header) |
| Base URL | https://api.openai.com/v1 | https://api.runware.ai/v1 |
| Available models | OpenAI models only | Multiple providers: MiniMax, Google Gemini, and more |
| Naming convention | snake_case (finish_reason, reasoning_content) | snake_case (same as OpenAI on this endpoint) |
This endpoint uses snake_case field names (finish_reason, max_tokens) to match the OpenAI convention. The native API uses camelCase (finishReason, maxTokens).
Key differences from the native API
If you are choosing between the OpenAI-compatible endpoint and the native Runware API, here is how they compare:
| | Native API | OpenAI-compatible |
|---|---|---|
| Endpoint | https://api.runware.ai/v1 | https://api.runware.ai/v1/chat/completions |
| Request format | Array of task objects with taskType and taskUUID | Single object (standard OpenAI format) |
| Streaming trigger | "deliveryMethod": "stream" | "stream": true |
| Async support | Yes ("deliveryMethod": "async" with webhooks/polling) | No |
| Cost tracking | includeCost: true on the request | Included in the usage chunk when stream_options.include_usage is set |
| Usage tracking | includeUsage: true on the request | stream_options: { "include_usage": true } |
| Task tracking | taskUUID for request/response correlation | Standard id field on response chunks |
| Naming convention | camelCase | snake_case |
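To make the table concrete, here is the same generation sketched in both request shapes. The native `taskType` value and the UUID are illustrative assumptions; check the native API reference for exact task schemas:

```python
# OpenAI-compatible endpoint: a single request object, snake_case fields.
openai_style = {
    "model": "minimax:m2.7@0",
    "messages": [{"role": "user", "content": "Hi"}],
    "max_completion_tokens": 128,
    "stream": True,
    "stream_options": {"include_usage": True},  # usage appears in the final chunk
}

# Native API: an array of task objects, camelCase fields.
# "textInference" is an assumed taskType name for illustration only.
native_style = [
    {
        "taskType": "textInference",
        "taskUUID": "3f6a1c2e-8d4b-4f0a-9c7e-1b2d3e4f5a6b",  # client-chosen correlation ID
        "model": "minimax:m2.7@0",
        "messages": [{"role": "user", "content": "Hi"}],
        "maxTokens": 128,
        "deliveryMethod": "stream",  # native streaming trigger
        "includeCost": True,
        "includeUsage": True,
    }
]
```

The structural differences from the table are all visible here: single object vs. task array, `stream: true` vs. `deliveryMethod`, and snake_case vs. camelCase.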
Use the OpenAI-compatible endpoint when you want the fastest integration path or are already working with the OpenAI SDK.
Use the native API when you want a consistent request format across all Runware modalities (images, video, audio, text, 3D) and access to features like async delivery and task-level tracking.