Image Inference API
Introduction
Image inference is a powerful feature that allows you to generate images from text prompts or transform existing images according to your needs. This process is essential for creating high-quality visuals, whether you're looking to bring creative ideas to life or enhance existing images with new styles or subjects.
There are several types of image inference requests you can make using our API:
- Text-to-image: Generate images from descriptive text prompts. This process translates your text into high-quality visuals, allowing you to create detailed and vivid images based on your ideas.
- Image-to-image: Perform transformations on existing images, whether they are previously generated images or uploaded images. This process enables you to enhance, modify, or stylize images to create new and visually appealing content. With a single parameter you can control the strength of the transformation.
- Inpainting: Replace parts of an image with new content, allowing you to remove unwanted elements or improve the overall composition of an image. It's like image-to-image but with a mask that defines the area to be transformed.
- Outpainting: Extend the boundaries of an image by generating new content outside the original frame that seamlessly blends with the existing image. Like inpainting, it uses a mask to define the new area to be generated.
Our API also supports advanced features that allow developers to fine-tune the image generation process with precision:
- ControlNet: A feature that enables precise control over image generation by using additional input conditions, such as edge maps, poses, or segmentation masks. This allows for more accurate alignment with specific user requirements or styles.
- LoRA: A technique that helps in adapting models to specific styles or tasks by focusing on particular aspects of the data, enhancing the quality and relevance of the generated images.
Additionally, you can tweak numerous parameters to customize the output, such as adjusting the image dimension, steps, scheduler to use, and other generation settings, providing a high level of flexibility to suit your application's needs.
Our API is extremely fast thanks to unique optimizations, custom-designed hardware, and many other components that make up our Sonic Inference Engine®.
Request
Our API always accepts an array of objects as input, where each object represents a specific task to be performed. The structure of the object varies depending on the type of the task. For this section, we will focus on the parameters related to image inference tasks.
The following JSON snippets show the basic structure of a request object for the different task variations (text-to-image, image-to-image, inpainting, outpainting, refiner, embeddings, ControlNet, LoRA, and IP-Adapters). All properties are explained in detail in the next section.
[
{
"taskType": "imageInference",
"taskUUID": "string",
"outputType": "string",
"outputFormat": "string",
"positivePrompt": "string",
"negativePrompt": "string",
"height": int,
"width": int,
"model": "string",
"steps": int,
"CFGScale": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"seedImage": "string",
"model": "string",
"height": int,
"width": int,
"strength": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"seedImage": "string",
"maskImage": "string",
"model": "string",
"height": int,
"width": int,
"strength": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"seedImage": "string",
"outpaint": {
top: 64,
bottom: 64
},
"model": "string",
"height": int,
"width": int,
"strength": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"refiner": {
"model": "string",
"startStep": int
},
"height": int,
"width": int,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"embeddings": [
{
"model": "string",
"weight": float
},
{
"model": "string",
"weight": float
}
]
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"controlNet": [
{
"model": "string",
"guideImage": "string",
"weight": float,
"startStep": int,
"endStep": int,
"controlMode": "string"
},
{
"model": "string",
"guideImage": "string",
"weight": float,
"startStep": int,
"endStep": int,
"controlMode": "string"
}
]
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"lora": [
{
"model": "string",
"weight": float
},
{
"model": "string",
"weight": float
}
]
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"ipAdapters": [
{
"model": "string",
"guideImage": "string",
"weight": float
},
{
"model": "string",
"guideImage": "string",
"weight": float
}
]
}
]
You can mix multiple ControlNet and LoRA objects in the same request to achieve more complex control over the generation process.
-
taskType
string required -
The type of task to be performed. For this task, the value should be imageInference.
-
taskUUID
string required UUID v4 -
When a task is sent to the API you must include a random UUID v4 string using the taskUUID parameter. This string is used to match the async responses to their corresponding tasks. If you send multiple tasks at the same time, the taskUUID will help you match the responses to the correct tasks. The taskUUID must be unique for each task you send to the API.
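For illustration, here is a single request carrying two tasks, each with its own taskUUID (prompts, UUIDs, and the model identifier are illustrative):
[ { "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "height": 512, "width": 512, "numberResults": 1 }, { "taskType": "imageInference", "taskUUID": "a770f077-f413-47de-9dac-be0b26a35da6", "positivePrompt": "a castle on a hill at sunset", "model": "civitai:4201@130090", "height": 512, "width": 512, "numberResults": 1 } ]
Each response object echoes the taskUUID of the task that produced it, so results can be matched to their originating tasks.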
-
outputType
"base64Data" | "dataURI" | "URL" Default: URL -
Specifies the output type in which the image is returned. Supported values are: dataURI, URL, and base64Data.
- base64Data: The image is returned as a base64-encoded string using the imageBase64Data parameter in the response object.
- dataURI: The image is returned as a data URI string using the imageDataURI parameter in the response object.
- URL: The image is returned as a URL string using the imageURL parameter in the response object.
-
outputFormat
"JPG" | "PNG" | "WEBP" Default: JPG -
Specifies the format of the output image. Supported formats are: PNG, JPG, and WEBP.
-
outputQuality
integer Min: 20 Max: 99 Default: 95 -
Sets the compression quality of the output image. Higher values preserve more quality but increase file size, lower values reduce file size but decrease quality.
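For example, a request that asks for a compressed WEBP returned as a URL might combine the output settings like this (prompt and model identifier are illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "height": 1024, "width": 1024, "outputType": "URL", "outputFormat": "WEBP", "outputQuality": 85 }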
-
uploadEndpoint
string -
This parameter allows you to specify a URL to which the generated image will be uploaded as binary image data using the HTTP PUT method. For example, an S3 bucket URL can be used as the upload endpoint.
When the image is ready, it will be uploaded to the specified URL.
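As a sketch, assuming a pre-signed object storage URL (the URL below is a placeholder, not a real endpoint):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "height": 1024, "width": 1024, "outputFormat": "JPG", "uploadEndpoint": "https://my-bucket.s3.amazonaws.com/results/image.jpg?X-Amz-Signature=..." }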
-
checkNSFW
boolean Default: false -
This parameter is used to enable or disable the NSFW check. When enabled, the API will check if the image contains NSFW (not safe for work) content. This check is done using a pre-trained model that detects adult content in images.
When the check is enabled, the API will return NSFWContent: true in the response object if the image is flagged as potentially sensitive content. If the image is not flagged, the API will return NSFWContent: false. If this parameter is not used, the NSFWContent parameter will not be included in the response object.
Adds 0.1 seconds to image inference time and incurs additional costs.
The NSFW filter occasionally returns false positives and very rarely false negatives.
-
includeCost
boolean Default: false -
If set to true, the cost to perform the task will be included in the response object.
-
positivePrompt
string required -
A positive prompt is a text instruction to guide the model on generating the image. It is usually a sentence or a paragraph that provides positive guidance for the task. This parameter is essential to shape the desired results.
For example, if the positive prompt is "dragon drinking coffee", the model will generate an image of a dragon drinking coffee. The more detailed the prompt, the more accurate the results.
If you wish to generate an image without any prompt guidance, you can use the special token __BLANK__. This tells the system to generate an image without text-based instructions.
The length of the prompt must be between 2 and 3000 characters.
-
negativePrompt
string -
A negative prompt is a text instruction to guide the model on generating the image. It is usually a sentence or a paragraph that provides negative guidance for the task. This parameter helps to avoid certain undesired results.
For example, if the negative prompt is "red dragon, cup", the model will follow the positive prompt but will avoid generating an image of a red dragon or including a cup. The more detailed the prompt, the more accurate the results.
The length of the prompt must be between 2 and 3000 characters.
-
seedImage
string required -
When doing image-to-image, inpainting or outpainting, this parameter is required.
Specifies the seed image to be used for the diffusion process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
maskImage
string required -
When doing inpainting, this parameter is required.
Specifies the mask image to be used for the inpainting process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
maskMargin
integer Min: 32 Max: 128 -
Adds extra context pixels around the masked region during inpainting. When this parameter is present, the model will zoom into the masked area, considering these additional pixels to create more coherent and well-integrated details.
This parameter is particularly effective when used with masks generated by the Image Masking API, enabling enhanced detail generation while maintaining natural integration with the surrounding image.
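For illustration, an inpainting request that adds extra context around the masked area might look like this (model and image UUIDs are placeholders):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a golden crown", "model": "civitai:4201@130090", "seedImage": "59a2edc2-45e6-429f-be5f-7ded59b92046", "maskImage": "90422a52-f186-4bf4-a73b-0a46016a8330", "maskMargin": 64, "height": 1024, "width": 1024, "strength": 0.8, "numberResults": 1 }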
-
strength
float Min: 0 Max: 1 Default: 0.8 -
When doing image-to-image or inpainting, this parameter is used to determine the influence of the seedImage in the generated output. A lower value results in more influence from the original image, while a higher value allows more creative deviation.
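A minimal image-to-image sketch, assuming the seed image was uploaded beforehand (identifiers are illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "watercolor painting of a city at dusk", "model": "civitai:4201@130090", "seedImage": "59a2edc2-45e6-429f-be5f-7ded59b92046", "strength": 0.6, "height": 1024, "width": 1024, "numberResults": 1 }
At 0.6 the output keeps the overall composition of the seed image while allowing noticeable stylistic change; values closer to 0 preserve more of the original.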
-
referenceImages
string[] -
An array containing reference images used to condition the generation process. These images provide visual guidance to help the model generate content that aligns with the style, composition, or characteristics of the reference materials.
This parameter is particularly useful with edit models like FLUX.1 Kontext, where reference images can guide the generation toward specific visual attributes or maintain consistency with existing content. Each image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
View model compatibility
Model architecture: maximum number of reference images
- FLUX.1 Kontext: 1
- Ace++: 1
- Other models: Not supported
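A sketch of a request conditioned on a reference image; replace the model placeholder with the AIR identifier of a compatible edit model (for example FLUX.1 Kontext) found in the Model Explorer:
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "place the subject in a snowy forest", "model": "string", "referenceImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "height": 1024, "width": 1024, "numberResults": 1 }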
-
outpaint
object -
Extends the image boundaries in specified directions. When using outpaint, you must provide the final dimensions using the width and height parameters, which should account for the original image size plus the total extension (seedImage dimensions + top + bottom, left + right).
View example
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a beautiful landscape with mountains and trees", "negativePrompt": "blurry, bad quality", "seedImage": "59a2edc2-45e6-429f-be5f-7ded59b92046", "model": "civitai:4201@130090", "height": 1024, "width": 768, "steps": 20, "strength": 0.7, "outpaint": { "top": 256, "right": 128, "bottom": 256, "left": 128, "blur": 16 } }
Properties 5 properties
-
outpaint » top
integer Min: 0 -
Number of pixels to extend at the top of the image. Must be a multiple of 64.
-
outpaint » right
integer Min: 0 -
Number of pixels to extend at the right side of the image. Must be a multiple of 64.
-
outpaint » bottom
integer Min: 0 -
Number of pixels to extend at the bottom of the image. Must be a multiple of 64.
-
outpaint » left
integer Min: 0 -
Number of pixels to extend at the left side of the image. Must be a multiple of 64.
-
outpaint » blur
integer Min: 0 Max: 32 Default: 0 -
The amount of blur to apply at the boundaries between the original image and the extended areas, measured in pixels.
-
-
height
integer required Min: 128 Max: 2048 -
Used to define the height dimension of the generated image. Certain models perform better with specific dimensions.
The value must be divisible by 64, e.g. 128...512, 576, 640...2048.
-
width
integer required Min: 128 Max: 2048 -
Used to define the width dimension of the generated image. Certain models perform better with specific dimensions.
The value must be divisible by 64, e.g. 128...512, 576, 640...2048.
-
model
string required -
We make use of the AIR (Artificial Intelligence Resource) system to identify models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
-
vae
string -
We make use of the AIR (Artificial Intelligence Resource) system to identify VAE models. This identifier is a unique string that represents a specific model.
The VAE (Variational Autoencoder) can be specified to override the default one included with the base model, which can help improve the quality of generated images.
You can find the AIR identifier of the VAE model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
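For illustration, overriding the default VAE only requires the extra field; both AIR identifiers below are placeholders to be looked up in the Model Explorer:
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "string", "vae": "string", "height": 1024, "width": 1024, "numberResults": 1 }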
-
steps
integer Min: 1 Max: 100 Default: 20 -
The number of steps is the number of iterations the model will perform to generate the image. The higher the number of steps, the more detailed the image will be. However, increasing the number of steps will also increase the time it takes to generate the image and may not always result in a better image (some schedulers work differently).
When using your own models you can specify a new default value for the number of steps.
-
scheduler
string Default: Model's scheduler -
A scheduler is a component that manages the inference process. Different schedulers can be used to achieve different results like more detailed images, faster inference, or more accurate results.
The default scheduler is the one that the model was trained with, but you can choose a different one to get different results.
Schedulers are explained in more detail in the Schedulers page.
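As a sketch, the scheduler is selected by name; Euler is used here as an illustrative value (see the Schedulers page for the exact names the API accepts):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "scheduler": "Euler", "steps": 30, "height": 1024, "width": 1024 }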
-
seed
integer Min: 1 Max: 9223372036854776000 Default: Random -
A seed is a value used to randomize the image generation. If you want to make images reproducible (generate the same image multiple times), you can use the same seed value.
When requesting multiple images with the same seed, the seed will be incremented by 1 (+1) for each image generated.
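For example, pinning the seed makes a request reproducible (values are illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "seed": 206684, "height": 1024, "width": 1024, "numberResults": 2 }
With numberResults set to 2, the first image uses seed 206684 and the second uses 206685.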
-
CFGScale
float Min: 0 Max: 50 Default: 7 -
Guidance scale represents how closely the images will resemble the prompt or how much freedom the AI model has. Higher values are closer to the prompt. Low values may reduce the quality of the results.
-
clipSkip
integer Min: 0 Max: 2 -
Defines additional layer skips during prompt processing in the CLIP model. Some models already skip layers by default; this parameter adds extra skips on top of those. Different values affect how your prompt is interpreted, which can lead to variations in the generated image.
-
promptWeighting
string -
Defines the syntax to be used for prompt weighting.
Prompt weighting allows you to adjust how strongly different parts of your prompt influence the generated image. Choose between compel notation with advanced weighting operations or sdEmbeds for simple emphasis adjustments.
View Compel syntax
Adds 0.2 seconds to image inference time and incurs additional costs.
When compel syntax is selected, you can use the following notation in prompts:
Weighting
Syntax: +, -, (word)0.9
Increase or decrease the attention given to specific words or phrases.
Examples:
- Single words: small+ dog, pixar style
- Multiple words: small dog, (pixar style)-
- Multiple symbols for more effect: small+++ dog, pixar style
- Nested weighting: (small+ dog)++, pixar style
- Explicit weight percentage: small dog, (pixar)1.2 style
Blend
Syntax:
.blend()
Merge multiple conditioning prompts.
Example:
("small dog", "robot").blend(1, 0.8)
Conjunction
Syntax:
.and()
Break a prompt into multiple clauses and pass them separately.
Example:
("small dog", "pixar style").and()
View sdEmbeds syntax
When sdEmbeds syntax is selected, you can use the following notation in prompts:
Weighting
Syntax: (text), (text:number), [text]
Use parentheses () to increase attention, square brackets [] to decrease it. Add a number after the text to specify a custom multiplier.
Examples:
- Single words: (small) dog, pixar style
- Multiple words: small dog, [pixar style]
- Higher emphasis: (small:2.5) dog, pixar style
- Combined emphasis: (small dog:1.5), pixar style
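For illustration, a complete request that enables compel weighting (the model identifier is illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "(small+ dog)++, pixar style", "model": "civitai:4201@130090", "promptWeighting": "compel", "height": 1024, "width": 1024, "numberResults": 1 }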
-
numberResults
integer Min: 1 Max: 20 Default: 1 -
The number of images to generate from the specified prompt.
If seed is set, it will be incremented by 1 (+1) for each image generated.
-
advancedFeatures
object -
A container for specialized features that extend the functionality of the image generation process. This object groups advanced capabilities that enhance specific aspects of the generation pipeline.
Properties 1 property
-
advancedFeatures » layerDiffuse
boolean Default: false -
Enables LayerDiffuse technology, which allows for the direct generation of images with transparency (alpha channels).
When enabled, this feature applies the necessary LoRA and VAE components to produce high-quality transparent images without requiring post-processing background removal.
This is particularly useful for creating product images, overlays, composites, and other content that requires transparency. The output must be in a format that supports transparency, such as PNG.
Note: This feature is only available for the FLUX model architecture. It automatically applies the equivalent of:
"lora": [{ "model": "runware:120@2" }], "vae": "runware:120@4"
View example
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "outputFormat": "PNG", "positivePrompt": "a crystal glass on a white background", "height": 1024, "width": 1024, "advancedFeatures": { "layerDiffuse": true }, "model": "runware:101@1" }
-
-
acceleratorOptions
object -
Advanced caching mechanisms to significantly speed up image generation by reducing redundant computation. This object allows you to enable and configure acceleration technologies for your specific model architecture.
These caching methods will not perform well with stochastic schedulers (those with SDE or Ancestral in the name). The random noise added by these schedulers prevents the cache from working effectively. For best results, use deterministic schedulers like Euler or DDIM.
Properties 5 properties
-
acceleratorOptions » teaCache
boolean Default: false -
Enables or disables the TeaCache feature, which accelerates image generation by reusing past computations.
TeaCache is specifically designed for transformer-based models such as Flux and SD 3, and does not work with UNet models like SDXL or SD 1.5.
This feature is particularly effective for iterative image editing and prompt refinement workflows.
View example
"acceleratorOptions": { "teaCache": true, "teaCacheDistance": 0.6 }
-
acceleratorOptions » teaCacheDistance
float Min: 0 Max: 1 Default: 0.5 -
Controls the aggressiveness of the TeaCache feature. Values range from 0.0 (most conservative) to 1.0 (most aggressive).
Lower values prioritize quality by being more selective about which computations to reuse, while higher values prioritize speed by reusing more computations.
Example: A value of 0.1 is very conservative, maintaining high quality with modest speed improvements, while 0.6 is more aggressive, yielding greater speed gains with potential minor quality trade-offs.
-
acceleratorOptions » deepCache
boolean Default: false -
Enables or disables the DeepCache feature, which speeds up diffusion-based image generation by caching internal feature maps from the neural network.
DeepCache is designed for UNet-based models like SDXL and SD 1.5, and is not applicable to transformer-based models like Flux and SD 3.
DeepCache can provide significant performance improvements for high-throughput scenarios or when generating multiple similar images.
View example
"acceleratorOptions": { "deepCache": true, "deepCacheInterval": 3, "deepCacheBranchId": 0 }
-
acceleratorOptions » deepCacheInterval
integer Min: 1 Default: 3 -
Represents the frequency of feature caching, specified as the number of steps between each cache operation.
A larger interval value will make inference faster but may impact quality. A smaller interval prioritizes quality over speed.
-
acceleratorOptions » deepCacheBranchId
integer Min: 0 Default: 0 -
Determines which branch of the network (ordered from the shallowest to the deepest layer) is responsible for executing the caching processes.
Lower branch IDs (e.g., 0) result in more aggressive caching for faster generation, while higher branch IDs produce more conservative caching with potentially higher quality results.
-
-
puLID
object -
PuLID (Pure and Lightning ID Customization) enables fast and high-quality identity customization for text-to-image generation. This object allows you to configure settings for transferring facial characteristics from a reference image to generated images with high fidelity.
View example
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "positivePrompt": "portrait, color, cinematic, in garden, soft light, detailed face", "height": 1024, "width": 1024, "model": "runware:101@1", "puLID": { "inputImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "idWeight": 1, "trueCFGScale": 1.5, "CFGStartStep": 3 } }
Properties 5 properties
-
puLID » inputImages
string[] required Min: 1 Max: 1 -
An array containing the reference image used for identity customization. The reference image provides the facial characteristics that will be preserved and integrated into the generated images.
Currently, only a single image is supported, so the array should contain exactly one element with a clear, high-quality face that will serve as the identity source.
The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
puLID » idWeight
integer Min: 0 Max: 3 Default: 1 -
Controls the strength of identity preservation in the generated image. Higher values create outputs that more closely resemble the facial characteristics of the input image, while lower values allow for more creative interpretation while still maintaining some identity features.
-
puLID » trueCFGScale
float Min: 0 Max: 10 -
Controls the guidance scale specifically for PuLID's identity embedding process. This parameter modifies how closely the generated image follows the identity characteristics from the reference image while balancing prompt adherence.
Higher values result in stronger identity preservation and more faithful reproduction of facial features from the reference image. Lower values allow for more creative interpretation while still maintaining recognizable identity features.
This parameter works in conjunction with the main CFGScale parameter but specifically targets the identity embedding component of the generation process.
-
puLID » CFGStartStep
integer Min: 0 Max: 10 -
Alternative parameters: puLID.startStepPercentageCFG.
Controls when identity features begin to influence the image generation process.
Lower values apply identity features earlier in the generation process, resulting in stronger resemblance to the reference face but with less creative freedom in composition and style. Higher values do the opposite.
For photorealistic images, starting as early as possible typically works best. For stylized images (cartoon, anime, etc.), starting a bit later can provide better results.
-
puLID » CFGStartStepPercentage
integer Min: 0 Max: 100 -
Alternative parameters: puLID.startStepCFG.
Determines at what percentage of the total generation steps the identity features begin to influence the image.
Lower percentages apply identity features earlier in the generation process, creating stronger resemblance to the reference face but with less creative freedom in composition and style. Higher percentages do the opposite.
For photorealistic images, starting as early as possible typically works best. For stylized images (cartoon, anime, etc.), starting a bit later can provide better results.
-
-
acePlusPlus
object -
ACE++ is an advanced framework for character-consistent image generation and editing. It supports two distinct workflows: creating new images guided by a reference image, and editing existing images with precise control over specific regions.
Note: When using the acePlusPlus object, you must set the model parameter to runware:102@1 (FLUX Fill).
The referenceImages parameter is required when using ACE++ and must be specified at the root level of the request, outside of the acePlusPlus object.
View examples
Creation Workflow: Generate new images that maintain the style, identity, or characteristics from a reference image. The model extracts visual features from the reference image and combines them with the text prompt to condition the generation process.
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "positivePrompt": "photo of man wearing a suit", "height": 1024, "width": 1024, "model": "runware:102@1", "referenceImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "acePlusPlus": { "type": "portrait", "repaintingScale": 0.5 } }
Editing Workflow: Modify specific regions of an existing image using guidance from a reference image. Uses an input mask to define the exact area to be edited while preserving the rest of the image unchanged.
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "positivePrompt": "photo of man wearing a white t-shirt", "height": 1024, "width": 1024, "model": "runware:102@1", "referenceImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "acePlusPlus": { "type": "local_editing", "inputImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "inputMasks": ["90422a52-f186-4bf4-a73b-0a46016a8330"], "repaintingScale": 0.7 } }
Properties 4 properties
-
acePlusPlus » type
string required Default: portrait -
Specifies the nature of the image processing task, which determines the appropriate model configuration and LoRA weights to use within the ACE++ framework.
Available task types:
- portrait: Ensures consistency in facial features across different images, maintaining identity and expression. Ideal for generating consistent character appearances in various settings.
- subject: Maintains consistency of specific subjects (objects, logos, etc.) across different scenes or contexts. Perfect for placing logos consistently on various products or backgrounds.
- local_editing: Facilitates localized editing of images, allowing modification of specific regions while preserving the overall structure. Used for targeted edits like changing object colors or altering facial features.
Each task type automatically applies the corresponding specialized LoRA model optimized for that specific use case.
-
-
acePlusPlus » inputImages
string[] Max: 1 -
An array containing the reference image(s) used for character identity. Each input image must contain a single, clear face of the subject.
Currently, only a single image is supported, so the array should contain exactly one element.
This reference image provides the character identity (face, style, etc.) that will be preserved during generation or editing.
The images can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
acePlusPlus » inputMasks
string[] Max: 1 -
An array containing the mask image(s) used for selective editing.
Currently, only a single mask is supported, so if provided, the array should contain exactly one element.
This parameter is used only in editing operations. The mask specifies which areas of the image should be edited based on the prompt, while preserving the rest of the image. The mask image can be specified in the same formats as inputImages.
The mask should be a black and white image where white (255) represents the areas to be edited and black (0) represents the areas to be preserved.
The mask images can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
acePlusPlus » repaintingScale
float Min: 0 Max: 1 Default: 0 -
Controls the balance between preserving the original character identity and following the prompt instructions.
A value of 0.0 gives maximum priority to character identity preservation, while a value of 1.0 gives maximum priority to following the prompt instructions.
For subtle changes while maintaining strong character resemblance, use lower values.
-
-
refiner
object -
Refiner models help create higher quality image outputs by incorporating specialized models designed to enhance image details and overall coherence. This can be particularly useful when you need results with superior quality, photorealism, or specific aesthetic refinements. Note that refiner models are only SDXL based.
The refiner parameter is an object that contains properties defining how the refinement process should be configured. You can find the properties of the refiner object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "refiner": { "model": "string", "startStep": int } }
Properties 3 properties
-
refiner » model
string required -
We make use of the AIR system to identify refiner models. This identifier is a unique string that represents a specific model. Note that refiner models are only SDXL based.
You can find the AIR identifier of the refiner model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
The official SDXL refiner model is civitai:101055@128080.
More information about the AIR system can be found in the Models page.
-
refiner » startStep
integer Min: 2 Max: {steps} -
Alternative parameters: refiner.startStepPercentage.
Represents the step number at which the refinement process begins. The initial model will generate the image up to this step, after which the refiner model takes over to enhance the result.
It can take values from 2 (second step) to the number of steps specified.
-
refiner » startStepPercentage
integer Min: 1 Max: 99 -
Alternative parameters: refiner.startStep.
Represents the percentage of total steps at which the refinement process begins. The initial model will generate the image up to this percentage of steps before the refiner takes over.
It can take values from 1 to 99.
-
-
embeddings
object[] -
Embeddings (or Textual Inversion) can be used to add specific concepts or styles to your generations. Multiple embeddings can be used at the same time.
The embeddings parameter is an array of objects. Each object contains properties that define which embedding model to use. You can find the properties of the embeddings object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "embeddings": [ { "model": "string", }, { "model": "string", } ] }
Array items 2 properties each
-
embeddings[i] » model
string required -
We make use of the AIR system to identify embeddings models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the embeddings model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
-
embeddings[i] » weight
float Min: -4 Max: 4 Default: 1 -
Defines the strength or influence of the embeddings model in the generation process. The value can range from -4 (negative influence) to +4 (maximum influence).
It is possible to use multiple embeddings at the same time.
Example:
"embeddings": [ { "model": "civitai:1044536@1172007", "weight": 1.5 }, { "model": "civitai:993446@1113094", "weight": 0.8 } ]
-
-
controlNet
object[] -
With ControlNet, you can provide a guide image to help the model generate images that align with the desired structure. This guide image can be generated with our ControlNet preprocessing tool, extracting guidance information from an input image. The guide image can be in the form of an edge map, a pose, a depth estimation or any other type of control image that guides the generation process via the ControlNet model.
Multiple ControlNet models can be used at the same time to provide different types of guidance information to the model.
The controlNet parameter is an array of objects. Each object contains properties that define the configuration for a specific ControlNet model. You can find the properties of the ControlNet object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "controlNet": [ { "model": "string", "guideImage": "string", "weight": float, "startStep": int, "endStep": int, "controlMode": "string" }, { "model": "string", "guideImage": "string", "weight": float, "startStep": int, "endStep": int, "controlMode": "string" } ] }
Array items 8 properties each
-
controlNet[i] » model
string required -
For basic/common ControlNet models, you can check the list of available models here.
For custom or specific ControlNet models, we make use of the AIR system to identify ControlNet models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the ControlNet model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
More information about the AIR system can be found in the Models page.
-
controlNet[i] » guideImage
string required -
Specifies the preprocessed image to be used as a guide to control the image generation process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
controlNet[i] » weight
float Min: 0 Max: 1 Default: 1 -
Represents the strength or influence of this ControlNet model in the generation process. A value of 0 means no influence, while 1 means maximum influence.
-
controlNet[i] » startStep
integer Min: 1 Max: {steps} -
Alternative parameters: controlNet.startStepPercentage.
Represents the step number at which the ControlNet model starts to control the inference process.
It can take values from 1 (first step) to the number of steps specified.
-
controlNet[i] » startStepPercentage
integer Min: 0 Max: 99 -
Alternative parameters: controlNet.startStep.
Represents the percentage of steps at which the ControlNet model starts to control the inference process.
It can take values from 0 to 99.
-
controlNet[i] » endStep
integer Min: {startStep + 1} Max: {steps} -
Alternative parameters: controlNet.endStepPercentage.
Represents the step number at which the ControlNet model stops controlling the inference process.
It can take values higher than startStep and less than or equal to the number of steps specified.
-
controlNet[i] » endStepPercentage
integer Min: {startStepPercentage + 1} Max: 100 -
Alternative parameters: controlNet.endStep.
Represents the percentage of steps at which the ControlNet model stops controlling the inference process.
It can take values higher than startStepPercentage and lower than or equal to 100.
-
controlNet[i] » controlMode
string -
This parameter has 3 options: prompt, controlnet, and balanced.
- prompt: Prompt is more important in guiding image generation.
- controlnet: ControlNet is more important in guiding image generation.
- balanced: Balanced operation of prompt and ControlNet.
-
-
lora
object[] -
With LoRA (Low-Rank Adaptation), you can adapt a model to specific styles or features by emphasizing particular aspects of the data. This technique enhances the quality and relevance of the generated images and can be especially useful in scenarios where the generated images need to adhere to a specific artistic style or follow particular guidelines.
Multiple LoRA models can be used at the same time to achieve different adaptation goals.
The lora parameter is an array of objects. Each object contains properties that define the configuration for a specific LoRA model. You can find the properties of the LoRA object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "lora": [ { "model": "string", "weight": float }, { "model": "string", "weight": float } ] }
Array items 2 properties each
-
lora[i] » model
string required -
We make use of the AIR system to identify LoRA models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the LoRA model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
More information about the AIR system can be found in the Models page.
Example: civitai:132942@146296.
-
lora[i] » weight
float Min: -4 Max: 4 Default: 1 -
Defines the strength or influence of the LoRA model in the generation process. The value can range from -4 (negative influence) to +4 (maximum influence).
It is possible to use multiple LoRAs at the same time.
View example
"lora": [ { "model": "runware:13090@1", "weight": 1.5 }, { "model": "runware:6638@1", "weight": 0.8 } ]
-
-
ipAdapters
object[] -
IP-Adapters enable image-prompted generation, allowing you to use reference images to guide the style and content of your generations. Multiple IP Adapters can be used simultaneously.
The ipAdapters parameter is an array of objects. Each object contains properties that define which IP-Adapter model to use and how it should influence the generation. You can find the properties of the IP-Adapter object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "ipAdapters": [ { "model": "string", "guideImage": "string", "weight": "float", }, { "model": "string", "guideImage": "string", "weight": "float", } ] }
Array items 3 properties each
-
ipAdapters[i] » model
string required -
We make use of the AIR system to identify IP-Adapter models. This identifier is a unique string that represents a specific model.
Supported models list:
- runware:55@1: IP Adapter SDXL
- runware:55@2: IP Adapter SDXL Plus
- runware:55@3: IP Adapter SDXL Plus Face
- runware:55@4: IP Adapter SDXL Vit-H
- runware:55@5: IP Adapter SD 1.5
- runware:55@6: IP Adapter SD 1.5 Plus
- runware:55@7: IP Adapter SD 1.5 Light
- runware:55@8: IP Adapter SD 1.5 Plus Face
- runware:55@10: IP Adapter SD 1.5 Vit-G
-
ipAdapters[i] » guideImage
string required -
Specifies the reference image that will guide the generation process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
ipAdapters[i] » weight
float Min: 0 Max: 1 Default: 1 -
Represents the strength or influence of this IP-Adapter in the generation process. A value of 0 means no influence, while 1 means maximum influence.
-
Response
Results will be delivered in the format below. It's possible to receive one or multiple images per message, because images are generated in parallel and generation time varies across nodes and the network.
{
"data": [
{
"taskType": "imageInference",
"taskUUID": "a770f077-f413-47de-9dac-be0b26a35da6",
"imageUUID": "77da2d99-a6d3-44d9-b8c0-ae9fb06b6200",
"imageURL": "https://im.runware.ai/image/ws/0.5/ii/a770f077-f413-47de-9dac-be0b26a35da6.jpg",
"cost": 0.0013
}
]
}
-
taskType
string -
The API will return the taskType you sent in the request. In this case, it will be imageInference. This helps match the responses to the correct task type.
-
taskUUID
string UUID v4 -
The API will return the taskUUID you sent in the request. This way you can match the responses to the correct request tasks.
-
imageUUID
string UUID v4 -
The unique identifier of the image.
-
imageURL
string -
If outputType is set to URL, this parameter contains the URL of the image to be downloaded.
-
imageBase64Data
string -
If outputType is set to base64Data, this parameter contains the base64-encoded image data.
-
imageDataURI
string -
If outputType is set to dataURI, this parameter contains the data URI of the image.
-
seed
integer -
The seed value that was used to generate this image. This value can be used to reproduce the same image when using identical parameters in another request.
-
NSFWContent
boolean -
If the checkNSFW parameter is used, NSFWContent is included, informing if the image has been flagged as potentially sensitive content.
- true indicates the image has been flagged (is a sensitive image).
- false indicates the image has not been flagged.
The filter occasionally returns false positives and very rarely false negatives.
-
cost
float -
If includeCost is set to true, the response will include a cost field for each task object. This field indicates the cost of the request in USD.