Image Inference API
Introduction
Image inference is a powerful feature that allows you to generate images from text prompts or transform existing images according to your needs. This process is essential for creating high-quality visuals, whether you're looking to bring creative ideas to life or enhance existing images with new styles or subjects.
There are several types of image inference requests you can make using our API:
- Text-to-image: Generate images from descriptive text prompts. This process translates your text into high-quality visuals, allowing you to create detailed and vivid images based on your ideas.
- Image-to-image: Perform transformations on existing images, whether they are previously generated images or uploaded images. This process enables you to enhance, modify, or stylize images to create new and visually appealing content. With a single parameter you can control the strength of the transformation.
- Inpainting: Replace parts of an image with new content, allowing you to remove unwanted elements or improve the overall composition of an image. It's like image-to-image but with a mask that defines the area to be transformed.
- Outpainting: Extend the boundaries of an image by generating new content outside the original frame that seamlessly blends with the existing image. Like inpainting, it uses a mask to define the new area to be generated.
Our API also supports advanced features that allow developers to fine-tune the image generation process with precision:
- ControlNet: A feature that enables precise control over image generation by using additional input conditions, such as edge maps, poses, or segmentation masks. This allows for more accurate alignment with specific user requirements or styles.
- LoRA: A technique that helps in adapting models to specific styles or tasks by focusing on particular aspects of the data, enhancing the quality and relevance of the generated images.
Additionally, you can tweak numerous parameters to customize the output, such as adjusting the image dimension, steps, scheduler to use, and other generation settings, providing a high level of flexibility to suit your application's needs.
Our API is extremely fast thanks to unique optimizations, custom-designed hardware, and many other components that make up our Sonic Inference Engine®.
Request
Our API always accepts an array of objects as input, where each object represents a specific task to be performed. The structure of the object varies depending on the type of the task. For this section, we will focus on the parameters related to image inference tasks.
The following JSON snippets show the basic structure of a request object for the different task variations (text-to-image, image-to-image, inpainting, outpainting, refiner, embeddings, ControlNet, LoRA, and IP-Adapters). All properties are explained in detail in the next section.
[
{
"taskType": "imageInference",
"taskUUID": "string",
"outputType": "string",
"outputFormat": "string",
"positivePrompt": "string",
"negativePrompt": "string",
"height": int,
"width": int,
"model": "string",
"steps": int,
"CFGScale": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"seedImage": "string",
"model": "string",
"height": int,
"width": int,
"strength": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"seedImage": "string",
"maskImage": "string",
"model": "string",
"height": int,
"width": int,
"strength": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"seedImage": "string",
"outpaint": {
top: 64,
bottom: 64
},
"model": "string",
"height": int,
"width": int,
"strength": float,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"refiner": {
"model": "string",
"startStep": int
},
"height": int,
"width": int,
"numberResults": int
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"embeddings": [
{
"model": "string",
"weight": float
},
{
"model": "string",
"weight": float
}
]
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"controlNet": [
{
"model": "string",
"guideImage": "string",
"weight": float,
"startStep": int,
"endStep": int,
"controlMode": "string"
},
{
"model": "string",
"guideImage": "string",
"weight": float,
"startStep": int,
"endStep": int,
"controlMode": "string"
}
]
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"lora": [
{
"model": "string",
"weight": float
},
{
"model": "string",
"weight": float
}
]
}
]
[
{
"taskType": "imageInference",
"taskUUID": "string",
"positivePrompt": "string",
"model": "string",
"height": int,
"width": int,
"numberResults": int,
"ipAdapters": [
{
"model": "string",
"guideImage": "string",
"weight": float
},
{
"model": "string",
"guideImage": "string",
"weight": float
}
]
}
]
You can mix multiple ControlNet and LoRA objects in the same request to achieve more complex control over the generation process.
-
taskType
string required -
The type of task to be performed. For this task, the value should be imageInference.
-
taskUUID
string required UUID v4 -
When a task is sent to the API you must include a random UUID v4 string using the taskUUID parameter. This string is used to match the async responses to their corresponding tasks. If you send multiple tasks at the same time, the taskUUID will help you match the responses to the correct tasks. The taskUUID must be unique for each task you send to the API.
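For illustration, here is a single request carrying two tasks, each with its own taskUUID (prompts, UUIDs, and the model identifier are illustrative):
[ { "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "height": 512, "width": 512, "numberResults": 1 }, { "taskType": "imageInference", "taskUUID": "a770f077-f413-47de-9dac-be0b26a35da6", "positivePrompt": "a castle on a hill at sunset", "model": "civitai:4201@130090", "height": 512, "width": 512, "numberResults": 1 } ]
Each response object echoes the taskUUID of the task that produced it, so results can be matched to their originating tasks.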
-
outputType
"base64Data" | "dataURI" | "URL" Default: URL -
Specifies the output type in which the image is returned. Supported values are: dataURI, URL, and base64Data.
- base64Data: The image is returned as a base64-encoded string using the imageBase64Data parameter in the response object.
- dataURI: The image is returned as a data URI string using the imageDataURI parameter in the response object.
- URL: The image is returned as a URL string using the imageURL parameter in the response object.
-
outputFormat
"JPG" | "PNG" | "WEBP" Default: JPG -
Specifies the format of the output image. Supported formats are: PNG, JPG, and WEBP.
-
outputQuality
integer Min: 20 Max: 99 Default: 95 -
Sets the compression quality of the output image. Higher values preserve more quality but increase file size, lower values reduce file size but decrease quality.
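For example, a request that asks for a compressed WEBP returned as a URL might combine the output settings like this (prompt and model identifier are illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "height": 1024, "width": 1024, "outputType": "URL", "outputFormat": "WEBP", "outputQuality": 85 }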
-
uploadEndpoint
string -
This parameter allows you to specify a URL to which the generated image will be uploaded as binary image data using the HTTP PUT method. For example, an S3 bucket URL can be used as the upload endpoint.
When the image is ready, it will be uploaded to the specified URL.
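As a sketch, assuming a pre-signed object storage URL (the URL below is a placeholder, not a real endpoint):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "height": 1024, "width": 1024, "outputFormat": "JPG", "uploadEndpoint": "https://my-bucket.s3.amazonaws.com/results/image.jpg?X-Amz-Signature=..." }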
-
checkNSFW
boolean Default: false -
This parameter is used to enable or disable the NSFW check. When enabled, the API will check if the image contains NSFW (not safe for work) content. This check is done using a pre-trained model that detects adult content in images.
When the check is enabled, the API will return NSFWContent: true in the response object if the image is flagged as potentially sensitive content. If the image is not flagged, the API will return NSFWContent: false. If this parameter is not used, the NSFWContent parameter will not be included in the response object.
Adds 0.1 seconds to image inference time and incurs additional costs.
The NSFW filter occasionally returns false positives and very rarely false negatives.
-
includeCost
boolean Default: false -
If set to true, the cost to perform the task will be included in the response object.
-
positivePrompt
string required -
A positive prompt is a text instruction to guide the model on generating the image. It is usually a sentence or a paragraph that provides positive guidance for the task. This parameter is essential to shape the desired results.
For example, if the positive prompt is "dragon drinking coffee", the model will generate an image of a dragon drinking coffee. The more detailed the prompt, the more accurate the results.
If you wish to generate an image without any prompt guidance, you can use the special token __BLANK__. This tells the system to generate an image without text-based instructions.
The length of the prompt must be between 2 and 3000 characters.
-
negativePrompt
string -
A negative prompt is a text instruction to guide the model on generating the image. It is usually a sentence or a paragraph that provides negative guidance for the task. This parameter helps to avoid certain undesired results.
For example, if the negative prompt is "red dragon, cup", the model will follow the positive prompt but will avoid generating an image of a red dragon or including a cup. The more detailed the prompt, the more accurate the results.
The length of the prompt must be between 2 and 3000 characters.
-
seedImage
string required -
When doing image-to-image, inpainting or outpainting, this parameter is required.
Specifies the seed image to be used for the diffusion process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
maskImage
string required -
When doing inpainting, this parameter is required.
Specifies the mask image to be used for the inpainting process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
maskMargin
integer Min: 32 Max: 128 -
Adds extra context pixels around the masked region during inpainting. When this parameter is present, the model will zoom into the masked area, considering these additional pixels to create more coherent and well-integrated details.
This parameter is particularly effective when used with masks generated by the Image Masking API, enabling enhanced detail generation while maintaining natural integration with the surrounding image.
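For illustration, an inpainting request that adds extra context around the masked area might look like this (model and image UUIDs are placeholders):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a golden crown", "model": "civitai:4201@130090", "seedImage": "59a2edc2-45e6-429f-be5f-7ded59b92046", "maskImage": "90422a52-f186-4bf4-a73b-0a46016a8330", "maskMargin": 64, "height": 1024, "width": 1024, "strength": 0.8, "numberResults": 1 }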
-
strength
float Min: 0 Max: 1 Default: 0.8 -
When doing image-to-image or inpainting, this parameter is used to determine the influence of the seedImage in the generated output. A lower value results in more influence from the original image, while a higher value allows more creative deviation.
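A minimal image-to-image sketch, assuming the seed image was uploaded beforehand (identifiers are illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "watercolor painting of a city at dusk", "model": "civitai:4201@130090", "seedImage": "59a2edc2-45e6-429f-be5f-7ded59b92046", "strength": 0.6, "height": 1024, "width": 1024, "numberResults": 1 }
At 0.6 the output keeps the overall composition of the seed image while allowing noticeable stylistic change; values closer to 0 preserve more of the original.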
-
referenceImages
string[] -
An array containing reference images used to condition the generation process. These images provide visual guidance to help the model generate content that aligns with the style, composition, or characteristics of the reference materials.
This parameter is particularly useful with edit models like FLUX.1 Kontext, where reference images can guide the generation toward specific visual attributes or maintain consistency with existing content. Each image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
View model compatibility
Model architecture: maximum number of reference images
- FLUX.1 Kontext: 1
- Ace++: 1
- Other models: Not supported
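A sketch of a request conditioned on a reference image; replace the model placeholder with the AIR identifier of a compatible edit model (for example FLUX.1 Kontext) found in the Model Explorer:
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "place the subject in a snowy forest", "model": "string", "referenceImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "height": 1024, "width": 1024, "numberResults": 1 }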
-
outpaint
object -
Extends the image boundaries in specified directions. When using outpaint, you must provide the final dimensions using the width and height parameters, which should account for the original image size plus the total extension (seedImage dimensions + top + bottom, left + right).
View example
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a beautiful landscape with mountains and trees", "negativePrompt": "blurry, bad quality", "seedImage": "59a2edc2-45e6-429f-be5f-7ded59b92046", "model": "civitai:4201@130090", "height": 1024, "width": 768, "steps": 20, "strength": 0.7, "outpaint": { "top": 256, "right": 128, "bottom": 256, "left": 128, "blur": 16 } }
Properties 5 properties
-
outpaint » top
integer Min: 0 -
Number of pixels to extend at the top of the image. Must be a multiple of 64.
-
outpaint » right
integer Min: 0 -
Number of pixels to extend at the right side of the image. Must be a multiple of 64.
-
outpaint » bottom
integer Min: 0 -
Number of pixels to extend at the bottom of the image. Must be a multiple of 64.
-
outpaint » left
integer Min: 0 -
Number of pixels to extend at the left side of the image. Must be a multiple of 64.
-
outpaint » blur
integer Min: 0 Max: 32 Default: 0 -
The amount of blur to apply at the boundaries between the original image and the extended areas, measured in pixels.
-
-
height
integer required Min: 128 Max: 2048 -
Used to define the height dimension of the generated image. Certain models perform better with specific dimensions.
The value must be divisible by 64, e.g. 128...512, 576, 640...2048.
-
width
integer required Min: 128 Max: 2048 -
Used to define the width dimension of the generated image. Certain models perform better with specific dimensions.
The value must be divisible by 64, e.g. 128...512, 576, 640...2048.
-
model
string required -
We make use of the AIR (Artificial Intelligence Resource) system to identify models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
-
vae
string -
We make use of the AIR (Artificial Intelligence Resource) system to identify VAE models. This identifier is a unique string that represents a specific model.
The VAE (Variational Autoencoder) can be specified to override the default one included with the base model, which can help improve the quality of generated images.
You can find the AIR identifier of the VAE model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
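For illustration, overriding the default VAE only requires the extra field; both AIR identifiers below are placeholders to be looked up in the Model Explorer:
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "string", "vae": "string", "height": 1024, "width": 1024, "numberResults": 1 }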
-
steps
integer Min: 1 Max: 100 Default: 20 -
The number of steps is the number of iterations the model will perform to generate the image. The higher the number of steps, the more detailed the image will be. However, increasing the number of steps will also increase the time it takes to generate the image and may not always result in a better image (some schedulers work differently).
When using your own models you can specify a new default value for the number of steps.
-
scheduler
string Default: Model's scheduler -
A scheduler is a component that manages the inference process. Different schedulers can be used to achieve different results like more detailed images, faster inference, or more accurate results.
The default scheduler is the one that the model was trained with, but you can choose a different one to get different results.
Schedulers are explained in more detail in the Schedulers page.
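As a sketch, the scheduler is selected by name; Euler is used here as an illustrative value (see the Schedulers page for the exact names the API accepts):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "scheduler": "Euler", "steps": 30, "height": 1024, "width": 1024 }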
-
seed
integer Min: 1 Max: 9223372036854776000 Default: Random -
A seed is a value used to randomize the image generation. If you want to make images reproducible (generate the same image multiple times), you can use the same seed value.
When requesting multiple images with the same seed, the seed will be incremented by 1 (+1) for each image generated.
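For example, pinning the seed makes a request reproducible (values are illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "a red fox in the snow", "model": "civitai:4201@130090", "seed": 206684, "height": 1024, "width": 1024, "numberResults": 2 }
With numberResults set to 2, the first image uses seed 206684 and the second uses 206685.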
-
CFGScale
float Min: 0 Max: 50 Default: 7 -
Guidance scale represents how closely the images will resemble the prompt or how much freedom the AI model has. Higher values are closer to the prompt. Low values may reduce the quality of the results.
-
clipSkip
integer Min: 0 Max: 2 -
Defines additional layer skips during prompt processing in the CLIP model. Some models already skip layers by default; this parameter adds extra skips on top of those. Different values affect how your prompt is interpreted, which can lead to variations in the generated image.
-
promptWeighting
string -
Defines the syntax to be used for prompt weighting.
Prompt weighting allows you to adjust how strongly different parts of your prompt influence the generated image. Choose between compel notation with advanced weighting operations or sdEmbeds for simple emphasis adjustments.
View Compel syntax
Adds 0.2 seconds to image inference time and incurs additional costs.
When compel syntax is selected, you can use the following notation in prompts:
Weighting
Syntax: +, -, (word)0.9
Increase or decrease the attention given to specific words or phrases.
Examples:
- Single words: small+ dog, pixar style
- Multiple words: small dog, (pixar style)-
- Multiple symbols for more effect: small+++ dog, pixar style
- Nested weighting: (small+ dog)++, pixar style
- Explicit weight percentage: small dog, (pixar)1.2 style
Blend
Syntax:
.blend()
Merge multiple conditioning prompts.
Example:
("small dog", "robot").blend(1, 0.8)
Conjunction
Syntax:
.and()
Break a prompt into multiple clauses and pass them separately.
Example:
("small dog", "pixar style").and()
View sdEmbeds syntax
When sdEmbeds syntax is selected, you can use the following notation in prompts:
Weighting
Syntax: (text), (text:number), [text]
Use parentheses () to increase attention, square brackets [] to decrease it. Add a number after the text to specify a custom multiplier.
Examples:
- Single words: (small) dog, pixar style
- Multiple words: small dog, [pixar style]
- Higher emphasis: (small:2.5) dog, pixar style
- Combined emphasis: (small dog:1.5), pixar style
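For illustration, a complete request that enables compel weighting (the model identifier is illustrative):
{ "taskType": "imageInference", "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4959", "positivePrompt": "(small+ dog)++, pixar style", "model": "civitai:4201@130090", "promptWeighting": "compel", "height": 1024, "width": 1024, "numberResults": 1 }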
-
numberResults
integer Min: 1 Max: 20 Default: 1 -
The number of images to generate from the specified prompt.
If seed is set, it will be incremented by 1 (+1) for each image generated.
-
advancedFeatures
object -
A container for specialized features that extend the functionality of the image generation process. This object groups advanced capabilities that enhance specific aspects of the generation pipeline.
Properties 1 property
-
advancedFeatures » layerDiffuse
boolean Default: false -
Enables LayerDiffuse technology, which allows for the direct generation of images with transparency (alpha channels).
When enabled, this feature applies the necessary LoRA and VAE components to produce high-quality transparent images without requiring post-processing background removal.
This is particularly useful for creating product images, overlays, composites, and other content that requires transparency. The output must be in a format that supports transparency, such as PNG.
Note: This feature is only available for the FLUX model architecture. It automatically applies the equivalent of:
"lora": [{ "model": "runware:120@2" }], "vae": "runware:120@4"
View example
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "outputFormat": "PNG", "positivePrompt": "a crystal glass on a white background", "height": 1024, "width": 1024, "advancedFeatures": { "layerDiffuse": true }, "model": "runware:101@1" }
-
-
acceleratorOptions
object -
Advanced caching mechanisms to significantly speed up image generation by reducing redundant computation. This object allows you to enable and configure acceleration technologies for your specific model architecture.
These caching methods will not perform well with stochastic schedulers (those with SDE or Ancestral in the name). The random noise added by these schedulers prevents the cache from working effectively. For best results, use deterministic schedulers like Euler or DDIM.
Properties 5 properties
-
acceleratorOptions » teaCache
boolean Default: false -
Enables or disables the TeaCache feature, which accelerates image generation by reusing past computations.
TeaCache is specifically designed for transformer-based models such as Flux and SD 3, and does not work with UNet models like SDXL or SD 1.5.
This feature is particularly effective for iterative image editing and prompt refinement workflows.
View example
"acceleratorOptions": { "teaCache": true, "teaCacheDistance": 0.6 }
-
acceleratorOptions » teaCacheDistance
float Min: 0 Max: 1 Default: 0.5 -
Controls the aggressiveness of the TeaCache feature. Values range from 0.0 (most conservative) to 1.0 (most aggressive).
Lower values prioritize quality by being more selective about which computations to reuse, while higher values prioritize speed by reusing more computations.
Example: A value of 0.1 is very conservative, maintaining high quality with modest speed improvements, while 0.6 is more aggressive, yielding greater speed gains with potential minor quality trade-offs.
-
acceleratorOptions » deepCache
boolean Default: false -
Enables or disables the DeepCache feature, which speeds up diffusion-based image generation by caching internal feature maps from the neural network.
DeepCache is designed for UNet-based models like SDXL and SD 1.5, and is not applicable to transformer-based models like Flux and SD 3.
DeepCache can provide significant performance improvements for high-throughput scenarios or when generating multiple similar images.
View example
"acceleratorOptions": { "deepCache": true, "deepCacheInterval": 3, "deepCacheBranchId": 0 }
-
acceleratorOptions » deepCacheInterval
integer Min: 1 Default: 3 -
Represents the frequency of feature caching, specified as the number of steps between each cache operation.
A larger interval value will make inference faster but may impact quality. A smaller interval prioritizes quality over speed.
-
acceleratorOptions » deepCacheBranchId
integer Min: 0 Default: 0 -
Determines which branch of the network (ordered from the shallowest to the deepest layer) is responsible for executing the caching processes.
Lower branch IDs (e.g., 0) result in more aggressive caching for faster generation, while higher branch IDs produce more conservative caching with potentially higher quality results.
-
-
puLID
object -
PuLID (Pure and Lightning ID Customization) enables fast and high-quality identity customization for text-to-image generation. This object allows you to configure settings for transferring facial characteristics from a reference image to generated images with high fidelity.
View example
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "positivePrompt": "portrait, color, cinematic, in garden, soft light, detailed face", "height": 1024, "width": 1024, "model": "runware:101@1", "puLID": { "inputImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "idWeight": 1, "trueCFGScale": 1.5, "CFGStartStep": 3 } }
Properties 5 properties
-
puLID » inputImages
string[] required Min: 1 Max: 1 -
An array containing the reference image used for identity customization. The reference image provides the facial characteristics that will be preserved and integrated into the generated images.
Currently, only a single image is supported, so the array should contain exactly one element with a clear, high-quality face that will serve as the identity source.
The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
puLID » idWeight
integer Min: 0 Max: 3 Default: 1 -
Controls the strength of identity preservation in the generated image. Higher values create outputs that more closely resemble the facial characteristics of the input image, while lower values allow for more creative interpretation while still maintaining some identity features.
-
puLID » trueCFGScale
float Min: 0 Max: 10 -
Controls the guidance scale specifically for PuLID's identity embedding process. This parameter modifies how closely the generated image follows the identity characteristics from the reference image while balancing prompt adherence.
Higher values result in stronger identity preservation and more faithful reproduction of facial features from the reference image. Lower values allow for more creative interpretation while still maintaining recognizable identity features.
This parameter works in conjunction with the main CFGScale parameter but specifically targets the identity embedding component of the generation process.
-
puLID » CFGStartStep
integer Min: 0 Max: 10 -
Alternative parameters: puLID.startStepPercentageCFG.
Controls when identity features begin to influence the image generation process.
Lower values apply identity features earlier in the generation process, resulting in stronger resemblance to the reference face but with less creative freedom in composition and style. Higher values do the opposite.
For photorealistic images, starting as early as possible typically works best. For stylized images (cartoon, anime, etc.), starting a bit later can provide better results.
-
puLID » CFGStartStepPercentage
integer Min: 0 Max: 100 -
Alternative parameters: puLID.startStepCFG.
Determines at what percentage of the total generation steps the identity features begin to influence the image.
Lower percentages apply identity features earlier in the generation process, creating stronger resemblance to the reference face but with less creative freedom in composition and style. Higher percentages do the opposite.
For photorealistic images, starting as early as possible typically works best. For stylized images (cartoon, anime, etc.), starting a bit later can provide better results.
-
-
acePlusPlus
object -
ACE++ is an advanced framework for character-consistent image generation and editing. It supports two distinct workflows: creating new images guided by a reference image, and editing existing images with precise control over specific regions.
Note: When using the acePlusPlus object, you must set the model parameter to runware:102@1 (FLUX Fill).
The referenceImages parameter is required when using ACE++ and must be specified at the root level of the request, outside of the acePlusPlus object.
View examples
Creation Workflow: Generate new images that maintain the style, identity, or characteristics from a reference image. The model extracts visual features from the reference image and combines them with the text prompt to condition the generation process.
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "positivePrompt": "photo of man wearing a suit", "height": 1024, "width": 1024, "model": "runware:102@1", "referenceImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "acePlusPlus": { "type": "portrait", "repaintingScale": 0.5 } }
Editing Workflow: Modify specific regions of an existing image using guidance from a reference image. Uses an input mask to define the exact area to be edited while preserving the rest of the image unchanged.
{ "taskType": "imageInference", "taskUUID": "991e641a-d2a8-4aa3-9883-9d6fe230fff8", "positivePrompt": "photo of man wearing a white t-shirt", "height": 1024, "width": 1024, "model": "runware:102@1", "referenceImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "acePlusPlus": { "type": "local_editing", "inputImages": ["59a2edc2-45e6-429f-be5f-7ded59b92046"], "inputMasks": ["90422a52-f186-4bf4-a73b-0a46016a8330"], "repaintingScale": 0.7 } }
Properties 4 properties
-
acePlusPlus » type
string required Default: portrait -
Specifies the nature of the image processing task, which determines the appropriate model configuration and LoRA weights to use within the ACE++ framework.
Available task types:
- portrait: Ensures consistency in facial features across different images, maintaining identity and expression. Ideal for generating consistent character appearances in various settings.
- subject: Maintains consistency of specific subjects (objects, logos, etc.) across different scenes or contexts. Perfect for placing logos consistently on various products or backgrounds.
- local_editing: Facilitates localized editing of images, allowing modification of specific regions while preserving the overall structure. Used for targeted edits like changing object colors or altering facial features.
Each task type automatically applies the corresponding specialized LoRA model optimized for that specific use case.
-
-
acePlusPlus » inputImages
string[] Max: 1 -
An array containing the reference image(s) used for character identity. Each input image must contain a single, clear face of the subject.
Currently, only a single image is supported, so the array should contain exactly one element.
This reference image provides the character identity (face, style, etc.) that will be preserved during generation or editing.
The images can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
acePlusPlus » inputMasks
string[] Max: 1 -
An array containing the mask image(s) used for selective editing.
Currently, only a single mask is supported, so if provided, the array should contain exactly one element.
This parameter is used only in editing operations. The mask specifies which areas of the image should be edited based on the prompt, while preserving the rest of the image. The mask image can be specified in the same formats as inputImages.
The mask should be a black and white image where white (255) represents the areas to be edited and black (0) represents the areas to be preserved.
The mask images can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
acePlusPlus » repaintingScale
float Min: 0 Max: 1 Default: 0 -
Controls the balance between preserving the original character identity and following the prompt instructions.
A value of 0.0 gives maximum priority to character identity preservation, while a value of 1.0 gives maximum priority to following the prompt instructions.
For subtle changes while maintaining strong character resemblance, use lower values.
-
-
refiner
object -
Refiner models help create higher quality image outputs by incorporating specialized models designed to enhance image details and overall coherence. This can be particularly useful when you need results with superior quality, photorealism, or specific aesthetic refinements. Note that refiner models are only SDXL based.
The refiner parameter is an object that contains properties defining how the refinement process should be configured. You can find the properties of the refiner object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "refiner": { "model": "string", "startStep": int } }
Properties 3 properties
-
refiner » model
string required -
We make use of the AIR system to identify refiner models. This identifier is a unique string that represents a specific model. Note that refiner models are only SDXL based.
You can find the AIR identifier of the refiner model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
The official SDXL refiner model is civitai:101055@128080.
More information about the AIR system can be found in the Models page.
-
refiner » startStep
integer Min: 2 Max: {steps} -
Alternative parameters: refiner.startStepPercentage.
Represents the step number at which the refinement process begins. The initial model will generate the image up to this step, after which the refiner model takes over to enhance the result.
It can take values from 2 (second step) to the number of steps specified.
-
refiner » startStepPercentage
integer Min: 1 Max: 99 -
Alternative parameters: refiner.startStep.
Represents the percentage of total steps at which the refinement process begins. The initial model will generate the image up to this percentage of steps before the refiner takes over.
It can take values from 1 to 99.
-
-
embeddings
object[] -
Embeddings (or Textual Inversion) can be used to add specific concepts or styles to your generations. Multiple embeddings can be used at the same time.
The embeddings parameter is an array of objects. Each object contains properties that define which embedding model to use. You can find the properties of the embeddings object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "embeddings": [ { "model": "string", }, { "model": "string", } ] }
Array items 2 properties each
-
embeddings[i] » model
string required -
We make use of the AIR system to identify embeddings models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the embeddings model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
-
embeddings[i] » weight
float Min: -4 Max: 4 Default: 1 -
Defines the strength or influence of the embeddings model in the generation process. The value can range from -4 (negative influence) to +4 (maximum influence).
It is possible to use multiple embeddings at the same time.
Example:
"embeddings": [ { "model": "civitai:1044536@1172007", "weight": 1.5 }, { "model": "civitai:993446@1113094", "weight": 0.8 } ]
-
-
controlNet
object[] -
With ControlNet, you can provide a guide image to help the model generate images that align with the desired structure. This guide image can be generated with our ControlNet preprocessing tool, extracting guidance information from an input image. The guide image can be in the form of an edge map, a pose, a depth estimation or any other type of control image that guides the generation process via the ControlNet model.
Multiple ControlNet models can be used at the same time to provide different types of guidance information to the model.
The controlNet parameter is an array of objects. Each object contains properties that define the configuration for a specific ControlNet model. You can find the properties of the ControlNet object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "controlNet": [ { "model": "string", "guideImage": "string", "weight": float, "startStep": int, "endStep": int, "controlMode": "string" }, { "model": "string", "guideImage": "string", "weight": float, "startStep": int, "endStep": int, "controlMode": "string" } ] }
Array items 8 properties each
-
controlNet[i] » model
string required -
For basic/common ControlNet models, you can check the list of available models here.
For custom or specific ControlNet models, we make use of the AIR system to identify ControlNet models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the ControlNet model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
More information about the AIR system can be found in the Models page.
-
controlNet[i] » guideImage
string required -
Specifies the preprocessed image to be used as a guide to control the image generation process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
controlNet[i] » weight
float Min: 0 Max: 1 Default: 1 -
Represents the strength or influence of this ControlNet model in the generation process. A value of 0 means no influence, while 1 means maximum influence.
-
controlNet[i] » startStep
integer Min: 1 Max: {steps} -
Alternative parameters: controlNet.startStepPercentage.
Represents the step number at which the ControlNet model starts to control the inference process.
It can take values from 1 (first step) to the number of steps specified.
-
controlNet[i] » startStepPercentage
integer Min: 0 Max: 99 -
Alternative parameters: controlNet.startStep.
Represents the percentage of steps at which the ControlNet model starts to control the inference process.
It can take values from 0 to 99.
-
controlNet[i] » endStep
integer Min: {startStep + 1} Max: {steps} -
Alternative parameters: controlNet.endStepPercentage.
Represents the step number at which the ControlNet model stops controlling the inference process.
It can take values higher than startStep and less than or equal to the number of steps specified.
-
controlNet[i] » endStepPercentage
integer Min: {startStepPercentage + 1} Max: 100 -
Alternative parameters: controlNet.endStep.
Represents the percentage of steps at which the ControlNet model stops controlling the inference process.
It can take values higher than startStepPercentage and lower than or equal to 100.
-
controlNet[i] » controlMode
string -
This parameter has 3 options: prompt, controlnet, and balanced.
- prompt: Prompt is more important in guiding image generation.
- controlnet: ControlNet is more important in guiding image generation.
- balanced: Balanced operation of prompt and ControlNet.
-
-
lora
object[] -
With LoRA (Low-Rank Adaptation), you can adapt a model to specific styles or features by emphasizing particular aspects of the data. This technique enhances the quality and relevance of the generated images and can be especially useful in scenarios where the generated images need to adhere to a specific artistic style or follow particular guidelines.
Multiple LoRA models can be used at the same time to achieve different adaptation goals.
The lora parameter is an array of objects. Each object contains properties that define the configuration for a specific LoRA model. You can find the properties of the LoRA object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "lora": [ { "model": "string", "weight": float }, { "model": "string", "weight": float } ] }
Array items 2 properties each
-
lora[i] » model
string required -
We make use of the AIR system to identify LoRA models. This identifier is a unique string that represents a specific model.
You can find the AIR identifier of the LoRA model you want to use in our Model Explorer, which is a tool that allows you to search for models based on their characteristics.
More information about the AIR system can be found in the Models page.
Example: civitai:132942@146296.
-
lora[i] » weight
float Min: -4 Max: 4 Default: 1 -
Defines the strength or influence of the LoRA model in the generation process. The value can range from -4 (negative influence) to +4 (maximum influence).
It is possible to use multiple LoRAs at the same time.
View example
"lora": [ { "model": "runware:13090@1", "weight": 1.5 }, { "model": "runware:6638@1", "weight": 0.8 } ]
-
-
ipAdapters
object[] -
IP-Adapters enable image-prompted generation, allowing you to use reference images to guide the style and content of your generations. Multiple IP Adapters can be used simultaneously.
The ipAdapters parameter is an array of objects. Each object contains properties that define which IP-Adapter model to use and how it should influence the generation. You can find the properties of the IP-Adapter object below.
View example
{ "taskType": "imageInference", "taskUUID": "string", "positivePrompt": "string", "model": "string", "height": int, "width": int, "numberResults": int, "ipAdapters": [ { "model": "string", "guideImage": "string", "weight": "float", }, { "model": "string", "guideImage": "string", "weight": "float", } ] }
Array items 3 properties each
-
ipAdapters[i] » model
string required -
We make use of the AIR system to identify IP-Adapter models. This identifier is a unique string that represents a specific model.
Supported models list:
- runware:55@1: IP Adapter SDXL
- runware:55@2: IP Adapter SDXL Plus
- runware:55@3: IP Adapter SDXL Plus Face
- runware:55@4: IP Adapter SDXL Vit-H
- runware:55@5: IP Adapter SD 1.5
- runware:55@6: IP Adapter SD 1.5 Plus
- runware:55@7: IP Adapter SD 1.5 Light
- runware:55@8: IP Adapter SD 1.5 Plus Face
- runware:55@10: IP Adapter SD 1.5 Vit-G
-
ipAdapters[i] » guideImage
string required -
Specifies the reference image that will guide the generation process. The image can be specified in one of the following formats:
- A UUID v4 string of a previously uploaded image or a generated image.
- A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo...
- A base64-encoded image without the data URI prefix. For example: iVBORw0KGgo...
- A URL pointing to the image. The image must be accessible publicly.
Supported formats are: PNG, JPG and WEBP.
-
ipAdapters[i] » weight
float Min: 0 Max: 1 Default: 1 -
Represents the strength or influence of this IP-Adapter in the generation process. A value of 0 means no influence, while 1 means maximum influence.
-
Response
Results will be delivered in the format below. It's possible to receive one or multiple images per message, because images are generated in parallel and generation time varies across nodes and the network.
{
"data": [
{
"taskType": "imageInference",
"taskUUID": "a770f077-f413-47de-9dac-be0b26a35da6",
"imageUUID": "77da2d99-a6d3-44d9-b8c0-ae9fb06b6200",
"imageURL": "https://im.runware.ai/image/ws/0.5/ii/a770f077-f413-47de-9dac-be0b26a35da6.jpg",
"cost": 0.0013
}
]
}
-
taskType
string -
The API will return the taskType you sent in the request. In this case, it will be imageInference. This helps match the responses to the correct task type.
-
taskUUID
string UUID v4 -
The API will return the taskUUID you sent in the request. This way you can match the responses to the correct request tasks.
-
imageUUID
string UUID v4 -
The unique identifier of the image.
-
imageURL
string -
If outputType is set to URL, this parameter contains the URL of the image to be downloaded.
-
imageBase64Data
string -
If outputType is set to base64Data, this parameter contains the base64-encoded image data.
-
imageDataURI
string -
If outputType is set to dataURI, this parameter contains the data URI of the image.
-
seed
integer -
The seed value that was used to generate this image. This value can be used to reproduce the same image when using identical parameters in another request.
-
NSFWContent
boolean -
If the checkNSFW parameter is used, NSFWContent is included, informing if the image has been flagged as potentially sensitive content.
- true indicates the image has been flagged (is a sensitive image).
- false indicates the image has not been flagged.
The filter occasionally returns false positives and very rarely false negatives.
-
cost
float -
If includeCost is set to true, the response will include a cost field for each task object. This field indicates the cost of the request in USD.