Image Masking API

Generate precise masks automatically for faces, hands, and people using AI detection. Enhance your inpainting workflow with smart, automated masking features.

Introduction

The Image Masking API provides intelligent detection and mask generation for specific elements in images, particularly optimized for faces, hands, and people. Built on advanced detection models, this feature enhances the inpainting workflow by automatically creating precise masks around detected elements, enabling targeted enhancement and detailing.

How it works

The masks generated by this API can be utilized in two powerful inpainting workflows, offering both convenience and advanced detail enhancement capabilities.

In the standard inpainting workflow, these automatically generated masks eliminate the need for manual mask creation. Once the mask is generated by selecting the appropriate detection model (face, hand, or person), it can be directly used in an inpainting request. This automation is particularly valuable for batch processing or when consistent mask creation is needed across multiple images.

However, the most powerful application comes when using these masks for detail enhancement through the inpainting process's zooming capability. In this workflow, after the mask is automatically generated around detected elements (like faces or hands), the inpainting model will zoom into the masked area when the maskMargin parameter is present. This parameter is crucial because it adds extra context pixels around the masked region. For example, if you're enhancing a face, maskMargin ensures the model can see enough of the surrounding area to create coherent and well-integrated details.

The full process typically follows these steps:

  1. Submit the original image to the image masking API, specifying the desired detection model.
  2. Receive a precise mask identifying the target area (face, hand, or person).
  3. Use both the original image and the generated mask in an inpainting request, including the maskMargin parameter to enable zooming.
  4. The inpainting model will zoom into every masked region, considering the extra context area specified by maskMargin. This allows the model to add finer details to the masked areas and blend them smoothly into your original image for a natural, refined look.

This combination of automatic masking and zoom-enabled inpainting is particularly effective for enhancing specific features while maintaining natural integration with the surrounding image context.
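The four steps above can be sketched in Python. This is a minimal sketch: the inpainting field names (seedImage, maskImage, positivePrompt) and the example model choice are illustrative assumptions, not confirmed by this page; consult the inpainting documentation for the actual schema.

```python
import uuid

def build_masking_task(input_image, model="runware:35@1", confidence=0.25):
    """Step 1: ask the API to generate a mask for the input image."""
    return {
        "taskType": "imageMasking",
        "taskUUID": str(uuid.uuid4()),  # random UUID v4, unique per task
        "inputImage": input_image,      # image UUID, URL, data URI, or base64
        "model": model,                 # e.g. a face detection model
        "confidence": confidence,
        "outputType": "URL",
    }

def build_inpainting_task(original_image, mask_image, prompt, mask_margin=32):
    """Step 3: reuse the generated mask in an inpainting request.

    The field names seedImage/maskImage/positivePrompt are illustrative
    assumptions; check the inpainting docs for the real parameter names.
    """
    return {
        "taskType": "imageInference",
        "taskUUID": str(uuid.uuid4()),
        "seedImage": original_image,
        "maskImage": mask_image,
        "positivePrompt": prompt,
        "maskMargin": mask_margin,  # extra context pixels enable zooming
    }

# The API always takes an array of task objects:
request_body = [build_masking_task("https://example.com/portrait.jpg")]
```

The mask returned in step 2 (for example its maskImageURL) would then be passed as the mask input of the second request.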

Request

Our API always accepts an array of objects as input, where each object represents a specific task to be performed. The structure of the object varies depending on the type of the task. For this section, we will focus on the parameters related to the image masking task.

The following JSON snippet shows the basic structure of a request object. All properties are explained in detail in the next section.

[
  {
    "taskType": "imageMasking",
    "taskUUID": "string",
    "inputImage": "string",
    "model": "string",
    "confidence": float,
    "maxDetections": int,
    "maskPadding": int,
    "maskBlur": int,
    "outputFormat": "string",
    "outputType": "string"
  }
]

taskType

string required

The type of task to be performed. For this task, the value should be imageMasking.

taskUUID

string required UUID v4

When a task is sent to the API you must include a random UUID v4 string using the taskUUID parameter. This string is used to match the async responses to their corresponding tasks.

If you send multiple tasks at the same time, the taskUUID will help you match the responses to the correct tasks.

The taskUUID must be unique for each task you send to the API.
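In Python, the standard library's uuid module produces suitable identifiers; a sketch of generating one per task and matching responses back by UUID:

```python
import uuid

# One random UUID v4 per task, so async responses can be matched back.
tasks = [
    {"taskType": "imageMasking", "taskUUID": str(uuid.uuid4()), "inputImage": img}
    for img in ("image-a.jpg", "image-b.jpg")
]

# Index pending tasks by UUID; look up each response's taskUUID here.
pending = {t["taskUUID"]: t for t in tasks}
```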

outputType

"base64Data" | "dataURI" | "URL" Default: URL

Specifies the output type in which the image is returned. Supported values are: dataURI, URL, and base64Data.

  • base64Data: The image is returned as a base64-encoded string using the maskImageBase64Data parameter in the response object.
  • dataURI: The image is returned as a data URI string using the maskImageDataURI parameter in the response object.
  • URL: The image is returned as a URL string using the maskImageURL parameter in the response object.

outputFormat

"JPG" | "PNG" | "WEBP" Default: JPG

Specifies the format of the output image. Supported formats are: PNG, JPG and WEBP.

outputQuality

integer Min: 20 Max: 99 Default: 95

Sets the compression quality of the output image. Higher values preserve more quality but increase file size, lower values reduce file size but decrease quality.

webhookURL

string

Specifies a webhook URL where JSON responses will be sent via HTTP POST when generation tasks complete. For batch requests with multiple results, each completed item triggers a separate webhook call as it becomes available.

Webhooks can be secured using standard authentication methods supported by your endpoint, such as tokens in query parameters or API keys.

// Basic webhook endpoint
https://api.example.com/webhooks/runware

// With authentication token in query
https://api.example.com/webhooks/runware?token=your_auth_token

// With API key parameter
https://api.example.com/webhooks/runware?apiKey=sk_live_abc123

// With custom tracking parameters
https://api.example.com/webhooks/runware?projectId=proj_789&userId=12345

The webhook POST body contains the JSON response for the completed task according to your request configuration.
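URLs like those above are safer to build with standard URL encoding than with string concatenation; a small sketch (the helper name is ours, not part of the API):

```python
from urllib.parse import urlencode

def webhook_url(base, **params):
    """Append authentication or tracking query parameters to a webhook URL."""
    return f"{base}?{urlencode(params)}" if params else base

url = webhook_url(
    "https://api.example.com/webhooks/runware",
    token="your_auth_token",
    projectId="proj_789",
)
```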

deliveryMethod

"sync" | "async" required Default: sync

Determines how the API delivers task results. Choose between immediate synchronous delivery or polling-based asynchronous delivery depending on your task requirements.

Sync mode ("sync"):

Returns complete results directly in the API response when processing completes within the timeout window. For long-running tasks like video generation or model uploads, the request will timeout before completion, though the task continues processing in the background and results remain accessible through the dashboard.

Async mode ("async"):

Returns an immediate acknowledgment with the task UUID, requiring you to poll for results using getResponse once processing completes. This approach prevents timeout issues and allows your application to handle other operations while waiting.

Polling workflow (async):

  1. Submit request with deliveryMethod: "async".
  2. Receive immediate response with the task UUID.
  3. Poll for completion using getResponse task.
  4. Retrieve final results when status shows "success".
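The polling workflow above can be sketched as a small loop. The fetch callable stands in for your real getResponse HTTP call; this sketch assumes only that it returns a dict with a "status" key:

```python
import time

def poll_for_result(fetch, task_uuid, interval=2.0, timeout=120.0):
    """Poll until a task reports success.

    `fetch` is any callable taking a task UUID and returning a dict with a
    "status" key -- inject your real getResponse HTTP call here.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch(task_uuid)
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(f"task {task_uuid} failed: {result}")
        time.sleep(interval)  # back off between polls
    raise TimeoutError(f"task {task_uuid} did not complete in {timeout}s")
```

Injecting the HTTP call keeps the loop testable and lets you swap in retries or logging without touching the polling logic.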

When to use each mode:

  • Sync: Fast image generation, simple processing tasks.
  • Async: Video generation, model uploads, or any task that usually takes more than 60 seconds.

Async mode is required for computationally intensive operations to avoid timeout errors.

uploadEndpoint

string

Specifies a URL where the generated content will be automatically uploaded using the HTTP PUT method. The raw binary data of the media file is sent directly as the request body. For secure uploads to cloud storage, use presigned URLs that include temporary authentication credentials.

Common use cases:

  • Cloud storage: Upload directly to S3 buckets, Google Cloud Storage, or Azure Blob Storage using presigned URLs.
  • CDN integration: Upload to content delivery networks for immediate distribution.

// S3 presigned URL for secure upload
https://your-bucket.s3.amazonaws.com/generated/content.mp4?X-Amz-Signature=abc123&X-Amz-Expires=3600

// Google Cloud Storage presigned URL
https://storage.googleapis.com/your-bucket/content.jpg?X-Goog-Signature=xyz789

// Custom storage endpoint
https://storage.example.com/uploads/generated-image.jpg

The content data will be sent as the request body to the specified URL when generation is complete.

includeCost

boolean Default: false

If set to true, the cost to perform the task will be included in the response object.

inputImage

string required

Specifies the input image to be processed for mask generation. The generated mask will identify specific elements in the image (faces, hands, or people) based on the selected detection model. The input image can be specified in one of the following formats:

  • A UUID v4 string of a previously uploaded image or a generated image.
  • A data URI string representing the image. The data URI must be in the format data:<mediaType>;base64, followed by the base64-encoded image. For example: data:image/png;base64,iVBORw0KGgo....
  • A base64 encoded image without the data URI prefix. For example: iVBORw0KGgo....
  • A URL pointing to the image. The image must be accessible publicly.

Supported formats are: PNG, JPG and WEBP.
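For local files, the data URI form is easy to build with the standard library; a sketch (the placeholder bytes below stand in for a real image file read from disk):

```python
import base64

def image_to_data_uri(data: bytes, media_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URI accepted by inputImage."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

# In practice: data = open("photo.png", "rb").read()
uri = image_to_data_uri(b"\x89PNG\r\n\x1a\n...", "image/png")
```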

model

string required

Specifies the specialized detection model to use for mask generation.

Each model is optimized for specific use cases and offers different trade-offs between speed and accuracy.

Full face detection models

  AIR ID           Model Name                      Description
  runware:35@1     face_yolov8n                    Lightweight model for 2D/realistic face detection.
  runware:35@2     face_yolov8s                    Enhanced face detection with improved accuracy.
  runware:35@6     mediapipe_face_full             Specialized for realistic face detection.
  runware:35@7     mediapipe_face_short            Optimized face detection with reduced complexity.
  runware:35@8     mediapipe_face_mesh             Advanced face detection with mesh mapping.

Facial features models

  AIR ID           Model Name                      Description
  runware:35@9     mediapipe_face_mesh_eyes_only   Focused detection of eye regions.
  runware:35@15    eyes_mesh_mediapipe             Specialized eyes detection.
  runware:35@13    nose_mesh_mediapipe             Specialized nose detection.
  runware:35@14    lips_mesh_mediapipe             Specialized lips detection.
  runware:35@10    eyes_lips_mesh                  Detection of eyes and lips areas.
  runware:35@11    nose_eyes_mesh                  Detection of nose and eyes areas.
  runware:35@12    nose_lips_mesh                  Detection of nose and lips areas.

Other body parts models

  AIR ID           Model Name                      Description
  runware:35@3     hand_yolov8n                    Specialized for 2D/realistic hand detection.
  runware:35@4     person_yolov8n-seg              Person detection and segmentation.
  runware:35@5     person_yolov8s-seg              Advanced person detection with higher precision.

confidence

float Min: 0 Max: 1 Default: 0.25

Confidence threshold for detections. Only detections with confidence scores above this threshold will be included in the mask.

Lower confidence values will detect more objects but may introduce false positives.

maxDetections

integer Min: 1 Max: 20 Default: 6

Limits the maximum number of elements (faces, hands, or people) that will be detected and masked in the image. If more elements are detected than this value, only the ones with the highest confidence scores will be included.
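A client-side sketch of how the two thresholds interact (the score field here is hypothetical; the API applies this filtering server-side and does not return confidence scores in the response):

```python
def select_detections(detections, confidence=0.25, max_detections=6):
    """Drop low-confidence detections, then keep the highest-scoring ones."""
    kept = [d for d in detections if d["score"] > confidence]
    kept.sort(key=lambda d: d["score"], reverse=True)
    return kept[:max_detections]

faces = [{"score": 0.9}, {"score": 0.4}, {"score": 0.1}]
top = select_detections(faces, confidence=0.25, max_detections=2)
# keeps the 0.9 and 0.4 detections
```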

maskPadding

integer Default: 4

Extends or reduces the detected mask area by the specified number of pixels. Positive values create a larger masked region (useful when you want to ensure complete coverage of the element), while negative values shrink the mask (useful for tighter, more focused areas).

maskBlur

integer Min: 0 Default: 4

Extends the mask by the specified number of pixels with a gradual fade-out effect, creating smooth transitions between masked and unmasked regions in the final result.

Note: The blur is always applied to the outer edge of the mask, regardless of whether maskPadding is used.

Response

Results will be delivered in the format below. A single message may contain one or multiple results, since tasks are processed in parallel and generation time varies across nodes and the network.

{
  "data": [
    {
      "taskType": "imageMasking",
      "taskUUID": "d06e972d-dbfe-47d5-955f-c26e00ce4960",
      "imageUUID": "90422a52-f186-4bf4-a73b-0a46016a8330",
      "detections": [
        {
          "x_min": 505,
          "y_min": 237,
          "x_max": 588,
          "y_max": 337
        }
      ],
      "maskImageURL": "https://im.runware.ai/image/ws/0.5/ii/a770f077-f413-47de-9dac-be0b26a35da6.jpg",
      "cost": 0.0013
    }
  ]
}

taskType

string

The API will return the taskType you sent in the request. In this case, it will be imageMasking. This helps match the responses to the correct task type.

taskUUID

string UUID v4

The API will return the taskUUID you sent in the request. This way you can match the responses to the correct request tasks.

inputImageUUID

string UUID v4

The unique identifier of the original image used as input for the masking task.

detections

array

An array of objects containing the coordinates of each detected element in the image. Each object provides the bounding box coordinates of a detected face, hand, or person (depending on the model used).

Each detection object includes:

  • x_min: Leftmost coordinate of the detected area
  • y_min: Topmost coordinate of the detected area
  • x_max: Rightmost coordinate of the detected area
  • y_max: Bottommost coordinate of the detected area

These coordinates can be useful for further processing or for understanding the exact location of detected elements in the image.
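For example, the width, height, and center of a detection follow directly from its bounding box; a small helper (the function name is ours, for illustration):

```python
def box_size(det):
    """Width, height, and center point of a detection bounding box."""
    w = det["x_max"] - det["x_min"]
    h = det["y_max"] - det["y_min"]
    center = (det["x_min"] + w / 2, det["y_min"] + h / 2)
    return w, h, center

# Using the sample detection from the response above:
det = {"x_min": 505, "y_min": 237, "x_max": 588, "y_max": 337}
print(box_size(det))  # (83, 100, (546.5, 287.0))
```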

maskImageUUID

string UUID v4

The unique identifier of the mask image.

maskImageURL / maskImageBase64Data / maskImageDataURI

string

If outputType is set to URL, this parameter contains the URL of the mask image to be downloaded.

If outputType is set to base64Data, this parameter contains the base64-encoded data of the mask image.

If outputType is set to dataURI, this parameter contains the data URI of the mask image.

cost

float

If includeCost is set to true, the response will include a cost field for each task object. This field indicates the cost of the request in USD.