VAE: Visual decoder for image generation

The decoder that converts the model's latent representation into the final image, affecting color fidelity and detail.

Introduction

The vae parameter specifies which Variational Autoencoder to use for converting the model's internal representations into the final image.

A VAE consists of two parts:

  • An encoder that compresses images into a low-dimensional latent space (used during model training).
  • A decoder that reconstructs images from latent representations (used during inference).

Diffusion models don't work directly with pixels during generation. They operate in a compressed latent space, a lower-dimensional representation that's more computationally efficient. The VAE's decoder handles the final step of converting these latent representations back into visible pixels with proper colors, textures, and details.
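To give a rough sense of how much smaller the latent space is, the sketch below computes the latent tensor shape for a given image size. It assumes the 8x spatial downsampling and 4 latent channels used by the SD 1.5 and SDXL VAEs; this page does not state those figures, and other architectures may differ. The function names are illustrative only.

```python
from typing import Tuple


def latent_shape(width: int, height: int,
                 factor: int = 8, channels: int = 4) -> Tuple[int, int, int]:
    """Return (channels, latent_height, latent_width) for an image size.

    Assumption: the VAE downsamples each spatial dimension by `factor`
    and uses `channels` latent channels (8 and 4 for SD 1.5 / SDXL).
    """
    if width % factor or height % factor:
        raise ValueError("width and height must be multiples of the factor")
    return (channels, height // factor, width // factor)


def compression_ratio(width: int, height: int,
                      factor: int = 8, channels: int = 4) -> float:
    """Ratio of RGB pixel values to latent values for one image."""
    c, h, w = latent_shape(width, height, factor, channels)
    return (width * height * 3) / (c * h * w)


shape = latent_shape(1024, 1024)
ratio = compression_ratio(1024, 1024)
print(shape, ratio)
```

Under these assumptions, the diffusion process works on far fewer values than the final image contains, which is why the decoder has so much influence on the visible result.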

Most models ship with a default VAE that works well for general use. The vae parameter lets you override that default with an alternative decoder, which can improve color accuracy, reduce artifacts, or better handle specific content types.

[Image comparison: anime-style girl with pink twin tails and violet eyes, rendered with the default SDXL VAE and with the Luna XL VAE]

Request structure

The vae parameter is a string containing a model identifier. Set it as a top-level field of the task object in your generation request, alongside model.

[
  {
    "taskType": "imageInference",
    "model": "civitai:101055@128078",
    "positivePrompt": "Smiling anime girl with pink twin tails and violet eyes",
    "vae": "civitai:311162@401206",
    "steps": 30,
    "width": 1024,
    "height": 1024
  }
]
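If you assemble requests in code, a small helper can keep the vae field optional so that the model's default decoder is used whenever no override is given. The following is a minimal sketch that only builds the JSON payload shown above; the helper name build_image_task is hypothetical, and sending the payload (endpoint, authentication) is outside its scope.

```python
import json
from typing import Optional


def build_image_task(model: str, prompt: str, vae: Optional[str] = None,
                     steps: int = 30, width: int = 1024,
                     height: int = 1024) -> dict:
    """Build one imageInference task; include "vae" only when overriding."""
    task = {
        "taskType": "imageInference",
        "model": model,
        "positivePrompt": prompt,
        "steps": steps,
        "width": width,
        "height": height,
    }
    if vae is not None:
        # Omitting the key leaves the model's default VAE in place.
        task["vae"] = vae
    return task


task = build_image_task(
    model="civitai:101055@128078",
    prompt="Smiling anime girl with pink twin tails and violet eyes",
    vae="civitai:311162@401206",
)
# The request body is an array of task objects.
payload = json.dumps([task], indent=2)
print(payload)
```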

What custom VAEs affect

Custom VAEs can change several aspects of the final image:

  • Color reproduction: Different VAEs produce different color profiles. Some boost saturation, others improve color accuracy. The default SDXL VAE, for example, is known to occasionally produce washed-out or slightly desaturated colors, which is why community VAEs exist to correct this.
  • Detail preservation: Some VAEs better preserve fine details in the latent-to-pixel conversion, resulting in sharper textures and clearer edges.
  • Artifact reduction: Specialized VAEs can reduce common issues like color banding in gradients, blotchy skin tones, or NaN color artifacts (black spots) that the default VAE sometimes produces.

When to change the VAE

For most workflows, the default VAE is fine. Consider changing it when:

  • You notice washed-out colors or desaturation in your output, especially with SDXL models.
  • You see black spots or NaN artifacts, which usually point to the default VAE becoming numerically unstable when it runs at fp16 precision.
  • You're generating anime or stylized content and want bolder colors or cleaner flat fills.
  • You're getting color banding in smooth gradients (skies, skin, backgrounds).

If you're not experiencing any of these issues, there's no need to override the default.
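The black-spot case can be illustrated with a simplified model of half-precision arithmetic. The sketch below assumes that the artifacts come from intermediate values exceeding the largest finite fp16 value (65504), turning into infinities, and then producing NaN in later operations. It is a crude stand-in for real fp16 casting (it ignores rounding of in-range values) and does not reproduce any particular VAE's computation.

```python
import math

FP16_MAX = 65504.0  # largest finite value representable in half precision


def to_fp16_like(x: float) -> float:
    """Crude model of an fp16 cast: out-of-range magnitudes become +/-inf."""
    if math.isnan(x):
        return x
    if abs(x) > FP16_MAX:
        return math.copysign(math.inf, x)
    return x


# Two large intermediate values that fit in fp32 but not in fp16.
a = to_fp16_like(70000.0)
b = to_fp16_like(80000.0)

# Both overflowed to infinity, so their difference is undefined (NaN).
diff = a - b
print(a, b, diff)
```

A pixel whose value ends up as NaN has no valid color, which is consistent with it appearing as a black spot in the output.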

Custom VAEs are primarily used with SD 1.5 and SDXL architectures. Third-party models like Recraft and Ideogram use their own integrated decoding methods and don't expose VAE customization.

Tips

  1. Try a VAE before blaming the model. If your SDXL output looks washed out or has black spots, swapping the VAE is often faster and cheaper than switching models.
  2. Match the VAE to the architecture. SD 1.5 VAEs won't work with SDXL models and vice versa. Always check compatibility before applying.
  3. Don't stack VAE changes with other fixes. If you're troubleshooting color issues, change the VAE in isolation first. Adjusting the VAE, prompt, and CFG simultaneously makes it impossible to identify what fixed the problem.
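One way to follow the third tip is to generate a control request and a variant request that differ only in the vae field. The sketch below builds such a pair from an existing task dictionary and verifies that nothing else changed. The function names are hypothetical, and keeping other sources of variation fixed (for example, any randomness setting your requests use) is left to the caller.

```python
import copy
from typing import Set, Tuple


def vae_ab_pair(base_task: dict, vae_id: str) -> Tuple[dict, dict]:
    """Return (control, variant): identical tasks except for the VAE."""
    control = copy.deepcopy(base_task)
    control.pop("vae", None)       # control uses the model's default VAE
    variant = copy.deepcopy(control)
    variant["vae"] = vae_id        # variant overrides only the VAE
    return control, variant


def changed_keys(a: dict, b: dict) -> Set[str]:
    """Keys whose values differ between two tasks (including missing keys)."""
    return {k for k in a.keys() | b.keys() if a.get(k) != b.get(k)}


base = {
    "taskType": "imageInference",
    "model": "civitai:101055@128078",
    "positivePrompt": "Smiling anime girl with pink twin tails and violet eyes",
    "steps": 30,
    "width": 1024,
    "height": 1024,
}
control, variant = vae_ab_pair(base, "civitai:311162@401206")

# Guard against accidentally changing more than one thing at a time.
assert changed_keys(control, variant) == {"vae"}
```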