CFG Scale: Balancing creativity and prompt adherence

Controls how strictly the model follows your prompt, balancing creativity with adherence.

Introduction

The CFGScale (Classifier-Free Guidance Scale) parameter controls how strictly the model follows your prompt during generation. It acts as a weighting factor that balances creative freedom against prompt adherence.

At each step of the denoising process, the model computes two predictions:

  1. Unconditioned prediction: What the model would generate with an empty prompt.
  2. Conditioned prediction: What the model would generate following your specific prompt.

CFG Scale amplifies the difference between these two predictions, pushing the generation toward what your prompt describes. Higher values give more weight to the prompt at the expense of natural-looking output. Setting CFG Scale to 0 or 1 effectively disables guidance and lets the model generate freely.
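The amplification described above is commonly written as a linear extrapolation from the unconditioned prediction toward the conditioned one. The following is a minimal sketch in plain Python, assuming the two predictions are flat lists of floats. `apply_cfg` is an illustrative name, not part of any API, and real pipelines apply the same arithmetic to tensors.

```python
def apply_cfg(uncond, cond, scale):
    """Combine two predictions with classifier-free guidance.

    Textbook form: guided = uncond + scale * (cond - uncond)
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction for an empty prompt
cond = [1.0, 3.0]    # toy prediction for the actual prompt

print(apply_cfg(uncond, cond, 1))  # [1.0, 3.0]
print(apply_cfg(uncond, cond, 7))  # [7.0, 15.0]
```

In this textbook form, a scale of 1 returns the conditioned prediction unchanged and a scale of 0 returns the unconditioned one. How a particular API treats those two values depends on its implementation.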

While CFG originated in image diffusion models, the concept of balancing prompt adherence with creative latitude appears across modalities. Some video models expose a cfgScale or guidanceScale parameter that works on the same principle: higher values produce output that follows the prompt more literally, lower values let the model improvise.

[Interactive comparison slider: output at different CFG Scale values]

Request structure

The CFGScale parameter is a number passed at the top level of your generation request.

[
  {
    "taskType": "imageInference",
    "model": "civitai:101055@128078",
    "positivePrompt": "A crystal-clear lake surrounded by snow-capped mountains",
    "CFGScale": 7,
    "steps": 30,
    "width": 1024,
    "height": 1024
  }
]
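If you assemble requests in code, a small helper can keep CFGScale at the top level of the task object. This is a sketch only: `build_image_task` is a hypothetical helper, it uses just the fields shown in the example above, and any other fields a real request needs (such as authentication) are outside what this section covers.

```python
import json


def build_image_task(prompt, cfg_scale, model="civitai:101055@128078",
                     steps=30, width=1024, height=1024):
    """Build one image task with CFGScale at the top level (hypothetical helper)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(cfg_scale, bool) or not isinstance(cfg_scale, (int, float)):
        raise TypeError("CFGScale must be a number")
    return {
        "taskType": "imageInference",
        "model": model,
        "positivePrompt": prompt,
        "CFGScale": cfg_scale,
        "steps": steps,
        "width": width,
        "height": height,
    }


# The request body is a JSON array of task objects.
payload = [build_image_task(
    "A crystal-clear lake surrounded by snow-capped mountains", cfg_scale=7)]
print(json.dumps(payload, indent=2))
```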

What happens at extremes

Too low (1-2 on SD 1.5): The model mostly ignores your prompt. Output looks dreamy, abstract, and loosely connected to what you asked for. Colors are muted and composition is unpredictable.

Sweet spot (5-10 on SD 1.5): Good balance between prompt accuracy and natural-looking output. Details are sharp, colors are vibrant, and the composition aligns with your prompt without looking forced.

Too high (20+ on SD 1.5): The model over-commits to the prompt. Output develops harsh saturation, visible artifacts, and unnatural contrast. Fine details get burned out, edges become overly sharp, and the image loses cohesion. This effect is clearly visible in the slider above at CFG 16 and 32.

Guidance distillation and FLUX

FLUX models don't use traditional CFG. Instead, they are guidance-distilled: during training, the model learned to replicate the effect of CFG in a single forward pass, rather than computing separate conditioned and unconditioned predictions at runtime.

In traditional CFG, the model runs twice per step (one pass with your prompt, one without), and the difference between the two results is then amplified. This is computationally expensive. FLUX bakes that guidance behavior into the model weights during training, so it only needs one pass per step. On FLUX, the CFGScale parameter functions as a guidance embedding: a numeric hint that tells the model how much of its pre-learned guidance to apply. It does not trigger the two-prediction calculation.
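The difference in cost can be illustrated with a toy loop that only counts forward passes. Everything here is a stand-in: `ToyModel` does placeholder arithmetic rather than predicting noise, the loop omits the scheduler, and none of the names correspond to a real library or to FLUX internals.

```python
class ToyModel:
    """Stand-in for a denoiser: counts forward passes instead of predicting noise."""

    def __init__(self):
        self.calls = 0

    def __call__(self, latent, prompt, guidance=None):
        self.calls += 1
        return latent * 0.5  # placeholder arithmetic only


def traditional_cfg_step(model, latent, prompt, scale):
    # Two forward passes per step, then amplify the difference.
    uncond = model(latent, "")
    cond = model(latent, prompt)
    return uncond + scale * (cond - uncond)


def distilled_step(model, latent, prompt, guidance):
    # One forward pass per step; the guidance value is just an extra input.
    return model(latent, prompt, guidance=guidance)


def count_passes(step_fn, steps, value):
    model = ToyModel()
    latent = 1.0
    for _ in range(steps):
        latent = step_fn(model, latent, "a lake", value)
    return model.calls


print(count_passes(traditional_cfg_step, 30, 7))  # 60
print(count_passes(distilled_step, 30, 3.5))      # 30
```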

In practice, this means FLUX's CFGScale has a much narrower effective range than traditional CFG. Where SD 1.5 might show dramatic changes between CFG 3 and CFG 15, FLUX produces relatively similar output across its entire range. Pushing it too high doesn't cause the same saturation artifacts, but it also doesn't give you the same level of prompt control.

True CFG Scale

For cases where you need stronger prompt adherence than distilled guidance provides, some models support trueCFGScale. This parameter forces the model to perform actual classifier-free guidance using the traditional two-pass method, bypassing the distilled shortcut.

[
  {
    "taskType": "imageInference",
    "model": "runware:101@1",
    "positivePrompt": "A crystal-clear lake surrounded by snow-capped mountains",
    "trueCFGScale": 3.5,
    "steps": 30,
    "width": 1024,
    "height": 1024
  }
]

trueCFGScale gives you real prompt control on guidance-distilled models, but it comes with tradeoffs:

  • Slower generation: The model runs two passes per step instead of one, roughly doubling inference time.
  • Lower recommended values: Models trained with distilled guidance are not optimized for high true CFG values. Start around 1.5-3.5 and increase cautiously. Values above 5-6 often degrade quality.
  • Best for prompt-heavy tasks: If you need specific text, exact compositions, or strict subject placement, trueCFGScale can enforce that where distilled guidance falls short.
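The tradeoffs above can be turned into a simple pre-flight check. This sketch is illustrative only: `review_true_cfg` is a hypothetical helper, and its thresholds are taken from the guidance in this section (a 1.5-3.5 starting range, with 5 used as a conservative cutoff for the "above 5-6" warning), not from any model schema.

```python
def review_true_cfg(value):
    """Return advisory notes for a trueCFGScale value (hypothetical helper)."""
    notes = ["two passes per step: expect roughly double the inference time"]
    if value > 5:
        notes.append("above 5: quality often degrades on guidance-distilled models")
    elif value > 3.5:
        notes.append("above the suggested 1.5-3.5 starting range: increase cautiously")
    elif value < 1.5:
        notes.append("below the suggested 1.5-3.5 starting range")
    return notes


for value in (2.5, 4.5, 7):
    print(value, review_true_cfg(value))
```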

Check the model's documentation or schema to confirm whether it supports trueCFGScale. Not all models expose this parameter.

Tips

  1. Start with the architecture's midpoint. For SD 1.5, try 7. For SDXL, try 6. Adjust from there based on whether the output follows your prompt closely enough.
  2. Lower CFG for abstract or artistic prompts. When your prompt describes a mood or atmosphere rather than specific objects, lower CFG values (4-6 on SD 1.5) give the model room to interpret creatively.
  3. Raise CFG for specific requirements. If your prompt includes concrete details that must appear (specific text, exact positioning, named objects), push CFG higher (8-12 on SD 1.5) to force adherence.
  4. Try trueCFGScale when distilled guidance isn't enough. If a model's default guidance produces output that's too loose, try trueCFGScale at 2-3 before increasing further.
  5. Pair CFG with seed for iteration. Fix a seed, then sweep CFG from low to high to find the value where the composition matches your intent without artifacts. This is faster than re-rolling seeds at a fixed CFG.
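The seed-and-sweep approach from the last tip could be scripted as below. This is a sketch under assumptions: `build_cfg_sweep` is a hypothetical helper, and the `seed` field name is assumed rather than taken from this page, so check the API reference before relying on it.

```python
import json


def build_cfg_sweep(prompt, seed, cfg_values, model="civitai:101055@128078"):
    """Build one task per CFG value, all sharing the same seed (hypothetical helper)."""
    return [
        {
            "taskType": "imageInference",
            "model": model,
            "positivePrompt": prompt,
            "seed": seed,  # assumed field name; fixed so only CFGScale varies
            "CFGScale": cfg,
            "steps": 30,
            "width": 1024,
            "height": 1024,
        }
        for cfg in cfg_values
    ]


tasks = build_cfg_sweep(
    "A crystal-clear lake surrounded by snow-capped mountains",
    seed=12345,
    cfg_values=[3, 5, 7, 9, 12],
)
print(json.dumps(tasks[0], indent=2))
```

Because every task shares the seed, differences between the resulting images can be attributed to the CFG value alone.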