Learn

Understand core generative AI concepts and explore model-specific guides with practical examples and code.

Concepts

Deep dives into every parameter and feature of the image generation pipeline. Each page covers how a feature works, its visual impact, and practical usage patterns with interactive examples.

Workflows

Text to image

The foundational pipeline: generate images from text prompts using diffusion models, with control over every step of the process.

Image to image

Transform existing images using a source image as a starting point, controlling how much of the original to preserve.

Inpainting

Selectively edit specific regions of an image using masks, replacing or regenerating content while preserving the rest.
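
As a rough illustration of what a mask does, the sketch below blends two flat lists of pixel values. It assumes a mask value of 1.0 means "regenerate" and 0.0 means "keep the original". That convention, and the `composite` helper itself, are hypothetical and chosen only for this example. Real inpainting pipelines usually apply the mask during denoising rather than as a final blend.

```python
def composite(original, generated, mask):
    """Blend per pixel: mask 1.0 takes the generated value, 0.0 keeps the original."""
    return [m * g + (1.0 - m) * o for o, g, m in zip(original, generated, mask)]


# Three pixels: keep the first, replace the second, blend the third halfway.
result = composite(
    original=[10.0, 20.0, 30.0],
    generated=[1.0, 2.0, 3.0],
    mask=[0.0, 1.0, 0.5],
)
```

Fractional mask values stand in for a feathered mask edge, which softens the seam between preserved and regenerated regions.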

Outpainting

Extend images beyond their original borders in any direction while maintaining seamless continuity.

Generation Settings

Prompts

How positive and negative prompts steer the generation process, with architecture-specific behavior and practical techniques.

Steps

How the steps parameter controls the iterative refinement process, with architecture-specific ranges and speed versus quality tradeoffs.

Dimensions

How width and height affect generation quality, with architecture-specific limits and aspect ratio guidance.

Schedulers

The denoising algorithms that guide diffusion models from noise to image, each with different speed and quality tradeoffs.

Seed

Pin the random noise to reproduce identical outputs or create controlled variations from the same starting point.
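
The idea of pinning noise can be sketched with Python's standard library alone. The `initial_noise` helper below is hypothetical and stands in for the much larger noise tensor a diffusion model starts from. It only shows that the same seed yields the same starting values.

```python
import random


def initial_noise(seed, count):
    """Return `count` Gaussian noise values from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


same_a = initial_noise(seed=42, count=4)
same_b = initial_noise(seed=42, count=4)
different = initial_noise(seed=43, count=4)
```

Here `same_a` and `same_b` are identical, while `different` starts from other noise. Reproducing a full image additionally requires keeping every other setting unchanged.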

CFG Scale

Controls how strictly the model follows your prompt, balancing creativity with adherence.
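
CFG stands for classifier-free guidance. A minimal sketch of the commonly used combination rule follows, assuming the model produces one prediction with the prompt and one without it. The `apply_cfg` name and the flat-list representation are illustrative only, and a given architecture may implement guidance differently.

```python
def apply_cfg(uncond, cond, scale):
    """Push the unconditional prediction toward the prompt-conditioned one.

    scale = 0 ignores the prompt, scale = 1 uses the conditioned prediction
    as is, and larger values extrapolate further in the prompt's direction.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt
guided = apply_cfg(uncond, cond, scale=2.0)
```

Because values above 1 extrapolate past the conditioned prediction, very high scales tend to trade variety for prompt adherence.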

CLIP Skip

Selects which layer of the CLIP text encoder supplies the interpretation of your prompt, shifting output between literal and abstract.

VAE

The variational autoencoder (VAE) decoder that converts the model's latent representation into the final image, affecting color fidelity and detail.

Enhancements

ControlNet

Provides structural control over generation using conditioning inputs like edge maps, depth maps, and detected poses.

LoRAs

Lightweight model adapters that add specific styles, subjects, or concepts without retraining the base model.
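
LoRA stands for low-rank adaptation. The sketch below shows the usual formulation, in which two small matrices are multiplied together and added to a frozen base weight. It uses plain nested lists for readability. The `apply_lora` and `matmul` helpers, and the `alpha / rank` scaling, are assumptions for illustration rather than a description of any specific implementation.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(weight, down, up, alpha):
    """Return weight + (alpha / rank) * (up @ down) without modifying weight."""
    rank = len(down)  # down is rank x in_features, up is out_features x rank
    delta = matmul(up, down)
    scale = alpha / rank
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


base = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2 x 2)
down = [[1.0, 2.0]]              # rank-1 "down" matrix (1 x 2)
up = [[1.0], [0.0]]              # rank-1 "up" matrix (2 x 1)
adapted = apply_lora(base, down, up, alpha=1.0)
```

The adapter stores only `down` and `up`, which is why it stays small relative to the base model.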

IP Adapters

Use reference images to guide generation, enabling style transfer and visual consistency across outputs.

Embeddings

Encode custom visual concepts, styles, or subjects into specialized tokens for consistent generation.

Guides

Step-by-step tutorials built around specific models. Each guide walks through a real use case with code, prompts, and results you can reproduce.

Featured

Gemini Omni Flash

Reference-driven video with Gemini Omni Flash

How to use Gemini Omni Flash's reference image workflow to lock a visual style, hold a character across scenes, or guide a video through storyboard key beats.

GPT Image 2

Prompting GPT Image 2

How to write prompts for GPT Image 2 across the cases the model handles unusually well: photorealism, accurate text, world-knowledge composition, and multi-image editing workflows.

Ideogram 4.0

Text and design output

How to use Ideogram 4.0 for typography-heavy design where the text has to be readable and exactly right, the layout has to land, and the palette has to lock to brand.

Nano Banana 2

Keeping characters and products consistent

How to use Nano Banana 2 reference images to keep the same character or product identical across new scenes and styles.

Gemini Omni Flash

Cinematic prompting for Gemini Omni Flash

How to prompt Gemini Omni Flash for cinematic video using Google's five-element structure, camera language, and the less-prescriptive sweet spot.

Kling VIDEO 3.0 Turbo

Multi-shot reels with the Kling 3.0 Turbo shot template

How to use Kling 3.0 Turbo's inline shot-list syntax to direct multi-shot video reels in a single API call, with shot-by-shot timing and prompt control.