Text to image
The foundational pipeline: generate images from text prompts using diffusion models, with control over every step of the process.
Understand core generative AI concepts and explore model-specific guides with practical examples and code.
Deep dives into every parameter and feature of the image generation pipeline. Each page covers how a feature works, its visual impact, and practical usage patterns with interactive examples.
Step-by-step tutorials built around specific models. Each guide walks through a real use case with code, prompts, and results you can reproduce.
How to write prompts for GPT Image 2 across the cases the model handles unusually well: photorealism, accurate text, world-knowledge composition, and multi-image editing workflows.
How to use Ideogram 4.0 for typography-heavy design where the text has to be readable and exactly right, the layout has to land, and the palette has to lock to brand.
How to use Pruna P-Video-Replace to swap the on-camera character in an existing video with one from a reference image while preserving the original motion, timing, camera, lighting, and audio.
How to make localised edits to existing footage with Runway Aleph 2 that change only the targeted region and leave the rest of the clip untouched.
How to remove objects from images with FLUX Erase, Black Forest Labs's prompt-less mask-driven removal model. What you paint is what disappears.
How to extend an image past its original frame with FLUX Outpainting. No prompt, no mask, just more canvas around what is already there.
How to dress a person in any garment from a reference image with FLUX VTO. The call takes one person photo, one garment photo, and a short prompt.
How to compose tracks with Eleven Music v1: short prompts for quick ideation, structured composition plans when you need explicit control over the song's sections.
How to direct vocal delivery in Eleven v3 with inline audio tags. The tags carry the emotion, sound effects, and multi-speaker turn-taking that plain text alone cannot.
How to train a custom Exactly Illustrative style model on a brand's visual identity, then use it for consistent text-to-image and image-to-image generation in that look.
How to control vocal delivery in Fish Audio S2-Pro with bracket tags. The tag system steers emotion, expression, paralanguage, and phoneme-level pronunciation in one inline syntax.
How to generate two-speaker dialogue audio in a single request to Fish Audio S2-Pro using inline speaker tags. One call, two voices, full per-speaker emotion control.
How to transform game assets while preserving their structure using Canny edge detection, ControlNet, and LoRAs for consistent style across variations.
How to compose what surrounds the avatar in Avatar V output: the background, fit mode, aspect ratio for the target platform, and burned-in captions.
How to choose between Avatar V's two input modes: generate the voice from a script, or drive the avatar with your own recorded audio.
How Ideogram 4.0's two prompting modes work: natural language with Magic Prompt expansion for quick exploration, and the full trained JSON schema for explicit per-element control.
How to write LLM system prompts that produce text TTS-2 can synthesize naturally, with normalization, filler words, and emphasis cues handled before the audio call.
How to use natural-language steering tags to control emotion, pacing, volume, and vocal style in TTS-2 speech output.
How to steer Krea 2 with the creativity parameter, weighted style reference images, and moodboards. Together these span the range from faithful renders to bold reinterpretation.
How to use Pruna P-Video-Animate to bring a still reference image to life by inheriting the motion, timing, and camera move from a source video.
How to swap a single on-camera object (a product, a garment) in a source video with Pruna P-Video-Replace, without touching the rest of the frame. The call takes a bare-object reference and a directive prompt, with no mask.
How to recreate iconic film scenes with Bytedance Seedance, then recast the on-camera character with Pruna P-Video-Replace to drop yourself or any reference into the shot.
How to use the settings.colors and settings.backgroundColor parameters to lock generated images to a specific color palette or background color.
How to write effective prompts for Recraft V4.1, from short interpretive prompts to structured multi-layer descriptions for precise creative control.
How to write Gen-4.5 image-to-video prompts that direct motion instead of redescribing the scene. Covers the camera and subject channels, the names of common camera moves, and how to layer atmospheric motion on top.
How to generate consistent and professional sticker collections using specialized diffusion models and LoRAs.
How to use SDXL's refiner pipeline to enhance fine details and textures in the final denoising pass.
How to use scoringPrompt and scoringRubric on Sourceful Riverflow 2.5 Pro to drive different production workflows from the same brand inputs.
How to generate images with accurate, readable text using xAI Grok Imagine. Prompt for the text content, the placement, and the script you want rendered.