Text to image
The foundational pipeline: generate images from text prompts using diffusion models, with control over every step of the process.
Understand core generative AI concepts and explore model-specific guides with practical examples and code.
Deep dives into every parameter and feature of the image generation pipeline. Each page covers how a feature works, its visual impact, and practical usage patterns with interactive examples.
Step-by-step tutorials built around specific models. Each guide walks through a real use case with code, prompts, and results you can reproduce.
Learn how to transform game assets while preserving their structure using Canny edge detection, ControlNet, and specialized LoRAs to maintain consistent style across multiple variations.
How to write effective prompts for Recraft V4.1, from short interpretive prompts to structured multi-layer descriptions for precise creative control.
Learn how to generate consistent and professional sticker collections using specialized diffusion models and LoRAs.
How to get the most out of GPT Image 2. Covers prompt format tricks, photorealism, text rendering, infographics, world knowledge, ad creatives, and multi-image workflows for editing, style transfer, character consistency, and compositing.
How to use FLUX Erase, a prompt-less mask-driven object removal model from Black Forest Labs. Covers the request shape, masking strategy, dilation tuning, and practical removal patterns.
How to use Eleven v3's audio tag system. Covers emotion, sound-effect, and experimental tags, voice-character constraints, punctuation and capitalization, and multi-speaker dialogue conventions.
How to use FLUX Outpainting, a prompt-less image-extension model from Black Forest Labs. Covers the request shape, when prompt-less extension is the right tool, three practical extension patterns, and sizing limits.
How to generate music with Eleven Music v1. Covers the two input modes (simple prompt vs composition plan), prompt vocabulary, section structure, lyrics, negative styles, and the instrumental flag.
How to train a custom Exactly Illustrative model on a brand's visual style. Covers dataset curation, the training API call, status polling, and using the trained model for consistent text-to-image and image-to-image generation.
How to control everything around the avatar: background removal, solid and image backgrounds, fit modes, aspect ratios for different platforms, and captions.
How to choose between the TTS path and the audio-input path when generating Avatar V videos. Covers avatar selection, voice swapping, speed tuning, and multilingual delivery from a single script.
How to write system prompts that make LLM output sound natural when synthesized by TTS-2. Covers text normalization, filler words, emphasis, and ready-to-use prompt templates.
How to use natural-language steering tags to control emotion, pacing, volume, and vocal style in TTS-2 speech output.
How to use the settings.colors and settings.backgroundColor parameters to lock generated images to a specific color palette or background color.
How to generate images with accurate, readable text. Covers prompt techniques for text placement, multilingual rendering, and practical use cases.