Flux has rapidly emerged as the leading choice for photorealistic AI image generation. Developed by Black Forest Labs, a team that includes former Stability AI researchers, Flux combines open-source accessibility with cutting-edge quality. This guide covers everything from understanding the model variants to setting up local generation and achieving stunning results.
Flux is a family of AI image generation models developed by Black Forest Labs, founded in 2024 by former key members of Stability AI's research team. The models are built on a flow-matching architecture that differs from the diffusion approach used by Stable Diffusion and DALL-E, resulting in faster generation times and exceptional quality, particularly for photorealistic human subjects.
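To make the flow-matching idea concrete: instead of iteratively denoising, a flow-matching model learns a velocity field v(x, t) and generates an image by integrating the ODE dx/dt = v(x, t) from noise (t = 0) to data (t = 1). The toy sketch below is not Flux's actual network; it uses the closed-form velocity for straight-line ("rectified") paths toward a fixed target, purely to show the sampling loop:

```python
import numpy as np

# Toy flow-matching sampler. A trained model would *predict* the velocity
# field v(x, t); here we use the exact velocity for straight-line paths
# toward a fixed target, so the integration can be verified.
TARGET = np.array([1.0, -2.0, 0.5])

def velocity(x, t):
    # Straight-line path x_t = (1 - t) * x0 + t * x1
    # implies v(x_t, t) = (x1 - x_t) / (1 - t).
    return (TARGET - x) / (1.0 - t)

def sample(x0, steps=50):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data) with Euler steps."""
    x, dt = x0.astype(float), 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + velocity(x, t) * dt
    return x

rng = np.random.default_rng(0)
result = sample(rng.standard_normal(3))
print(np.allclose(result, TARGET, atol=1e-6))  # prints True
```

Fewer integration steps mean faster generation, which is exactly the trade-off the speed-oriented Flux variants exploit.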
What sets Flux apart is its remarkable ability to generate human images that look genuinely real. Skin textures, natural lighting, authentic poses, and subtle facial expressions are rendered with a quality that often surpasses Midjourney. For creators focused on realistic portraits and human subjects, Flux has become the standard.
Flux is available both through API providers for cloud-based generation and as downloadable models for local use. This flexibility means you can choose between convenience and privacy depending on your needs.
Flux.2 comes in several variants, each optimized for different use cases:
| Model | Parameters | VRAM Required | Speed | Best For |
|---|---|---|---|---|
| Flux.2 [pro] | 12B | API only | Medium | Highest quality outputs |
| Flux.2 [dev] | 12B | 24GB+ | Medium | Local high-quality generation |
| Flux.2 [schnell] | 12B | 16GB+ | Fast | Rapid iteration, previews |
| Flux.2 lite | 6B | 12GB | Very Fast | Consumer GPUs |
**Flux.2 [pro]** - The flagship model, available only through APIs. It represents the best quality Flux can produce, with exceptional prompt adherence and image quality. Use it when quality matters more than cost or speed.

**Flux.2 [dev]** - The development version, released under an open license for non-commercial use. It is nearly as capable as the pro model and can run locally if you have sufficient hardware, making it the go-to choice for enthusiasts and researchers.

**Flux.2 [schnell]** - "Schnell" means "fast" in German, and this model lives up to its name. It is optimized for speed, generating images in fewer steps while maintaining strong quality. Ideal for rapid iteration during the creative process.

**Flux.2 lite** - A smaller model designed for consumer hardware. While it sacrifices some quality compared to the full models, it runs comfortably on GPUs with 12GB of VRAM and still produces results that surpass many competitors.
The easiest way to use Flux is through API providers. Several platforms offer Flux access, typically billed per image at standard resolution.
Running Flux locally provides complete privacy and eliminates per-image costs. Here's how to set it up:
ComfyUI provides a node-based workflow that's perfect for Flux:
```bash
# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

# Create and activate a virtual environment
python -m venv venv
venv\Scripts\activate      # Windows
source venv/bin/activate   # Linux/macOS

# Install dependencies
pip install -r requirements.txt

# Download the Flux model (requires a Hugging Face account)
# and place it in ComfyUI/models/unet/

# Run ComfyUI
python main.py
```
Download the Flux model files from Hugging Face (requires accepting the license):
- `flux1-dev.safetensors` - Main model file (~12GB)
- `ae.safetensors` - VAE encoder/decoder (~335MB)
- `clip_l.safetensors` - CLIP text encoder (~246MB)
- `t5xxl_fp16.safetensors` - T5 text encoder (~9GB)

If you have limited VRAM, several optimizations are available.
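One common approach is ComfyUI's own memory-management launch flags. Flag names can change between versions, so confirm them against `python main.py --help` for your install:

```shell
# Reduce VRAM pressure when launching ComfyUI
python main.py --lowvram   # aggressively offload model weights to system RAM
python main.py --novram    # keep weights in system RAM (slow, last resort)
python main.py --cpu       # run entirely on CPU (very slow)
```

Lower-precision (fp8) model files, where available, also cut memory use roughly in half at a small quality cost.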
Flux responds well to natural language prompts. Unlike some models that require specific syntax, you can describe what you want conversationally:
```
A young woman with auburn hair and green eyes, sitting in a sunlit cafe,
morning light streaming through the window, wearing a cream colored sweater,
candid portrait photography, natural lighting, shallow depth of field
```
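Prompts like the one above tend to follow a loose subject / setting / lighting / style order. A small hypothetical helper (not part of any Flux tooling, just an illustration of the structure) can make that explicit:

```python
# Hypothetical helper for composing Flux-style natural-language prompts.
# Flux accepts plain prose, so this simply joins descriptive fragments
# in a consistent subject -> setting -> lighting -> style order.
def build_prompt(subject, setting="", lighting="", style=""):
    parts = [subject, setting, lighting, style]
    return ", ".join(p.strip() for p in parts if p.strip())

prompt = build_prompt(
    subject="a young woman with auburn hair and green eyes",
    setting="sitting in a sunlit cafe, morning light through the window",
    lighting="natural lighting, shallow depth of field",
    style="candid portrait photography",
)
print(prompt)
```

Keeping the fragments separate makes it easy to iterate on one aspect (say, lighting) while holding the rest of the prompt constant.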
ControlNet adapters for Flux allow precise control over composition:
Community LoRAs extend Flux's capabilities:
Finding LoRAs: Browse CivitAI and Hugging Face for Flux-compatible LoRAs. Look for models specifically trained on Flux, as Stable Diffusion LoRAs are not compatible.
Flux supports inpainting for selective regeneration. In ComfyUI, use the LoadImageMask node to define areas for regeneration while preserving the rest of the image.
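The principle behind masked regeneration can be sketched independently of ComfyUI's node graph: only pixels where the mask is 1 take newly generated content, and everything else is preserved from the original. A minimal NumPy illustration (not Flux's internal sampler, which applies the mask during denoising):

```python
import numpy as np

# Inpainting compositing: mask == 1 means "regenerate", mask == 0 means "keep".
def composite(original, generated, mask):
    """Blend two (H, W, C) images in [0, 1]; mask is (H, W)."""
    m = mask[..., None]  # broadcast the mask across color channels
    return m * generated + (1.0 - m) * original

original = np.zeros((4, 4, 3))   # stand-in for the source image
generated = np.ones((4, 4, 3))   # stand-in for freshly generated content
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0             # regenerate only the central 2x2 region

out = composite(original, generated, mask)
print(out[0, 0].sum(), out[1, 1].sum())  # prints 0.0 3.0
```

The corner pixel keeps its original values while the masked center takes the generated ones, which is exactly the behavior the `LoadImageMask` workflow gives you at the image level.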
**vs. Midjourney** - Flux produces more photorealistic results, especially for human subjects, while Midjourney has a more artistic, stylized aesthetic. Flux can run locally; Midjourney cannot. Midjourney has the gentler learning curve.

**vs. Stable Diffusion** - Flux generally produces better humans out of the box, while Stable Diffusion has the larger ecosystem of models and tools. Both can run locally. SD currently offers more fine-tuning options.

**vs. DALL-E** - Flux produces more realistic humans; DALL-E handles text in images better. Flux can run locally; DALL-E cannot. DALL-E has stricter content policies.
One of Flux's major advantages is the option for complete local generation: your prompts and generated images never leave your machine, and there are no per-image costs.
For creators working with sensitive subjects or requiring absolute privacy, local Flux is the clear choice. See our local vs. cloud guide for more details.
Flux has earned its reputation as the photorealism leader in AI image generation. Whether you access it through APIs for convenience or run it locally for privacy, the quality of output, particularly for human subjects, is remarkable. The combination of open availability, strong community support, and continuous improvement makes Flux an essential tool for anyone serious about AI art.
As you experiment with Flux, remember that the best results come from understanding what the model does well (realistic humans, natural lighting) and playing to those strengths. Start with the techniques in this guide, explore the community resources, and you'll be creating photorealistic images in no time.