A 32B-parameter model for photorealistic image generation and editing with multi-reference support.
Overview
FLUX.2 Dev is a 32-billion-parameter model from Black Forest Labs that handles both image generation and image editing from text prompts in a single model. It can create new images, modify existing ones, and combine up to ten reference images to keep details such as characters, products, or styles consistent. The model uses a latent flow matching architecture that couples the Mistral-3 vision-language model with a rectified flow transformer, which gives it strong real-world context understanding, accurate spatial relationships, and realistic lighting.
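For readers unfamiliar with the terms, the general textbook form of a rectified flow objective is sketched below. This is the standard formulation, not a confirmed description of how FLUX.2 Dev was trained: the symbols (x_0 for the clean image latent, c for the conditioning features from the vision-language model, v_θ for the transformer) are illustrative, and details such as timestep sampling or loss weighting are not stated here.

```latex
% Straight-line path between a clean latent x_0 and Gaussian noise \epsilon
x_t = (1 - t)\, x_0 + t\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I), \quad t \in [0, 1]

% The network regresses the constant velocity along that path
\mathcal{L}(\theta) =
\mathbb{E}_{x_0,\, \epsilon,\, t}
\left\| v_\theta(x_t, t, c) - (\epsilon - x_0) \right\|^2
```

Under this formulation, sampling starts from noise at t = 1 and integrates dx_t/dt = v_θ(x_t, t, c) down to t = 0 to obtain an image latent.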
Strengths / What it does well
- Generates photorealistic images from complex, structured text prompts with accurate text rendering.
- Edits images at resolutions up to 4 megapixels, making localized changes while leaving the rest of the image intact.
- Maintains consistency across multiple reference images, ideal for product branding or character continuity.
- Produces sharper textures and more stable lighting than previous versions.
Improvements over previous versions
- Accepts up to 10 reference images as input for improved consistency.
- Renders more legible typography within generated images.
- Better adherence to complex prompt instructions.
- Increased photorealism and higher-resolution editing (up to 4 MP).
- Stronger grounding in real-world knowledge and spatial logic.
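The two numeric limits mentioned above (up to 10 reference images, editing up to 4 MP) are easy to check before sending anything to the model. The following is a minimal sketch of such a pre-flight check, not part of any official client. The function names are hypothetical, and the reading of "4 MP" as 4,000,000 pixels is an assumption (some tools count a megapixel as 1024 × 1024 pixels), so the limit is left as a parameter.

```python
import math

MAX_REFERENCE_IMAGES = 10    # from the text: up to 10 reference images
MAX_EDIT_PIXELS = 4_000_000  # assumption: "4 MP" read as 4,000,000 pixels


def fit_within_pixel_budget(width, height, max_pixels=MAX_EDIT_PIXELS):
    """Return (width, height), scaled down if needed to fit max_pixels.

    Hypothetical helper: preserves the aspect ratio and never upscales.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width * height <= max_pixels:
        return width, height
    # Uniform scale factor that brings the area down to the budget.
    scale = math.sqrt(max_pixels / (width * height))
    # Floor both sides so the result stays within the budget.
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height


def check_reference_count(reference_paths, limit=MAX_REFERENCE_IMAGES):
    """Return the references as a list, or raise if there are too many."""
    references = list(reference_paths)
    if len(references) > limit:
        raise ValueError(
            f"{len(references)} reference images given, limit is {limit}"
        )
    return references


if __name__ == "__main__":
    # A 4000 x 3000 photo (12 MP) is scaled to fit the assumed 4 MP budget.
    print(fit_within_pixel_budget(4000, 3000))
    print(check_reference_count(["front.png", "side.png", "logo.png"]))
```

Keeping the limits as named constants makes it straightforward to adjust them if the actual service defines its pixel budget or reference cap differently.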
Best use cases
- Product photography and branding mockups
- Character design consistency across scenes
- Infographics and user interface mockups that require complex typography
- Photorealistic concept art and visualization
- Marketing materials blending multiple references