Edit images with text prompts. Supports precise text editing and semantic/appearance modifications.
Model Overview
A powerful image editing model that leverages Qwen-Image's text rendering capabilities and integrates visual semantic and appearance control for precise image modifications.
Best At
- Precise text editing within images (bilingual support).
- Semantic editing: Modifying image content while maintaining visual semantics (e.g., object rotation, style transfer, IP creation).
- Appearance editing: Adding, removing, or modifying specific elements while keeping other regions unchanged (e.g., adding objects, reflections, changing backgrounds or clothing).
- Transforming images into different artistic styles.
Limitations / Not Good At
- Complex or highly abstract editing requests may need several rounds of iterative refinement, even though the model excels at precise edits.
- Results may be less reliable on extremely low-resolution or heavily artifacted input images.
Ideal Use Cases
- Adding or modifying text on product mockups, posters, or merchandise.
- Creating custom illustrations for blog posts or social media by altering existing images.
- Generating variations of existing artwork or photos with different styles or elements.
- Virtual avatar creation and character design.
- Rapid prototyping of visual content requiring specific text overlays.
Input & Output Format
- Input: An image file (JPEG, PNG, GIF, or WEBP) and a text prompt describing the desired edits. Optional parameters cover aspect ratio, speed optimization (go_fast), seed, output format, output quality (output_quality), and disabling the safety checker.
- Output: An array of URIs pointing to the edited image files.
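As a rough illustration of the input and output shapes described above, the following Python sketch assembles a request payload and reads file names out of the returned URI array. It does not call any real client library. The helper names (build_edit_input, output_filenames) and the payload keys "image", "prompt", and "seed" are hypothetical and chosen only for illustration; check the actual schema before relying on them.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Input formats listed in this section (JPEG, PNG, GIF, WEBP).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def build_edit_input(image_path, prompt, **options):
    """Assemble a request payload. Key names here are assumptions."""
    ext = PurePosixPath(image_path).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported image format: {ext or '(none)'}")
    if not prompt.strip():
        raise ValueError("prompt must describe the desired edit")
    payload = {"image": image_path, "prompt": prompt}
    # Leave out unset options so the service's own defaults apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


def output_filenames(uris):
    """Extract the file name from each URI in the output array."""
    return [PurePosixPath(urlparse(uri).path).name for uri in uris]
```

For example, build_edit_input("poster.png", "Change the headline to 'SALE'", seed=42) yields a dictionary with the image, the prompt, and the seed, and output_filenames can be applied to whatever URI array comes back.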
Performance Notes
- The go_fast parameter enables additional optimizations for faster predictions.
- The output_quality parameter (0-100) controls the quality of non-PNG outputs.
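The two performance parameters can be combined as in the minimal sketch below. The parameter names go_fast and output_quality come from the notes above; the helper name performance_options and the output_format argument are assumptions made for illustration. Omitting output_quality for PNG is a design choice of this sketch, based on the note that it only influences non-PNG outputs.

```python
def performance_options(output_format, output_quality=None, go_fast=False):
    """Return the performance-related options for one request (a sketch)."""
    options = {"go_fast": bool(go_fast)}
    if output_quality is not None:
        if not 0 <= output_quality <= 100:
            raise ValueError("output_quality must be between 0 and 100")
        # output_quality only influences non-PNG outputs, so skip it for PNG.
        if output_format.lower() != "png":
            options["output_quality"] = output_quality
    return options
```

Calling performance_options("webp", output_quality=80, go_fast=True) keeps both settings, while the same call with "png" returns only the go_fast flag.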