WAN-2.7
Alibaba Wan 2.7 Image Edit is the latest Wan-series image editing model by Alibaba, offering improved instruction comprehension and edit precision for a wide range of modifications including style changes, object edits, and scene alterations. Built on the Wan 2.7 architecture, it handles complex natural language edit instructions with greater semantic accuracy than earlier versions. Best suited for product photo editing, creative retouching, and high-volume commercial image transformation pipelines.
Avg Run Time: 25.000s
Model Slug: alibaba-wan-2-7-image-edit

API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
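As a sketch, the create step can look like the following Python, using only the standard library. The endpoint URL, the `X-API-Key` header name, the `input` field names, and the `predictionID` response key are assumptions here; check them against the eachlabs API reference before use.

```python
import json
import urllib.request

# Assumed endpoint -- verify against the eachlabs API docs.
CREATE_URL = "https://api.eachlabs.ai/v1/prediction/"

def build_payload(image_url: str, prompt: str) -> dict:
    """Assemble the prediction request body (field names are assumptions)."""
    return {
        "model": "alibaba-wan-2-7-image-edit",  # model slug from this page
        "input": {"image": image_url, "prompt": prompt},
    }

def create_prediction(api_key: str, image_url: str, prompt: str) -> str:
    """POST the payload and return the prediction ID used for polling."""
    req = urllib.request.Request(
        CREATE_URL,
        data=json.dumps(build_payload(image_url, prompt)).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["predictionID"]  # assumed response key
```

The returned ID is what you pass to the result endpoint in the next step.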
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
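A minimal polling loop might look like the sketch below. The result endpoint, header name, and status strings (`success`, `failed`, `canceled`) are assumptions to verify against the eachlabs documentation.

```python
import json
import time
import urllib.request

# Assumed result endpoint -- substitute the real one from the eachlabs docs.
RESULT_URL = "https://api.eachlabs.ai/v1/prediction/{prediction_id}"

def is_terminal(status: str) -> bool:
    """True once polling can stop (status strings are assumptions)."""
    return status in ("success", "failed", "canceled")

def wait_for_result(api_key: str, prediction_id: str,
                    interval: float = 2.0, timeout: float = 120.0) -> dict:
    """Poll until the prediction reaches a terminal status or we time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            RESULT_URL.format(prediction_id=prediction_id),
            headers={"X-API-Key": api_key},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.loads(resp.read())
        if is_terminal(body.get("status", "")):
            if body["status"] != "success":
                raise RuntimeError(f"prediction ended as {body['status']}")
            return body
        time.sleep(interval)
    raise TimeoutError(f"prediction {prediction_id} not ready after {timeout}s")
```

With the model's average run time around 25 seconds, a 2-second interval and a timeout comfortably above that average are reasonable starting points.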
Readme
Overview
Alibaba | Wan 2.7 | Image Edit empowers users to transform existing images through precise text-guided instructions, solving the challenge of flexible, creative image manipulation without starting from scratch. Developed by Alibaba Tongyi Lab as part of the Wan 2.7 family, this image-to-image model uses a unified architecture for generation and editing, enabling seamless adjustments such as style transfers, element swaps, and multi-image fusion. Its primary differentiator is support for up to nine reference images in a single prompt, allowing complex edits such as blending multiple sources into cohesive outputs at resolutions up to 4K in the Pro variant. Available via the Alibaba | Wan 2.7 | Image Edit API on platforms like eachlabs, it excels at realistic avatar customization and superior text rendering, making it a strong fit for designers and creators seeking professional-grade results.
Technical Specifications
- Resolution Support: Up to 2K (2048×2048) for standard version; 4K (4096×4096) for Image Pro variant, with flexible aspect ratios and custom dimensions.
- Input Formats: Text prompts up to 5,000 characters; up to 9 reference images for editing, style transfer, or fusion; optional image set mode for up to 12 coherent outputs.
- Output Formats: High-quality PNG/JPEG images; supports num_outputs of 1-4 (or 1-12 in image set mode).
- Processing Time: Standard mode is faster; thinking mode (enabled by default for text-to-image) improves quality at the cost of longer generation time.
- Architecture: Unified Diffusion Transformer with T5 encoder and MoE routing; semantic mapping in shared latent space for precise comprehension.
- Additional Features: 12-language support, up to 3K tokens for text rendering (tables, formulas, A4-page equivalents at print quality).
Key Considerations
Before using Alibaba | Wan 2.7 | Image Edit, ensure access to the API via eachlabs; the model requires an input image and a descriptive prompt for optimal results. This Alibaba image-to-image model shines in scenarios that need multi-reference editing, outperforming basic tools on complex fusions, but it may demand more computational resources in 4K Pro mode. Consider the cost tradeoff: the standard version offers speed for quick iterations, while Pro prioritizes quality. It is best suited to professional workflows rather than simple text-to-image, and high-quality reference images are a prerequisite for avoiding artifacts.
Tips & Tricks
Optimize prompts for Alibaba | Wan 2.7 | Image Edit by being specific about changes, e.g., "Replace the background with a sunset over mountains, keeping the subject's pose and lighting intact." Enable thinking mode for intricate edits to improve reasoning and output fidelity, though it extends processing time—ideal for final renders. Use up to nine images strategically: position key references in a 3x3 grid for balanced fusions, specifying "Blend image 1's face onto image 2's body with image 3's clothing style." For avatar customization, detail bone structure adjustments like "Enlarge eyes and refine jawline for a youthful anime look." Experiment with palette extraction: "Extract colors from reference and apply to a cyberpunk cityscape." Workflow tip: Generate sets in image_set_mode for consistent series, then refine individually.
Example prompts:
- "Edit the portrait: customize bone structure for sharper cheekbones, add golden hour lighting from reference image 2."
- "Fuse nine landscapes into one panoramic view, emphasizing vibrant palettes from all inputs."
- "Restyle product photo to minimalist white background with floating elements, precise text overlay: 'Premium Tech 2026'."
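Prompts like the ones above are sent as part of the prediction input. The helper below sketches how a client might validate the documented limits (up to 9 reference images; 1-4 outputs, or up to 12 in image set mode) before building the request; the field names `images`, `image_set_mode`, and `num_outputs` are hypothetical and should be checked against the eachlabs API reference.

```python
def build_edit_input(prompt, reference_urls, image_set_mode=False, num_outputs=1):
    """Validate documented limits and assemble an input dict.

    Field names are hypothetical; the numeric caps come from the
    Technical Specifications section above.
    """
    refs = list(reference_urls)
    if not 1 <= len(refs) <= 9:
        raise ValueError("Wan 2.7 Image Edit accepts 1-9 reference images")
    max_outputs = 12 if image_set_mode else 4
    if not 1 <= num_outputs <= max_outputs:
        raise ValueError(f"num_outputs must be between 1 and {max_outputs}")
    return {
        "prompt": prompt,
        "images": refs,                  # hypothetical field name
        "image_set_mode": image_set_mode,
        "num_outputs": num_outputs,
    }
```

For a nine-landscape fusion, for example, you would pass all nine URLs with `image_set_mode=True` to request a coherent output series.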
Capabilities
- Text-guided image editing with up to 9 reference images for style transfer, element swapping, and multi-image blending.
- Realistic avatar customization at bone-level precision, avoiding generic "AI faces" for unique facial features.
- Superior text rendering: ultra-long texts, tables, formulas, and multilingual layouts (12 languages) at print quality, up to 3K tokens and A4-page outputs.
- Palette function for one-click color extraction and customization from references, enabling matched color schemes.
- High-resolution generation up to 4K (Pro), with thinking mode for enhanced prompt reasoning and quality.
- Group image generation: up to 12 coherent images from max 9 references in image set mode.
- Semantic understanding via unified architecture for precise instruction following without pixel guessing.
What Can I Use It For?
For designers: Customize client avatars by fine-tuning bone structures and eyes from reference selfies—"Adjust facial bones for a heroic profile, blend with fantasy armor from image 3"—producing unique portraits beyond standard templates.
For marketers: Restyle product shots with palette matching—"Extract brand colors from logo image, apply to tech gadget on neon background"—creating campaign visuals with precise color control and text overlays like tables of specs.
For developers: Generate UI mockup sets via image set mode—"Fuse 9 wireframe screenshots into polished app interfaces with multilingual labels"—leveraging up to 12 outputs for rapid prototyping with the Alibaba | Wan 2.7 | Image Edit API.
For creators: Edit multi-reference scenes for digital art—"Swap elements across 9 landscapes to build a surreal dreamscape"—using 4K Pro for high-fidelity prints or social media assets.
Things to Be Aware Of
Users often overlook reference image quality in Alibaba | Wan 2.7 | Image Edit, which leads to artifacts; always use high-resolution inputs for the best fusions. Thinking mode boosts quality but roughly doubles processing time, so reserve it for complex prompts. Edge cases such as heavy occlusions in multi-image blends may cause minor inconsistencies, especially at the maximum of 9 references. A common mistake is writing vague prompts without spatial instructions, which results in misaligned edits; specify positions explicitly. Resource needs scale with resolution: 4K Pro demands robust hardware or a cloud API such as eachlabs.
Limitations
Alibaba | Wan 2.7 | Image Edit caps input at 9 images and, in the image variant, offers no native video editing; it produces static outputs only. The standard version is limited to 2K; 4K output requires the Pro variant. Complex physics or motion simulation falls outside its scope without video extensions, and extreme stylization may soften details. There are no open weights yet, so the model is cloud-API dependent; multilingual text rendering excels within its 12 supported languages, but niche scripts beyond those are unconfirmed.
Pricing
Pricing Type: Dynamic
0.03 per image
Dev questions, real answers.
Alibaba Wan 2.7 Image Edit is the latest iteration of Alibaba's Wan image editing model that applies instruction-guided modifications to existing images. It delivers improved semantic understanding and edit precision over previous Wan versions, supporting a wide range of changes including style updates, object edits, and scene modifications.
Alibaba Wan 2.7 Image Edit is available through the eachlabs unified API. Submit a source image and a text instruction describing the desired edit; the model returns a modified image. Billing is pay-as-you-go through eachlabs; no Alibaba account is required.
Alibaba Wan 2.7 Image Edit is best suited for product photo editing, creative image transformation, and marketing asset modification requiring high semantic accuracy. It is particularly effective for e-commerce and design workflows where complex, multi-element edit instructions need to be interpreted precisely at scale.
