How do I use Alibaba Wan 2.7 Text to Image via API?

Alibaba Wan 2.7 Text to Image is accessible via the eachlabs unified API. Submit a text prompt with optional style or resolution parameters; the model returns a generated image. Billing is pay-as-you-go through eachlabs; no Alibaba account is required.

How does Wan 2.7 Text to Image compare to Wan v2.6?

Wan 2.7 introduces improvements in prompt fidelity, image sharpness, and compositional control over Wan v2.6. It handles more complex scene descriptions with greater accuracy. For new production workflows requiring the best Wan output quality, Wan 2.7 is the recommended choice; Wan v2.6 remains a stable fallback for existing integrations.

Alibaba Wan 2.7 · Text to Image

Image · wan-2.7 · by Alibaba

Alibaba Wan 2.7 Text to Image is the latest generation of Alibaba's Wan image generation model, delivering significant improvements in prompt fidelity, compositional accuracy, and visual detail over previous Wan versions. It produces photorealistic and artistic images across diverse styles from natural language descriptions with stronger semantic understanding. Ideal for marketing asset generation, product concept visualization, and creative workflows requiring precise, high-quality text-to-image output.

API reference

Runtime (p50): 5s
Estimated price: $0.03 / image
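
As a rough budgeting aid, the per-image estimate above can be multiplied out for a batch. This is a minimal sketch: `estimate_cost` is a hypothetical helper, and it assumes every generated image is billed at the listed estimate, which the page does not state outright.

```python
from decimal import Decimal

# "Estimated price" from the listing above; actual billing may differ.
PRICE_PER_IMAGE_USD = Decimal("0.03")


def estimate_cost(images: int) -> Decimal:
    """Return the estimated cost in USD for a number of generated images."""
    if images < 0:
        raise ValueError("images must be non-negative")
    # Decimal avoids the rounding noise that binary floats add to 0.03.
    return PRICE_PER_IMAGE_USD * images
```

For example, `estimate_cost(12)` gives `Decimal("0.36")` for a full 12-image set.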

Call the API

prediction.sh

curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "alibaba-wan-2-7-text-to-image",
    "version": "0.0.1",
    "input": {
        "n": 1,
        "size": "2K",
        "prompt": "Ultra-photorealistic cinematic scene of a futuristic floating island city above the clouds, waterfalls flowing into the sky below, soft golden sunlight, detailed architecture, atmospheric haze, epic wide-angle view, no people.",
        "thinking_mode": true
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
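
The same request can be sketched in Python using only the standard library. The endpoint, headers, model slug, version, and input fields are copied from the curl call above; the function names `build_request` and `submit` are hypothetical, and because the response format is not described here, the sketch returns the decoded JSON as-is rather than assuming any field names.

```python
import json
import urllib.request

API_URL = "https://api.eachlabs.ai/v1/prediction/"


def build_request(prompt: str, api_key: str, size: str = "2K", n: int = 1,
                  thinking_mode: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in prediction.sh."""
    payload = {
        "model": "alibaba-wan-2-7-text-to-image",
        "version": "0.0.1",
        "input": {
            "n": n,
            "size": size,
            "prompt": prompt,
            "thinking_mode": thinking_mode,
        },
        "webhook_url": "",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON body, whatever its shape."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real, billable API call):
#   import os
#   req = build_request("A floating island city above the clouds, no people.",
#                       os.environ["EACHLABS_API_KEY"])
#   print(submit(req))
```

Keeping request construction separate from sending makes it possible to inspect or log the payload without spending credits.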

Documentation (8 sections)

Overview
Alibaba | Wan 2.7 | Text to Image is a unified model from Alibaba Tongyi Lab that excels in generating and editing high-quality images from text prompts, solving the challenge of creating detailed, customizable visuals without standardized "AI faces." Part of the Wan family, this April 2026 release introduces advanced avatar customization and superior text rendering, setting it apart with bone-level facial adjustments and print-quality output for complex elements like tables and formulas. Users benefit from its end-to-end architecture that ensures precise prompt adherence and multilingual support across 12 languages. On each::labs, access the Alibaba | Wan 2.7 | Text to Image API for seamless integration into creative workflows, delivering richly detailed images up to 4K resolution in the Pro variant.
Capabilities
- Generates high-quality text-to-image outputs up to 4K in Pro mode with flexible resolutions like 1K, 2K, or custom sizes.
- Advanced avatar customization at bone level, adjusting structure, eyes, and features for unique, non-standardized faces.
- Superior text rendering: print-quality long texts, tables, formulas, and infographics across 12 languages up to 3K tokens.
- Image editing with up to 9 references for style transfer, element swapping, or multi-image fusion.
- Palette extraction: One-click color scheme matching from references, adjustable proportions.
- Thinking mode enhances prompt interpretation for better quality and composition stability.
- Image set generation: Coherent outputs of 1-12 images in set mode.
- Unified architecture for seamless generation-to-editing transitions.
Use cases
For designers: Create custom avatars by fine-tuning bone structures—"Portrait of executive with sharp jawline, piercing blue eyes, professional attire"—leveraging bone-level adjustments for branding assets.

For marketers: Generate infographics with precise text and palettes: "A4 poster with sales table, pie chart formula, brand colors extracted from logo image"—ideal for multilingual campaigns supporting 12 languages.

For developers: Integrate via each::labs Alibaba | Wan 2.7 | Text to Image API to edit product visuals, fusing up to 9 references: "Restyle smartphone mockups by blending three angle photos with futuristic glow."

For creators: Produce coherent image sets in thinking mode: "Series of 6 fantasy landscapes evolving from dawn to dusk, matching reference color scheme"—perfect for storyboarding with stable composition.
Tips & tricks
Optimize prompts for Alibaba | Wan 2.7 | Text to Image with the following practices:
- For avatars, specify bone structure, eye shape, and facial details, e.g., "A portrait of a young woman with high cheekbones, almond-shaped green eyes, and subtle freckles across the nose."
- Use the palette function by referencing colors and proportions: "Generate a landscape in the exact color scheme of a sunset photo, with 40% orange, 30% purple hues."
- Enable thinking mode for complex scenes to improve reasoning and prompt adherence.
- For text-heavy outputs, include layout instructions: "Create an A4 page with a table of sales data, math formulas below, in Japanese and English."
- Combine up to 9 reference images for fusion, like "Fuse styles from three fashion photos into a new outfit design"; use image set mode when you need a coherent series.
- Test with shorter prompts first, then refine before scaling toward the 5,000-character limit.
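
The palette tip above amounts to appending color proportions to a prompt. A small sketch of that string formatting follows; `palette_prompt` is a hypothetical helper that only builds text, and how closely the model honors numeric proportions is not something this code can guarantee.

```python
def palette_prompt(subject: str, shares: dict) -> str:
    """Append colour proportions (in percent) to a prompt subject."""
    if not shares:
        raise ValueError("at least one colour share is required")
    if any(pct <= 0 for pct in shares.values()):
        raise ValueError("each share must be positive")
    if sum(shares.values()) > 100:
        raise ValueError("shares must sum to at most 100 percent")
    # Dicts keep insertion order, so colours appear as the caller listed them.
    parts = ", ".join(f"{pct}% {name}" for name, pct in shares.items())
    return f"{subject}, with {parts} hues."
```

Calling `palette_prompt("Generate a landscape in the exact color scheme of a sunset photo", {"orange": 40, "purple": 30})` reproduces the example prompt from the tip.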
Technical spec
- Resolution Support: Up to 2K (2048×2048) for standard version; 4K (4096×4096) for Image Pro variant, with flexible aspect ratios and custom dimensions.
- Input Formats: Text prompts up to 5,000 characters; optional up to 9 reference images for editing, style transfer, or fusion.
- Output Formats: High-quality PNG/JPG images; supports num_outputs of 1-4 (or 1-12 in image set mode).
- Text Handling: Up to 3,000 tokens input, print-grade rendering for ultra-long texts, tables, math formulas, and multilingual layouts equivalent to an A4 page.
- Special Modes: Thinking mode for enhanced reasoning (default for text-to-image); image_set_mode for coherent sets.
- Architecture: Unified generation and understanding with shared latent space semantic mapping.
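
The limits listed above can be checked client-side before a request is sent. The sketch below is an assumption-laden illustration: `check_limits` and its parameter names are hypothetical, the numbers are taken from this spec list, and the API itself may enforce different or additional rules.

```python
from typing import List

MAX_PROMPT_CHARS = 5000        # "Text prompts up to 5,000 characters"
MAX_REFERENCE_IMAGES = 9       # "up to 9 reference images"
MAX_OUTPUTS = 4                # "1-4" outputs per request
MAX_OUTPUTS_SET_MODE = 12      # "1-12 in image set mode"


def check_limits(prompt: str, n: int = 1, reference_count: int = 0,
                 image_set_mode: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if not prompt.strip():
        problems.append("prompt is empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"prompt has {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}")
    max_n = MAX_OUTPUTS_SET_MODE if image_set_mode else MAX_OUTPUTS
    if not 1 <= n <= max_n:
        problems.append(f"n={n} is outside 1-{max_n}")
    if not 0 <= reference_count <= MAX_REFERENCE_IMAGES:
        problems.append(
            f"reference_count={reference_count} is outside 0-{MAX_REFERENCE_IMAGES}")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues in a single pass.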
Things to be aware of
Alibaba | Wan 2.7 | Text to Image performs best with detailed prompts; vague inputs may lead to less precise compositions even with thinking mode enabled. Edge cases such as complex multi-element fusions with 9 images can increase processing time significantly. Common mistakes include exceeding the 5,000-character prompt limit and using low-quality reference images, which results in suboptimal blends. The model currently runs only in the cloud; local runs must wait for open weights. Multilingual text rendering is strong in the supported languages, but rare dialects may render poorly. Always preview a single output before batching to catch minor alignment issues.
Key considerations
Before using Alibaba | Wan 2.7 | Text to Image, note that the Pro variant offers 4K output but prioritizes quality over speed compared to the standard 2K model. The model needs detailed prompts for optimal results, especially with reference images (up to 9), so it suits professional editing better than simple generations, and scenarios needing precise customization such as avatars or documents better than rapid prototyping. On each::labs, the Alibaba | Wan 2.7 | Text to Image API handles processing in the cloud, so no local GPU is required. Thinking mode increases generation time; disable it for faster iterations.
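
One way to act on the speed-versus-quality trade-off is to keep two input presets: one for quick drafts and one for final renders. This is only a sketch. The `n`, `size`, and `thinking_mode` field names come from the API example on this page, but the `"1K"` size value and the preset names are assumptions.

```python
# Draft: thinking mode off for faster iterations (assumes "1K" is accepted).
DRAFT_PRESET = {"n": 1, "size": "1K", "thinking_mode": False}
# Final: thinking mode on and the standard model's 2K ceiling.
FINAL_PRESET = {"n": 1, "size": "2K", "thinking_mode": True}


def make_input(prompt: str, final: bool = False) -> dict:
    """Return an `input` object for the request body using one of the presets."""
    preset = FINAL_PRESET if final else DRAFT_PRESET
    # Copy the preset so callers can modify the result without side effects.
    return {**preset, "prompt": prompt}
```

Iterating on wording with `make_input(prompt)` and switching to `make_input(prompt, final=True)` once the composition is settled keeps the slower mode for the renders that matter.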
Limitations
The standard version of Alibaba | Wan 2.7 | Text to Image caps at 2K resolution; 4K is exclusive to Pro. The model is image-only: although the Wan family includes video models, it has no native video output, animation, motion, duration, or audio support, and editing is limited to static fusions. Hyper-detailed physics and rare subjects may render inconsistently without strong references. Open weights are pending, so the model is currently available only through the cloud API.

Related models

4 models

GPT Image v2 · Text to Image · OpenAI

Microsoft MAI-Image-2.5 · Text to Image · Microsoft

Luma Uni-1 Max · Text to Image · Luma

Krea 2 Medium Turbo · Text to Image · Krea

FAQ

About Alibaba Wan 2.7 · Text to Image

What is Alibaba Wan 2.7 Text to Image?

Alibaba Wan 2.7 Text to Image is the latest generation of Alibaba's Wan text-to-image model, delivering improved prompt adherence, higher visual fidelity, and stronger compositional accuracy over previous Wan versions. It generates photorealistic and artistic images across diverse styles from natural language descriptions.