How do I use Alibaba Wan 2.7 Pro Text to Image via API?

Alibaba Wan 2.7 Pro Text to Image is accessible via the eachlabs unified API. Submit a text prompt and optional generation parameters; the model returns a high-quality generated image with the full capabilities of the Pro variant. Billing is pay-as-you-go through eachlabs.

How does Wan 2.7 Pro differ from the standard Wan 2.7 Text to Image?

Wan 2.7 Pro delivers higher resolution output, more refined detail, and stronger prompt fidelity than the standard Wan 2.7 variant. The standard model offers a faster, more cost-efficient option for general-purpose image generation. For production-grade commercial imagery where quality is the top priority, Wan 2.7 Pro is the recommended choice.

Alibaba Wan 2.7 Pro · Text to Image

Image · wan-2.7 · by Alibaba

Alibaba Wan 2.7 Pro Text to Image is the professional-tier variant of the Wan 2.7 generation model, delivering maximum image resolution, detail richness, and prompt accuracy within the Wan 2.7 family. Designed for production-grade commercial workflows, it generates high-fidelity visuals with superior compositional control and fine texture rendering. Recommended for e-commerce product imagery, advertising creative, and any application where image quality is the top priority.

Runtime (p50): 30s
Estimated price: $0.075 / image
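
For rough budgeting, the per-image estimate above can be scaled to a batch. This is a minimal sketch that assumes cost grows linearly with the number of images and that the listed $0.075 applies regardless of size or thinking mode, neither of which this page confirms. `estimate_cost` is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal

# Estimated price per image in USD, taken from the listing above.
PRICE_PER_IMAGE = Decimal("0.075")

def estimate_cost(num_images: int) -> Decimal:
    """Estimated cost in USD for num_images, assuming linear pricing (an assumption)."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE * num_images

# Example: a 12-image batch at the listed estimate.
print(estimate_cost(12))
```

Decimal arithmetic is used so that small per-image prices do not pick up binary floating-point rounding noise.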

Call the API

prediction.sh

# Create a prediction. Expects the API key in the EACHLABS_API_KEY environment variable.
curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "alibaba-wan-2-7-pro-text-to-image",
    "version": "0.0.1",
    "input": {
      "thinking_mode": true,
      "n": 1,
      "size": "2K",
      "prompt": "Ultra-photorealistic cinematic scene of a long train moving through desert dunes at sunset, warm golden light, dramatic shadows on sand, subtle motion blur, expansive sky, high detail textures, no people."
    },
    "webhook_url": ""
  }' \
  https://api.eachlabs.ai/v1/prediction/
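
The same request can be sketched in Python with only the standard library. The endpoint, header names, and body fields are copied from the curl example. The function names are hypothetical, and the code assumes the API replies with a JSON body, which this page does not document.

```python
import json
import os
import urllib.request

API_URL = "https://api.eachlabs.ai/v1/prediction/"  # endpoint from the curl example

def build_payload(prompt: str, size: str = "2K", n: int = 1,
                  thinking_mode: bool = True) -> dict:
    """Assemble a request body with the same fields as the curl example."""
    return {
        "model": "alibaba-wan-2-7-pro-text-to-image",
        "version": "0.0.1",
        "input": {
            "thinking_mode": thinking_mode,
            "n": n,
            "size": size,
            "prompt": prompt,
        },
        "webhook_url": "",
    }

def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Create the POST request object without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

def create_prediction(prompt: str) -> dict:
    """Send the request. Assumes (not confirmed here) that the reply is JSON."""
    request = build_request(build_payload(prompt), os.environ["EACHLABS_API_KEY"])
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_prediction(...)` needs a valid key in `EACHLABS_API_KEY`. `build_payload` and `build_request` make no network calls, so the request can be inspected before anything is sent.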

Documentation

Overview
Alibaba | Wan 2.7 | Pro | Text to Image is a professional-grade text-to-image model from Alibaba's Wan family. It generates and edits high-quality images at up to 4K resolution, with precise prompt adherence and advanced customization. Released on April 1 by Alibaba Tongyi Lab, the model uses a unified architecture and addresses traditional limitations such as standardized faces by enabling detailed avatar customization, from bone structure to facial features, so that faces are distinct across outputs.

As the Pro variant of Wan 2.7-Image, it prioritizes superior visual fidelity, interactive editing, and semantic understanding over speed, making it ideal for creators needing premium results on each::labs. Access the Alibaba | Wan 2.7 | Pro | Text to Image API via each::labs for seamless integration into workflows, supporting everything from single images to coherent sets of up to 12 visuals.
Capabilities
- Text-to-image generation up to 4K resolution with flexible aspect ratios and custom sizes.
- Advanced avatar customization, adjusting bone structure, eyes, and facial features for unique faces.
- Image editing with up to 9 reference images: style transfer, element swapping, fusion, and precise selection (add/move/align in specified areas).
- Coherent image set generation up to 12 images for visual storytelling.
- Palette function: Extract/input colors and proportions from references for custom schemes.
- Text rendering: ultra-long text, tables, formulas, and multilingual content (12 languages) at print quality, with up to 3K tokens of input.
- Thinking mode for enhanced prompt reasoning and quality.
Use cases
For designers: Create customized avatars with precise facial adjustments—"Portrait of a fantasy elf: pointed ears, emerald eyes, intricate bone structure"—ideal for game assets or branding on each::labs.

For marketers: Generate image sets for campaigns using palette extraction: "12 cohesive product visuals in brand colors extracted from logo reference, evolving scenes from day to night"—enabling storytelling ads.

For developers: Integrate via Alibaba | Wan 2.7 | Pro | Text to Image API for dynamic content: Edit user-uploaded photos with "Swap background to futuristic city, align elements pixel-perfectly" for personalized apps.

For creators: Produce print-ready infographics: "A4 page with tables, math formulas in Japanese, high-fidelity rendering"—perfect for educational visuals.
Tips & tricks
Write descriptive prompts for Alibaba | Wan 2.7 | Pro | Text to Image that spell out structure, colors, and style, so the model's semantic understanding has concrete details to align with. For example, specify "bone structure: angular jawline, eyes: almond-shaped with green irises."

Enable thinking mode for complex scenes to improve reasoning and reduce hallucinations. For editing, use up to 9 references in a structured description: "Fuse elements from image1 (subject) and image2 (background), add logo in bottom-right." Use the palette function by referencing colors: "Extract dominant blues from reference, adjust to 60% saturation."

Example prompts:
- "A cyberpunk cityscape at dusk with neon reflections on wet streets, 4K detail, cinematic lighting."
- "Customize avatar: wide-set eyes, high cheekbones, smiling expression, in steampunk attire."
- "Generate 12-image set of evolving seasons on a mountain landscape."
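
The "label: value" phrasing recommended in these tips can be generated from structured data, which keeps long prompts consistent across requests. This is an illustrative sketch only: `compose_prompt` is a hypothetical helper, and nothing on this page says the model parses the "label: value" form specially.

```python
def compose_prompt(subject: str, details: dict) -> str:
    """Join a subject with 'label: value' details, skipping empty values."""
    parts = [subject.strip()]
    parts += [f"{label}: {value}" for label, value in details.items() if value]
    return ", ".join(parts)

prompt = compose_prompt(
    "Portrait of a fantasy elf",
    {
        "bone structure": "angular jawline",
        "eyes": "almond-shaped with green irises",
    },
)
print(prompt)
# Portrait of a fantasy elf, bone structure: angular jawline, eyes: almond-shaped with green irises
```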
Technical spec
- Resolution Support: Up to 4K (4096×4096) for text-to-image; flexible aspect ratios and custom dimensions available.
- Output Formats: High-quality PNG/JPG images; supports image sets up to 12 coherent outputs.
- Input Parameters: Text prompt (up to 5,000 characters); up to 9 optional reference images for editing/style transfer; size options (1K, 2K, 4K); num_outputs (1-4, or 1-12 in image set mode; the request example on this page passes the image count as "n").
- Special Modes: Thinking mode (default for text-to-image, enhances reasoning); image_set_mode for multi-image generation.
- Processing Time: Longer than the base model because of the Pro quality focus (listed p50 runtime: 30s); thinking mode adds further time in exchange for better results.
- Architecture: Unified generation and understanding model with semantic mapping in shared latent space.
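
The limits in this list can be checked on the client before a request is sent. The sketch below is an assumption-laden illustration: `validate_input` and its argument names are hypothetical, the limits are copied from the list above, and the API may enforce them differently. The spec names the count parameter num_outputs, while the request example on this page uses `n`.

```python
ALLOWED_SIZES = ("1K", "2K", "4K")  # size options listed in the spec
MAX_PROMPT_CHARS = 5000             # prompt limit listed in the spec
MAX_REFERENCE_IMAGES = 9            # reference image limit listed in the spec

def validate_input(prompt: str, size: str = "2K", num_outputs: int = 1,
                   image_set_mode: bool = False, reference_images: int = 0) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if not prompt.strip():
        problems.append("prompt is empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if size not in ALLOWED_SIZES:
        problems.append("size must be one of " + ", ".join(ALLOWED_SIZES))
    # The spec lists 1-4 outputs normally and 1-12 in image set mode.
    max_outputs = 12 if image_set_mode else 4
    if not 1 <= num_outputs <= max_outputs:
        problems.append(f"num_outputs must be between 1 and {max_outputs}")
    if not 0 <= reference_images <= MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")
    return problems

print(validate_input("A cyberpunk cityscape at dusk", size="8K", num_outputs=6))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.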
Things to be aware of
Alibaba | Wan 2.7 | Pro | Text to Image may take longer with thinking mode or 4K outputs, so plan for extended processing on each::labs. Complex multi-image edits with 9 references can lead to inconsistencies if prompts lack structure.
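
One way to plan for extended processing is to wait with an explicit timeout instead of assuming an immediate result. This page does not document how results are retrieved (the request body has a webhook_url field, which suggests callbacks are one option), so this generic sketch takes the status check as a caller-supplied function. `wait_for_result` and `fetch_status` are hypothetical names.

```python
import time

def wait_for_result(fetch_status, timeout_s: float = 180.0, interval_s: float = 5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a non-None value or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        if result is not None:
            return result
        # Stop if the next wait would run past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"no result within {timeout_s} seconds")
        sleep(interval_s)
```

The default timeout is several times the listed p50 runtime of 30s. That figure is a judgment call rather than a documented limit, and 4K or thinking-mode requests may need more.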

A common mistake is a vague prompt, which causes misalignment between the request and the output; always specify details such as positioning. Rendering of very long, highly detailed text may be imperfect even within the 3K-token input limit. No local GPU is needed, but high-volume use consumes credits quickly.
Key considerations
Before using Alibaba | Wan 2.7 | Pro | Text to Image, note that the Pro variant favors quality over the base model's speed. 4K text-to-image is exclusive to this version, which suits professional output but requires more processing time on the each::labs platform.

No local hardware needed; leverage the Alibaba | Wan 2.7 | Pro | Text to Image API for cloud-based generation. Best for detailed, customized visuals like avatars or edited compositions rather than rapid prototyping. Consider credit-based pricing on each::labs, balancing high fidelity against generation duration.
Limitations
Alibaba | Wan 2.7 | Pro | Text to Image prioritizes quality, so it is slower than the base model. 4K output is limited to text-to-image and is not available for all edits.

A maximum of 9 input images is supported, and num_outputs is capped at 12 in set mode. Although the model is strong at customization, extremely abstract or physics-heavy scenes may show minor artifacts. The model is not optimized for video; it is image-only.

Related models

Recraft v4.1 · Text to Image · by Recraft

Nano Banana 2 Lite · Text to Image · by Google

Recraft v4.1 · Text to Vector · by Recraft

Recraft v4.1 Pro · Text to Vector · by Recraft

FAQ

About Alibaba Wan 2.7 Pro · Text to Image

What is Alibaba Wan 2.7 Pro Text to Image?

Alibaba Wan 2.7 Pro Text to Image is the professional-tier variant of the Wan 2.7 text-to-image model, delivering maximum image quality, detail, and prompt accuracy within the Wan 2.7 family. It is optimized for commercial and production-grade image generation requiring the highest level of Wan's visual output.