Eachlabs | AI Workflows for app builders
alibaba-wan-2.7-pro-text-to-image

WAN-2.7

Alibaba Wan 2.7 Pro Text to Image is the professional-tier variant of the Wan 2.7 generation model, delivering maximum image resolution, detail richness, and prompt accuracy within the Wan 2.7 family. Designed for production-grade commercial workflows, it generates high-fidelity visuals with superior compositional control and fine texture rendering. Recommended for e-commerce product imagery, advertising creative, and any application where image quality is the top priority.

Avg Run Time: 50.000s

Model Slug: alibaba-wan-2-7-pro-text-to-image

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
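A minimal sketch of this step in Python, using only the standard library. The endpoint URL, the `X-API-Key` header name, and the `model`/`input` body fields are assumptions made for illustration, since this page does not spell out the request schema; substitute the values from the official API reference. Only the model slug is taken from this page.

```python
import json
import urllib.request

# Hypothetical values: this page does not give the endpoint URL, the auth
# header name, or the request schema. Replace them with the documented ones.
API_URL = "https://api.example.com/v1/prediction"
API_KEY_HEADER = "X-API-Key"
MODEL_SLUG = "alibaba-wan-2-7-pro-text-to-image"  # slug listed on this page


def build_create_request(api_key: str, prompt: str, **extra_inputs) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a prediction."""
    body = {"model": MODEL_SLUG, "input": {"prompt": prompt, **extra_inputs}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )


def create_prediction(api_key: str, prompt: str, **extra_inputs) -> dict:
    """Send the request and return the decoded JSON response.

    The response is expected to carry the prediction ID; the exact field
    name is not stated on this page, so the full response is returned.
    """
    request = build_create_request(api_key, prompt, **extra_inputs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the payload without a network call or a valid API key.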

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
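The polling loop described above might look like the following sketch. The `status` field and the failure values `"error"` and `"failed"` are assumptions; only the existence of a success status is stated on this page. The function takes a caller-supplied `fetch_prediction` callable (hypothetical) so the HTTP details stay outside the loop.

```python
import time
from typing import Callable


def wait_for_result(
    fetch_prediction: Callable[[str], dict],
    prediction_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_prediction repeatedly until it reports success.

    Raises RuntimeError on an assumed failure status and TimeoutError
    once timeout_s has elapsed without a terminal status.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        prediction = fetch_prediction(prediction_id)
        status = prediction.get("status")
        if status == "success":
            return prediction
        if status in ("error", "failed"):  # assumed failure status names
            raise RuntimeError(f"prediction {prediction_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {prediction_id} not ready after {timeout_s}s")
        sleep(interval_s)
```

With an average run time of about 50 seconds listed for this model, a timeout of a few minutes and an interval of a couple of seconds are reasonable starting points rather than documented values.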

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

Alibaba | Wan 2.7 | Pro | Text to Image is a professional-grade text-to-image model from Alibaba's Wan family. It generates and edits images at up to 4K resolution, with precise prompt adherence and advanced customization. Released on April 1 by Alibaba Tongyi Lab, the model uses a unified architecture and avoids the standardized faces common in image generators by supporting detailed avatar customization, from bone structure to individual facial features, so that outputs vary from prompt to prompt.

As the Pro variant of Wan 2.7-Image, it prioritizes visual fidelity, interactive editing, and semantic understanding over speed, which suits creators who need premium results on each::labs. The Alibaba | Wan 2.7 | Pro | Text to Image API is available through each::labs for integration into workflows and supports anything from single images to coherent sets of up to 12 visuals.

Technical Specifications

  • Resolution Support: Up to 4K (4096×4096) for text-to-image; flexible aspect ratios and custom dimensions available.
  • Output Formats: High-quality PNG/JPG images; supports image sets up to 12 coherent outputs.
  • Input Parameters: Text prompt (up to 5,000 characters); up to 9 optional reference images for editing and style transfer; size options (1K, 2K, 4K); num_outputs (1-4, or 1-12 in image set mode).
  • Special Modes: Thinking mode (default for text-to-image, enhances reasoning); image_set_mode for multi-image generation.
  • Processing Time: Longer than the base model because the Pro variant favors quality; thinking mode adds further time in exchange for better results.
  • Architecture: Unified generation and understanding model with semantic mapping in shared latent space.
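The limits listed above can be checked on the client before a request is sent. The sketch below is illustrative only: `num_outputs` and `image_set_mode` are named in the specifications, while the other field names (`prompt`, `size`, `reference_images`) and the default values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PROMPT_CHARS = 5000        # "up to 5,000 characters"
MAX_REFERENCE_IMAGES = 9       # "up to 9 reference images"
ALLOWED_SIZES = ("1K", "2K", "4K")


@dataclass
class GenerationInput:
    # Field names other than num_outputs and image_set_mode are hypothetical.
    prompt: str
    size: str = "1K"
    num_outputs: int = 1
    image_set_mode: bool = False
    reference_images: List[str] = field(default_factory=list)


def validate(inp: GenerationInput) -> List[str]:
    """Return a list of problems; an empty list means the input fits the listed limits."""
    problems = []
    if not inp.prompt.strip():
        problems.append("prompt is empty")
    if len(inp.prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if inp.size not in ALLOWED_SIZES:
        problems.append(f"size must be one of {ALLOWED_SIZES}")
    if len(inp.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")
    # num_outputs is 1-4 normally and 1-12 in image set mode.
    max_outputs = 12 if inp.image_set_mode else 4
    if not 1 <= inp.num_outputs <= max_outputs:
        problems.append(f"num_outputs must be between 1 and {max_outputs}")
    return problems
```

Returning every problem at once, instead of raising on the first, lets a caller show all issues to the user in a single pass.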

Key Considerations

Alibaba | Wan 2.7 | Pro | Text to Image emphasizes quality over the base model's speed, and 4K text-to-image is exclusive to the Pro variant. It suits professional outputs but requires more processing time on the each::labs platform.

No local hardware is needed: the Alibaba | Wan 2.7 | Pro | Text to Image API runs generation in the cloud. The model is best suited to detailed, customized visuals such as avatars or edited compositions rather than rapid prototyping. Pricing on each::labs is credit-based, so weigh the higher fidelity against generation duration.

Tips & Tricks

Write descriptive prompts for Alibaba | Wan 2.7 | Pro | Text to Image that spell out structure, colors, and styles; the model's semantic understanding aligns more precisely with explicit detail. For example, specify "bone structure: angular jawline, eyes: almond-shaped with green irises."

Enable thinking mode for complex scenes to improve reasoning and reduce hallucinations. For editing, use up to 9 references in a structured description: "Fuse elements from image1 (subject) and image2 (background), add logo in bottom-right." Use the palette function by referencing colors: "Extract dominant blues from reference, adjust to 60% saturation."

Example prompts:

  • "A cyberpunk cityscape at dusk with neon reflections on wet streets, 4K detail, cinematic lighting."
  • "Customize avatar: wide-set eyes, high cheekbones, smiling expression, in steampunk attire."
  • "Generate 12-image set of evolving seasons on a mountain landscape."
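The structured multi-reference description suggested in the tips above can be assembled programmatically. The helper below is a hypothetical sketch: it follows the `image1`, `image2` naming used in the sample edit prompt, but whether the model requires that exact convention is an assumption.

```python
from typing import Sequence

MAX_REFERENCE_IMAGES = 9  # limit stated in the specifications


def build_fusion_prompt(roles: Sequence[str], instructions: Sequence[str] = ()) -> str:
    """Compose an edit prompt that names each reference image and its role.

    roles[i] describes what reference image i+1 contributes (for example
    "subject" or "background"); instructions are appended as extra clauses.
    """
    if not 1 <= len(roles) <= MAX_REFERENCE_IMAGES:
        raise ValueError(f"expected 1 to {MAX_REFERENCE_IMAGES} reference roles")
    labelled = [f"image{i} ({role})" for i, role in enumerate(roles, start=1)]
    if len(labelled) == 1:
        sources = labelled[0]
    else:
        sources = ", ".join(labelled[:-1]) + " and " + labelled[-1]
    parts = [f"Fuse elements from {sources}", *instructions]
    return ", ".join(parts) + "."


prompt = build_fusion_prompt(["subject", "background"], ["add logo in bottom-right"])
print(prompt)
```

Generating the prompt from a list of roles keeps the image numbering consistent with the order in which the reference images are supplied.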

Capabilities

  • Text-to-image generation up to 4K resolution with flexible aspect ratios and custom sizes.
  • Advanced avatar customization, adjusting bone structure, eyes, and facial features for unique faces.
  • Image editing with up to 9 reference images: style transfer, element swapping, fusion, and precise selection (add/move/align in specified areas).
  • Coherent image set generation up to 12 images for visual storytelling.
  • Palette function: Extract/input colors and proportions from references for custom schemes.
  • Text rendering: Renders ultra-long text, tables, formulas, and multilingual content (12 languages) at print quality, with up to 3K tokens of input.
  • Thinking mode for enhanced prompt reasoning and quality.

What Can I Use It For?

For designers: Create customized avatars with precise facial adjustments—"Portrait of a fantasy elf: pointed ears, emerald eyes, intricate bone structure"—ideal for game assets or branding on each::labs.

For marketers: Generate image sets for campaigns using palette extraction: "12 cohesive product visuals in brand colors extracted from logo reference, evolving scenes from day to night"—enabling storytelling ads.

For developers: Integrate the Alibaba | Wan 2.7 | Pro | Text to Image API to serve dynamic content, for example editing user-uploaded photos in personalized apps with "Swap background to futuristic city, align elements pixel-perfectly."

For creators: Produce print-ready infographics: "A4 page with tables, math formulas in Japanese, high-fidelity rendering"—perfect for educational visuals.

Things to Be Aware Of

Alibaba | Wan 2.7 | Pro | Text to Image may take longer with thinking mode or 4K outputs, so plan for extended processing on each::labs. Complex multi-image edits with 9 references can lead to inconsistencies if prompts lack structure.

A common mistake is a vague prompt, which causes misalignment; always specify details such as positioning. Very long or highly detailed text may not render perfectly, even within the 3K-token input limit. No local GPU is needed, but high-volume use consumes credits quickly.

Limitations

Alibaba | Wan 2.7 | Pro | Text to Image prioritizes quality, so it runs slower than the base model. 4K output is limited to text-to-image generation and is not available for all edits.

At most 9 input images are accepted, and num_outputs is capped at 12 in set mode. Although customization is strong, extremely abstract or physics-heavy scenes may show minor artifacts. The model is image-only and is not optimized for video.

Pricing

Pricing Type: Dynamic

0.075/Per image pricing
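As a rough budgeting aid, the per-image price can be multiplied out as sketched below. The page does not state the currency or credit unit, and because the pricing type is listed as dynamic the rate may change; the sketch also assumes that every image in a multi-image set is billed individually.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.075")  # listed rate; unit not stated on this page


def estimate_cost(num_images: int) -> Decimal:
    """Estimate the charge for num_images, assuming each image is billed at the listed rate."""
    if num_images < 0:
        raise ValueError("num_images must not be negative")
    return PRICE_PER_IMAGE * num_images


# A full 12-image set and a 1,000-image batch, in the page's (unstated) unit.
print(estimate_cost(12))
print(estimate_cost(1000))
```

`Decimal` is used instead of `float` so that repeated multiplication of 0.075 does not accumulate binary rounding error.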

FREQUENTLY ASKED QUESTIONS

Dev questions, real answers.

Alibaba Wan 2.7 Pro Text to Image is the professional-tier variant of the Wan 2.7 text-to-image model, delivering maximum image quality, detail, and prompt accuracy within the Wan 2.7 family. It is optimized for commercial and production-grade image generation requiring the highest level of Wan's visual output.

Alibaba Wan 2.7 Pro Text to Image is accessible via the eachlabs unified API. Submit a text prompt and optional generation parameters; the model returns a high-quality generated image with the full capabilities of the Pro variant. Billing is pay-as-you-go through eachlabs.

Wan 2.7 Pro delivers higher resolution output, more refined detail, and stronger prompt fidelity than the standard Wan 2.7 variant. The standard model offers a faster, more cost-efficient option for general-purpose image generation. For production-grade commercial imagery where quality is the top priority, Wan 2.7 Pro is the recommended choice.