WAN-2.5
Wan 2.5 Preview Image to Image transforms an input photo into a new, high-quality image while preserving the main structure and enhancing details with a realistic style.
Avg Run Time: 75s
Model Slug: wan-2-5-preview-image-to-image

API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
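A minimal sketch of the create step, using only the Python standard library. The endpoint URL, header names, and request field names here are assumptions for illustration; substitute the real values from the provider's API reference.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # assumption: bearer-token auth header
CREATE_URL = "https://api.example.com/v1/predictions"  # hypothetical endpoint

def build_create_payload(image_url, prompt, negative_prompt=None, seed=None):
    """Assemble the model inputs for a prediction request (field names assumed)."""
    inputs = {"image": image_url, "prompt": prompt}
    if negative_prompt:
        inputs["negative_prompt"] = negative_prompt
    if seed is not None:
        inputs["seed"] = seed
    return {"model": "wan-2-5-preview-image-to-image", "input": inputs}

def create_prediction(payload):
    """POST the payload; the response is expected to contain a prediction ID."""
    req = urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The payload builder is kept separate from the network call so inputs can be inspected or logged before any request is sent.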
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. Check the status repeatedly until you receive a success (or failure) status.
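A polling loop for the result step could look like the sketch below. The result URL and the status values ("success", "failed", "canceled") are assumptions; check the provider's API reference for the actual endpoint and status vocabulary. The default interval and timeout are sized around the ~75 s average run time noted above.

```python
import json
import time
import urllib.request

API_KEY = "YOUR_API_KEY"  # assumption: bearer-token auth header
RESULT_URL = "https://api.example.com/v1/predictions/{id}"  # hypothetical endpoint

def is_terminal(status):
    """True once the prediction has finished (status names assumed)."""
    return status in ("success", "failed", "canceled")

def get_prediction(prediction_id, interval=5.0, timeout=300.0):
    """Poll the result endpoint until a terminal status or the timeout expires."""
    url = RESULT_URL.format(id=prediction_id)
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {API_KEY}"}
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(req) as resp:
            result = json.load(resp)
        if is_terminal(result.get("status")):
            return result
        time.sleep(interval)  # wait before the next status check
    raise TimeoutError("prediction did not finish before the timeout")
```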
Readme
Overview
Wan 2.5 Preview Image to Image is an advanced AI image generator designed to transform an input photo into a new, high-quality image while preserving the main structure and enhancing details with a realistic style. Developed by Alibaba Cloud, the model builds upon previous WAN architectures and incorporates state-of-the-art techniques for photorealistic image synthesis. It is part of the WAN 2.5 family, which also includes image-to-video and sound-to-video variants, but the image-to-image model specifically focuses on static image enhancement and transformation.
Key features of Wan 2.5 Preview Image to Image include robust structure preservation, fine detail enhancement, and realistic style transfer. The model leverages deep learning architectures optimized for high-resolution outputs and supports advanced prompt engineering for nuanced control over the generated results. Its unique capabilities stem from improved dynamic rendering, optimized GPU utilization, and enhanced compatibility with large-scale workflows, making it suitable for professional and creative applications where image quality and fidelity are paramount.
Technical Specifications
- Architecture: Advanced deep learning architecture based on WAN 2.5, likely incorporating diffusion or transformer-based components
- Parameters: Not publicly specified, but related WAN models operate in the billion-parameter range
- Resolution: Supports high-resolution outputs, typically up to 1080p; input images recommended between 360 and 2000 pixels in width/height
- Input/Output formats: Accepts JPEG, JPG, PNG (no alpha), BMP, WEBP; outputs high-quality image files in standard formats
- Performance metrics: Optimized for fast inference and efficient GPU utilization; improved memory management for large batch processing and high-resolution workflows
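The format and resolution constraints above can be pre-checked client-side before uploading. This sketch encodes the limits as stated in the specification (360 to 2000 px per side, JPEG/JPG/PNG/BMP/WEBP, no alpha); the limits the API actually enforces may differ.

```python
ALLOWED_EXTS = {"jpeg", "jpg", "png", "bmp", "webp"}  # alpha channels unsupported
MIN_SIDE, MAX_SIDE = 360, 2000  # recommended width/height range in pixels

def validate_input(filename, width, height):
    """Return a list of problems; an empty list means the image passes the stated checks."""
    problems = []
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in ALLOWED_EXTS:
        problems.append(f"unsupported format: .{ext}")
    for side, label in ((width, "width"), (height, "height")):
        if not MIN_SIDE <= side <= MAX_SIDE:
            problems.append(f"{label} {side}px outside {MIN_SIDE}-{MAX_SIDE}px")
    return problems
```

Note this only checks filename extension and pixel dimensions; detecting an alpha channel inside a PNG would require opening the file with an image library.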
Key Considerations
- Ensure input images are within recommended resolution and format specifications to avoid processing errors
- For best results, use detailed and well-structured prompts that clearly describe desired enhancements or style changes
- Negative prompts can be used to exclude unwanted features or artifacts
- Quality and speed may vary depending on image complexity and hardware resources; higher resolutions may require more processing time
- Iterative refinement with prompt adjustments can significantly improve output quality
- Avoid using images with transparency (alpha channels), as these are not supported
Tips & Tricks
- Use high-quality, well-lit input images for optimal transformation results
- Structure prompts to specify both the desired style and any details to be preserved or enhanced (e.g., "preserve facial features, enhance background realism")
- Employ negative prompts to filter out common artifacts such as "low resolution, defects, worst quality"
- Experiment with prompt expansion features for more nuanced control, but review the actual prompt used if rewriting is enabled
- Adjust random seed values for reproducibility or to explore different variations
- For advanced results, combine iterative prompt refinement with batch processing to compare multiple outputs and select the best
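The last two tips (seed variation plus batch comparison) can be combined by generating one request input per seed from a shared base prompt. The field names `prompt`, `negative_prompt`, and `seed` here are assumptions about the request schema.

```python
def seed_sweep(base_input, seeds):
    """Produce one request input per seed so outputs can be compared side by side."""
    return [{**base_input, "seed": s} for s in seeds]

# Base input following the prompting tips above (field names assumed).
base = {
    "prompt": "preserve facial features, enhance background realism",
    "negative_prompt": "low resolution, defects, worst quality",
}
batch = seed_sweep(base, seeds=[1, 2, 3])
```

Submitting each entry in `batch` as a separate prediction yields three variations of the same transformation; re-using a seed later reproduces its result.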
Capabilities
- Transforms input photos into high-quality, realistic images while preserving core structure
- Enhances fine details and textures for photorealistic results
- Supports nuanced style transfer based on prompt instructions
- Handles a wide range of input formats and resolutions
- Optimized for efficient GPU usage and large-scale workflows
- Capable of batch processing for multiple images
- Robust against common image generation artifacts when negative prompts are used
What Can I Use It For?
- Professional photo enhancement and retouching for photographers and designers
- Creative style transfer for digital artists and illustrators
- Generating concept art and visual assets for games and media production
- Automated image upscaling and restoration in archival projects
- Personal projects such as transforming selfies or portraits with artistic effects
- Industry-specific applications including advertising, e-commerce product imagery, and cinematic pre-visualization
Things to Be Aware Of
- Some experimental features, such as prompt expansion, may yield unexpected results and should be reviewed for consistency
- Users report that memory usage can be significant for high-resolution or batch workflows; optimized VRAM management is recommended
- Occasional edge cases include minor artifacts or loss of detail in complex scenes, especially with ambiguous prompts
- Consistency across outputs is generally high, but random seed variation can affect reproducibility
- Positive feedback highlights the model’s photorealism, structure preservation, and versatility
- Common concerns include processing time for very large images and the need for careful prompt engineering to avoid unwanted artifacts
Limitations
- Requires substantial GPU resources for high-resolution or batch processing
- May struggle with highly abstract or ambiguous prompts, leading to less predictable results
- Not optimal for images with transparency or non-standard formats
Pricing
Pricing Type: Dynamic
Charged at $0.05 per generated image
Pricing Rules
| Parameter | Rule Type | Base Price |
|---|---|---|
| num_images | Per Unit (e.g. num_images: 1 × $0.05 = $0.05) | $0.05 |
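The per-unit rule above makes cost estimation a single multiplication, sketched here as a helper (the function name is illustrative, not part of any SDK):

```python
PRICE_PER_IMAGE = 0.05  # USD, per the pricing rule above

def estimated_cost(num_images):
    """Estimated charge in USD for a given number of generated images."""
    return round(num_images * PRICE_PER_IMAGE, 2)
```

For example, a batch of 20 images would be estimated at $1.00.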
