WAN-V2.2
Bring your static images to life with the advanced physics engine of wan-2-2-i2v; create fluid videos with high motion consistency while preserving object integrity.
Avg Run Time: 85s
Model Slug: wan-2-2-i2v
Playground
Input
Provide an image as a URL or choose a file from your computer (max 50MB).
Output
Preview and download your result.
API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
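The request described above can be sketched in Python. This is a minimal stdlib-only sketch: the endpoint URL, auth header name, and payload field names are assumptions for illustration, so check the Eachlabs API reference for the exact schema.

```python
import json
import urllib.request

def build_payload(image_url, prompt, resolution="720p"):
    """Assemble the model inputs for a wan-2-2-i2v prediction request."""
    return {
        "model": "wan-2-2-i2v",
        "input": {
            "image": image_url,        # source image to animate (URL)
            "prompt": prompt,          # text guidance for the motion
            "resolution": resolution,  # 480p, 720p, or 1080p
        },
    }

def create_prediction(api_key, payload):
    """POST the payload and return the prediction ID used to poll for the result."""
    req = urllib.request.Request(
        "https://api.eachlabs.ai/v1/prediction",  # assumed endpoint
        data=json.dumps(payload).encode(),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["id"]
```

Keeping payload construction separate from the HTTP call makes the request shape easy to inspect and test before spending an execution.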
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
Readme
Overview
wan-2-2-i2v — Image-to-Video AI Model
Developed by Alibaba as part of the wan-v2.2 family, wan-2-2-i2v transforms static images into dynamic 5-second videos at up to 1080p resolution, leveraging an advanced physics engine for fluid motion and object preservation. The model accepts a text prompt alongside the input image to guide realistic animation (no audio is generated), bringing photos to life with high motion consistency. It runs roughly 50% faster than prior versions and delivers MP4 output at 30 fps in 480p, 720p, or 1080p.
Technical Specifications
What Sets wan-2-2-i2v Apart
wan-2-2-i2v stands out in the competitive image-to-video AI model landscape with its focus on speed and stability upgrades specific to the wan-v2.2 family. It processes inputs 50% faster than wan 2.1 models, enabling quick generation of 5-second clips ideal for developers needing efficient wan-2-2-i2v API integrations. This speed allows real-time prototyping without compromising on 1080p output quality at 30 fps in MP4 (H.264) format.
- Multi-resolution support up to 1080p: Generates videos in 480p, 720p, or 1080p from a single image plus text, preserving fine details in high-res outputs that many open-source alternatives limit to 720p. This enables crisp animations for professional previews without upscaling artifacts.
- Enhanced stability over wan 2.1: Comprehensive improvements in motion consistency and success rates ensure objects maintain integrity during animation. Users benefit from reliable physics-based movements, reducing failed generations common in earlier models.
- Flash variant efficiency: The wan2.2-i2v-flash option prioritizes rapid inference on standard hardware, supporting text-image inputs for 5s durations. This makes it perfect for high-volume Alibaba image-to-video workflows like batch processing product shots.
Technical specs include 5-second video duration, 30 fps frame rate, and no audio output, with average processing optimized for cloud APIs.
Key Considerations
- The model requires a GPU with at least 80GB VRAM for optimal performance at higher resolutions
- For best results, ensure the aspect ratio of the input image matches the desired output video
- The MoE architecture automatically manages expert switching based on signal-to-noise ratio, so manual tuning is minimal
- Prompt engineering is important: detailed and context-rich prompts yield more accurate and visually appealing videos
- Quality and speed are balanced by the MoE design, but higher resolutions and longer videos will increase inference time
- Avoid using low-quality or ambiguous input images, as these can degrade output quality
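The aspect-ratio advice above can be checked programmatically before submitting a job. A minimal stdlib sketch, where the tolerance value is an illustrative assumption:

```python
def aspect_matches(img_w, img_h, out_w, out_h, tol=0.02):
    """Return True if the input image's aspect ratio is within `tol`
    of the desired output video's aspect ratio."""
    return abs(img_w / img_h - out_w / out_h) <= tol
```

For example, a 3840x2160 photo matches a 1920x1080 output exactly, while a square 1000x1000 image does not and would likely be cropped or distorted.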
Tips & Tricks
How to Use wan-2-2-i2v on Eachlabs
Access wan-2-2-i2v through the Eachlabs Playground for instant testing, the API for production-scale image-to-video deployments, or the SDKs for custom integrations. Upload a reference image and a text prompt specifying the desired motion (e.g. "animate with realistic physics"), select a resolution (480p-1080p) and the 5s duration, then receive a high-fidelity 30 fps MP4 with preserved details.
Capabilities
- Generates high-quality, realistic videos from single static images
- Supports 480p, 720p, and 1080p output resolutions
- Maintains temporal coherence and visual consistency across frames
- Adaptable to a wide range of visual styles and subject matter based on prompt input
- Efficient inference enabled by the MoE architecture, with only one expert active per step
- Capable of running on consumer-grade GPUs with sufficient VRAM at lower resolutions
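The MoE routing mentioned above (one expert active per denoising step, switched on signal-to-noise ratio) can be illustrated with a toy sketch. The threshold value and expert names here are illustrative assumptions, not the model's actual internals:

```python
def pick_expert(snr, threshold=1.0):
    """Route a denoising step to one expert based on signal-to-noise ratio:
    noisy early steps go to a high-noise expert, clean late steps to a
    low-noise refinement expert. Only the chosen expert runs, which is
    why MoE inference stays efficient."""
    return "high_noise_expert" if snr < threshold else "low_noise_expert"
```

Because only one expert executes per step, compute per step stays close to that of a single dense model of the same expert size.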
What Can I Use It For?
Use Cases for wan-2-2-i2v
Content creators animating product photos for e-commerce can upload a static image of a watch with the prompt "the watch hands smoothly rotate on a luxury velvet background with subtle lighting shifts," producing a 5-second 1080p loop that highlights details without distortion.
Marketers building social media teasers use wan-2-2-i2v to turn lifestyle shots into engaging clips, feeding an image plus "gentle waves lapping at a beach sunset with palm leaves swaying," ensuring motion fidelity for ads that capture attention better than static posts.
Developers integrating wan-2-2-i2v API into apps for personalized previews process user-uploaded portraits with prompts like "add a soft smile and head tilt in natural lighting," generating consistent animations for avatar tools or virtual try-ons.
Designers prototyping UI mockups animate static wireframes, inputting an app screenshot and "buttons pulse gently with icons sliding into place," to create demo videos that showcase interactions with precise object preservation.
Things to Be Aware Of
- Some experimental features may behave unpredictably, especially with highly abstract or unconventional prompts
- Users have noted occasional artifacts or inconsistencies in complex scenes with rapid motion or intricate backgrounds
- Performance is heavily dependent on GPU resources; lower VRAM may limit resolution or increase inference time
- Consistency across multiple runs is generally high, but minor variations can occur due to the stochastic nature of diffusion models
- Positive feedback highlights the model’s ability to generate detailed, visually appealing videos with minimal manual tuning
- Some users report that prompt specificity greatly influences output quality, emphasizing the importance of prompt engineering
- Negative feedback patterns include occasional frame flicker or loss of detail in challenging scenarios
Limitations
- Requires high-end GPU hardware (80GB VRAM recommended) for full-resolution video generation
- May struggle with highly complex scenes, rapid motion, or ambiguous input images
- Not optimal for real-time or low-latency applications due to computational demands
Pricing
Pricing Detail
This model runs at a cost of $0.41 per execution.
Pricing Type: Fixed
The cost is the same for every run, regardless of input or how long the generation takes. There are no variable factors affecting the price: it is a set, fixed amount per execution, which makes budgeting simple and predictable because you pay the same fee every time you run the model.
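Because pricing is fixed per execution, budgeting reduces to a simple product. A quick sketch using the $0.41 rate from above (the helper name and 30-day month are just illustrative choices):

```python
COST_PER_RUN = 0.41  # USD per execution, fixed

def monthly_cost(runs_per_day, days=30):
    """Estimate total spend: runs/day x days x fixed per-run cost."""
    return runs_per_day * days * COST_PER_RUN
```

For instance, 100 generations per day comes to about $1,230 over a 30-day month.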
Related AI Models
You can seamlessly integrate advanced AI capabilities into your applications without the hassle of managing complex infrastructure.
