WAN-2.7

Wan 2.7 Reference-to-Video generates videos with consistent character and object appearance from a reference image, supporting single or multi-shot scenes and optional motion guidance from video references.

Avg Run Time: 500.000s

Model Slug: alibaba-wan-2-7-reference-to-video

Release Date: April 3, 2026

Input

Prompt*

Reference Image*

Enter a URL or choose a file from your computer.

(Max 50MB)

Reference Video

Enter a URL or choose a file from your computer.

(Max 50MB)

First Frame

Enter a URL or choose a file from your computer.

(Max 50MB)

Reference Voice

Enter a URL or choose a file from your computer.

(Max 50MB)

Negative Prompt

Resolution

Aspect Ratio

Duration

Shot Type

Prompt Extend

Seed
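
The input fields above can be gathered into a single set of inputs before submission. The following is a minimal sketch, not the service's documented API: the snake_case key names and the `build_inputs` helper are hypothetical, chosen only to mirror the field labels on this page. The one thing taken from the page is that Prompt and Reference Image are marked required (*), while the remaining fields are optional.

```python
# Hypothetical key names derived from the field labels on this page;
# the real API may use different names.
REQUIRED_FIELDS = ("prompt", "reference_image")
OPTIONAL_FIELDS = (
    "reference_video", "first_frame", "reference_voice", "negative_prompt",
    "resolution", "aspect_ratio", "duration", "shot_type", "prompt_extend", "seed",
)


def build_inputs(**fields):
    """Check the fields against the lists above and return a clean dict."""
    allowed = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    unknown = sorted(set(fields) - allowed)
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Drop optional fields that were passed as None.
    return {key: value for key, value in fields.items() if value is not None}


inputs = build_inputs(
    prompt="The character walks through a rainy street at night",
    reference_image="https://example.com/character.png",  # placeholder URL
    resolution="1080P",
    duration=5,
    seed=None,
)
print(sorted(inputs))
```

Validating locally like this catches a missing required field before an upload or a billed run is attempted.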

Output

Example Result

Preview and download your result.

1080P pricing: $0.15/sec (default)

Pricing Type: Dynamic

Current Pricing

1080P pricing: $0.15/sec (default)

Estimated cost: $0.7500 (equivalent to 5 seconds at $0.15/sec)

Pricing Rules

Condition	Pricing
`resolution matches "720P"`	720P pricing: $0.10/sec
`Rule 2` (Active)	1080P pricing: $0.15/sec (default)
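
The pricing rules above amount to a per-second rate chosen by resolution. Here is a minimal sketch of that arithmetic, assuming cost scales linearly with duration and that any resolution without its own rule falls back to the default 1080P rate. The `estimate_cost` helper is hypothetical and not part of any published API.

```python
from decimal import Decimal

# Per-second rates in USD, taken from the pricing rules table.
RATES_PER_SECOND = {
    "720P": Decimal("0.10"),
    "1080P": Decimal("0.15"),
}
DEFAULT_RESOLUTION = "1080P"  # the table marks the 1080P rule as the default


def estimate_cost(duration_seconds, resolution=DEFAULT_RESOLUTION):
    """Return the estimated cost in USD for one generation."""
    # Assumption: unlisted resolutions are billed at the default rate.
    rate = RATES_PER_SECOND.get(resolution, RATES_PER_SECOND[DEFAULT_RESOLUTION])
    return rate * duration_seconds


print(estimate_cost(5))          # 0.75
print(estimate_cost(5, "720P"))  # 0.50
```

A 5-second clip at 1080P comes to $0.75, which is consistent with the estimated cost shown on the page.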

Related AI Models

Reference to Video

An advanced video generation model delivering cinematic visuals with native audio, realistic physics, and director-level camera control, supporting text, image, audio, and video inputs.

Bytedance | Seedance 2.0 | Reference to Video

200 s

Reference to Video

Transforms images, elements, and text into cohesive, high-quality video scenes while preserving character identity, object detail, and environmental consistency.

Kling | o3 | Pro | Reference to Video

20 s

Reference to Video

Analyze the style and structure of a video you admire with veo3-1-reference-to-video and replicate its visual language and motion structure in your new videos.

Veo 3.1 | Reference to Video

100 s

Reference to Video

PixVerse C1 Fusion composes videos from multiple reference images by combining subjects and environments into a single cohesive scene, supporting structured prompts, multi-image storytelling, and synchronized audio with smooth visual consistency.