Alibaba Wan 2.7 · Video Edit

Video·wan-2.7·by Alibaba

Wan 2.7 Video Edit applies instruction-based edits, reference-image-based edits, or style transfer to existing videos. It supports 720P and 1080P resolutions, preserves or regenerates audio, and accepts input videos of 2-10 seconds.

Runtime (p50): 5m
Estimated price: From $0.1
Call the API
prediction.sh
curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "alibaba-wan-2-7-video-edit",
    "version": "0.0.1",
    "input": {
        "prompt": "Transform the entire scene into a detailed black-and-white pencil sketch. Use fine linework, soft shading, and cross-hatching techniques. Emphasize light and shadow with smooth gradients and delicate strokes. Add subtle paper texture and sketch imperfections for a natural hand-drawn look. The composition should feel artistic and expressive, like a realistic graphite drawing on textured paper.",
        "video_url": "https://storage.googleapis.com/magicpoint/inputs/alibaba-wan-2-7-video-edit-input.mp4",
        "resolution": "1080P",
        "audio_setting": "auto"
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
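The request body above can also be assembled in a script. The following is a minimal sketch, assuming a POSIX shell. `build_payload` and `json_escape` are hypothetical helper names, not part of the each::labs API. The accepted resolutions (720P, 1080P) are taken from the summary on this page, and `audio_setting` is left at `auto` as in the example because other values are not documented here.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; not provided by each::labs.

# Escape backslashes and double quotes so a value is safe inside a JSON string.
# Newlines and control characters are not handled; keep prompts on one line.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload PROMPT VIDEO_URL [RESOLUTION]
# Prints the JSON body used by the curl example; RESOLUTION defaults to 1080P.
build_payload() {
  prompt=$1 video_url=$2 resolution=${3:-1080P}
  case "$resolution" in
    720P|1080P) ;;
    *) echo "unsupported resolution: $resolution" >&2; return 1 ;;
  esac
  printf '{"model":"alibaba-wan-2-7-video-edit","version":"0.0.1","input":{"prompt":"%s","video_url":"%s","resolution":"%s","audio_setting":"auto"},"webhook_url":""}' \
    "$(json_escape "$prompt")" "$(json_escape "$video_url")" "$resolution"
}

# Usage (sends a real request, so it is left commented out):
# curl -X POST \
#   -H "X-API-Key: $EACHLABS_API_KEY" \
#   -H "Content-Type: application/json" \
#   --data "$(build_payload 'Turn the scene into a pencil sketch.' "$VIDEO_URL" 720P)" \
#   https://api.eachlabs.ai/v1/prediction/
```

Rejecting an unknown resolution locally avoids spending a request on a value the page does not list.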
Documentation · 8 sections
  • Overview

    Alibaba | Wan 2.7 | Video Edit transforms existing videos through instruction-based edits, reference image guidance, or style transfer, so a clip can be modified precisely without regenerating it from scratch. Part of Alibaba's Wan 2.7 family, this video-to-video model uses temporal feature transfer to preserve the motion dynamics, camera work, and visual effects of the source video. It produces native 1080p output for 2-10 second inputs and accepts up to 5 simultaneous references, which enables multi-subject compositions. The model is available through the Alibaba | Wan 2.7 | Video Edit API on platforms like each::labs. It maintains audio synchronization and accepts real human references.

  • Capabilities
    • Instruction-based video editing via natural language prompts for object replacement, scene alteration, or enhancements.
    • Reference-based edits supporting up to 5 simultaneous image/video/audio inputs for multi-subject consistency.
    • Temporal feature transfer to preserve motion dynamics, camera movements, and effects from source videos.
    • Native 1080p output for 2-10s inputs, with audio preservation or regeneration.
    • Style transfer applying visual aesthetics from references while maintaining original timing.
    • Real human image/video references as first frames or subjects, ensuring natural appearance and motion.
    • Joint subject+voice control via mixed media inputs for synchronized edits.
  • Use cases

    Content Creators: Refine raw footage with instruction-based object edits, e.g., "remove the logo from the product demo video, keep hand movements natural." Temporal feature transfer keeps the untouched motion consistent with the source.

    Marketers: Apply style transfer to promo clips, e.g., "apply luxury gold tones from reference image to car ad video." Multi-reference support helps keep branding consistent across subjects.

    Video Designers: Edit social media reels with face swaps based on real human references, e.g., "replace presenter's face with actor image, sync to original speech audio." Output stays at 1080p for publishing.

    Developers: Integrate the Alibaba | Wan 2.7 | Video Edit API into applications to automate multi-subject scene edits with up to 5 reference inputs.

  • Tips & tricks

    Write prompts for Alibaba | Wan 2.7 | Video Edit that state both what should change and what should stay the same over time, like "replace the background with a sunset while keeping the subject's walking motion identical." Use multiple references deliberately: combine an image for subject appearance, a video for motion, and audio for voice sync. Enable first/last frame control for seamless transitions in edits. For style transfer, use high-quality reference sources to maintain 1080p fidelity.

    Example prompts:

    • "Edit the video to change the man's shirt to red, preserve original walking path and camera pan."
    • "Apply cyberpunk style to this cityscape video, transfer neon lighting effects temporally."
    • "Replace actor's face with reference image, sync lip movements to original audio."

    Workflow tip: Test with a single reference first, then scale up to 5 references for complex scenes on each::labs.
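The "change plus preserve" prompt pattern in the examples above can be templated so that every prompt names what must stay fixed. This is a minimal sketch in POSIX shell; `build_edit_prompt` is a hypothetical helper, and the wording it produces is only one possible phrasing, not a format the model requires.

```sh
#!/bin/sh
# Hypothetical helper for illustration; not part of any each::labs tooling.

# build_edit_prompt CHANGE [KEEP...]
# Joins the requested change with the aspects of the source video to preserve.
build_edit_prompt() {
  change=$1
  shift
  if [ "$#" -eq 0 ]; then
    # No preservation clause was given; emit the change on its own.
    printf '%s.' "$change"
    return 0
  fi
  keep=$1
  shift
  for item in "$@"; do
    keep="$keep and $item"
  done
  printf '%s, preserve %s.' "$change" "$keep"
}

# Example call (reproduces the first example prompt above):
# build_edit_prompt "Edit the video to change the man's shirt to red" \
#   "original walking path" "camera pan"
```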

  • Technical spec
    • Resolution Support: Native 1080p across all editing modes, with flexible aspect ratios.
    • Input Duration: 2-10 seconds for reference-to-video (R2V) editing; related generation modes support 2-15 seconds.
    • Input Formats: Video inputs with optional joint image, video, and audio references (up to 5 simultaneous for multi-subject control); text instructions for edits.
    • Output Formats: High-quality video with preserved or regenerated native audio; supports first/last frame control.
    • Processing Time: Serverless deployment; this page lists a p50 runtime of 5m, and exact times vary with edit complexity and the number of references.
    • Architecture: Built on Wan model family with temporal feature transfer for motion preservation and multi-reference consistency.
  • Things to be aware of

    Complex multi-reference setups (up to 5 inputs) may introduce minor inconsistencies in highly dynamic scenes. Edge cases such as rapid motion or low-light inputs can reduce the precision of temporal transfer. A common mistake is a vague prompt that omits temporal details, which can lead to altered motion, so always state what should be preserved. Resource needs scale with the number of references; check API quotas on each::labs before running large batches. Audio sync works best with a clear source voice; noisy inputs may require audio regeneration. Keep inputs at or under 10s to avoid quality drops.

  • Key considerations

    Before using Alibaba | Wan 2.7 | Video Edit, make sure input videos are 2-10 seconds long. The model needs clear text instructions or reference media for best results, and up to 5 references can improve multi-subject accuracy. It is best suited to targeted edits such as style transfers or object modifications rather than full recreations, and it is strongest where motion fidelity matters. On each::labs, the Alibaba | Wan 2.7 | Video Edit API supports scalable video-to-video tasks. Consider cost tradeoffs: short clips are inexpensive (this page estimates from $0.1), but cost may increase with multiple references. Local deployment is not available yet; access is through the cloud API.
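The 2-10 second input window can be checked locally before a request is sent. The following is a minimal sketch in POSIX shell. `duration_ok` and `video_duration` are hypothetical helper names, and reading the duration assumes `ffprobe` (part of FFmpeg) is installed. Whether the service rejects out-of-range inputs or only degrades quality is not stated on this page.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; the 2-10 s window comes from this page.

# duration_ok SECONDS
# Succeeds (exit 0) when SECONDS is within 2-10 inclusive, fails otherwise.
# Non-numeric input is treated as 0 by awk and therefore fails.
duration_ok() {
  awk -v d="$1" 'BEGIN { exit !(d + 0 >= 2 && d + 0 <= 10) }'
}

# video_duration FILE
# Prints the container duration in seconds; requires ffprobe on PATH.
video_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Example (needs a local file and ffprobe, so it is left commented out):
# if duration_ok "$(video_duration input.mp4)"; then
#   echo "duration is within 2-10 s"
# else
#   echo "trim the clip before uploading" >&2
# fi
```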

  • Limitations

    Alibaba | Wan 2.7 | Video Edit accepts only 2-10s inputs for reference editing, so it is unsuitable for longer formats. Multi-subject handling is strong up to 5 references but may falter in overcrowded compositions. 4K output is not confirmed; the maximum is 1080p. Open weights are pending, so access is currently cloud-only. The model can fail on extreme deformations or on non-human subjects without strong references.
