PixVerse Swap

Video · PixVerse Features · by Pixverse

PixVerse Swap replaces a subject or object in an existing video with the one shown in a reference image. Provide a video and the new image, and Swap automatically targets the primary detected subject (face, body, or object). v1 caveat: the first detected subject (mask_info[0]) is always auto-picked. Resolution goes up to 720p, and the source video codec must be H.264 or H.265.

Runtime (p50): 3m
Estimated price: $0.005 / credit
Call the API
prediction.sh
curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "pixverse-swap",
    "version": "0.0.1",
    "input": {
        "quality": "720p",
        "video_url": "https://cdn-us.eachlabs.ai/uploads/525a3fd9-75ac-4c92-a35d-9f6151e64519.mp4",
        "keyframe_id": 1,
        "replacement_image_url": "https://cdn-us.eachlabs.ai/uploads/ecb32c29-a1f3-4de8-8a6e-a2e82554f87b.png"
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
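
A minimal sketch of wrapping the request above in reusable shell functions, assuming a POSIX shell and curl. The names build_swap_payload and submit_swap are hypothetical, the defaults simply mirror the example values, and no JSON escaping is performed, so the URLs must not contain double quotes or backslashes.

```sh
# Hypothetical helper: build the request body shown above from its inputs.
# Assumption: values contain no double quotes or backslashes (no JSON escaping).
build_swap_payload() {
  video_url=$1
  image_url=$2
  quality=${3:-720p}    # "720p" is the only value shown in the example
  keyframe_id=${4:-1}   # 1 mirrors the example request
  printf '{"model":"pixverse-swap","version":"0.0.1","input":{"quality":"%s","video_url":"%s","keyframe_id":%s,"replacement_image_url":"%s"},"webhook_url":""}\n' \
    "$quality" "$video_url" "$keyframe_id" "$image_url"
}

# Hypothetical wrapper: POST the payload with the same headers and endpoint
# as the example; curl reads the body from stdin via "--data @-".
submit_swap() {
  build_swap_payload "$@" | curl -sS -X POST \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -H "Content-Type: application/json" \
    --data @- \
    https://api.eachlabs.ai/v1/prediction/
}
```

Calling submit_swap "$VIDEO_URL" "$IMAGE_URL" would post with quality 720p and keyframe_id 1. The response format is not described on this page, so inspect it before scripting against it.
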
Documentation (8 sections)
  • Overview

    PixVerse | Swap | Object Swap in Video is a specialized video-to-video AI model from Pixverse that replaces subjects or objects in existing videos using a reference image. Users upload a source video and a new image, and the model automatically detects and swaps the primary subject, such as a face, body, or object, while preserving motion and scene coherence. In v1 it auto-targets the first detected subject (mask_info[0]), so no manual masking is needed.

    Developed by Pixverse, known for advanced models like V6 and C1 with cinematic transitions and native audio, PixVerse | Swap | Object Swap in Video extends their video-to-video capabilities to practical object manipulation. Available via the PixVerse | Swap | Object Swap in Video API on platforms like each::labs, it supports resolutions up to 720p and requires an H.264 or H.265 source codec, which suits quick video repurposing in content creation workflows.

  • Capabilities

    • Automatically detects and swaps primary subjects (faces, bodies, objects) using reference images
    • Preserves original video motion, lighting, and physics for seamless integration
    • Supports up to 720p resolution with H.264/H.265 input compatibility
    • Targets the first detected mask (mask_info[0]), so no manual intervention is needed
    • Handles diverse objects from people to props in dynamic scenes
    • Integrates with Pixverse video-to-video pipeline for extended creative control
    • Accessible via PixVerse | Swap | Object Swap in Video API for developer workflows
  • Use cases

    Content Creators: Replace actors in footage for personalized videos. Example: Upload a walking clip and a celebrity photo to swap the face while keeping the original gait. Suited to fan edits or demos.

    Marketers: Swap products in promotional videos. Example: Replace a soda can in a commercial with an energy drink while matching the pour motion, relying on object detection for brand swaps.

    Designers: Prototype visuals by swapping elements in mockups. Example: Swap a chair for a modern sofa in a room tour, relying on motion preservation for client previews.

    Developers: Build apps with the PixVerse | Swap | Object Swap in Video API on each::labs. Example: Integrate avatar swaps for recorded video clips, auto-targeting faces for consistency; with a p50 runtime of about 3 minutes, live video calls are out of reach.

  • Tips & tricks

    For best results with PixVerse | Swap | Object Swap in Video, use high-contrast reference images with clear subject outlines to improve detection accuracy. Pre-crop your source video so the target subject is prominent, because v1 always picks the first detected subject (mask_info[0]). The example request exposes no prompt field, so a goal such as "replace the car with a motorcycle, maintain speed and lighting" has to be expressed through the choice of clip and reference image.

    Optimize workflows by testing at a lower resolution first, then moving up to 720p. Combine with Pixverse video-to-video enhancements for refined physics. Example swap descriptions:

    • "Swap the person's face with reference image, keep walking motion natural."
    • "Replace the bottle on table with a vase, match camera pan."
    • "Object swap dog with cat in park scene, preserve fur dynamics."

    These techniques help improve swap precision on each::labs.

  • Technical spec

    • Resolution Support: Up to 720p for output videos
    • Input Formats: Source video in H.264 or H.265 codec; reference image for swap target
    • Output Format: Standard video file compatible with common editors
    • Subject Detection: Automatic targeting of the first detected subject (mask_info[0] in v1); supports faces, bodies, or objects
    • Processing Time: p50 runtime of about 3 minutes; varies with clip length and complexity
    • Aspect Ratios: Matches source video; flexible for various formats like 16:9 or 9:16
    • Max Duration: No hard limit is stated; clips under 15 seconds are recommended for best quality

    These specs draw from Pixverse's video-to-video lineage, emphasizing compatibility and speed for practical use on each::labs.

  • Things to be aware of

    PixVerse | Swap | Object Swap in Video may struggle in crowded scenes where the intended subject is not the first one detected (mask_info[0]), leading to incorrect swaps; pre-edit videos to focus on the target. Complex motions like fast rotations can cause minor artifacts in swapped elements. Users often overlook the codec requirement, causing upload failures; always verify that the source is H.264 or H.265.
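
    One way to verify the codec before uploading is sketched below, assuming ffprobe (part of FFmpeg) is installed. The function names are hypothetical; ffprobe reports H.264 as h264 and H.265 as hevc.

```sh
# Hypothetical helper: succeed only for the codec names ffprobe uses for
# H.264 ("h264") and H.265 ("hevc").
is_supported_codec() {
  case $1 in
    h264|hevc) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: read the codec of the first video stream with ffprobe,
# then check it. Returns 2 if ffprobe itself fails (missing or unreadable file).
check_video_codec() {
  codec=$(ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1") || return 2
  is_supported_codec "$codec"
}
```

    Running check_video_codec on a local file before hosting it gives a quick pass or fail without spending credits on a rejected upload.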

    Low-light videos and heavily occluded objects are edge cases that reduce detection accuracy. Resource needs are modest, but longer clips increase processing time on each::labs.

  • Key considerations

    Before using PixVerse | Swap | Object Swap in Video, ensure your source video uses H.264 or H.265 codecs to avoid compatibility issues. The model auto-selects the first detected subject, so complex scenes with multiple objects may require pre-editing the video to isolate the target. It's best for scenarios needing quick subject replacement over full scene generation, offering faster results than text-to-video alternatives.

    Processing time grows with video length and resolution, so stick to shorter clips under 15 seconds for optimal quality. On each::labs, use the PixVerse | Swap | Object Swap in Video API for batch processing, balancing cost against swap fidelity in marketing or personal projects.
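
    The 15-second guideline above could be checked before submission with a sketch like this, assuming ffprobe and awk are available. The function names are hypothetical, and the 15-second default is the recommendation from this section, not an enforced API limit.

```sh
# Hypothetical helper: succeed when a duration in seconds is strictly below
# a limit (default 15). Returns 2 when no duration is given.
is_short_clip() {
  [ -n "$1" ] || return 2
  awk -v d="$1" -v max="${2:-15}" 'BEGIN { exit !(d + 0 < max + 0) }'
}

# Hypothetical helper: print the container duration in seconds via ffprobe.
clip_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}
```

    For a local file, is_short_clip "$(clip_duration clip.mp4)" would succeed only for clips shorter than 15 seconds; a second argument overrides the limit.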

  • Limitations

    PixVerse | Swap | Object Swap in Video is capped at 720p and auto-picks the first detected subject (mask_info[0]) in v1, limiting flexibility in multi-object scenes. It cannot handle inputs that are not H.264 or H.265, and it cannot generate motion beyond what the source video contains. Quality drops at extreme angles or with poor reference images, and no native audio swap is supported.
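
    A source that is not H.264 or H.265 could be re-encoded before upload, as in this sketch. It assumes an ffmpeg build that includes the libx264 encoder; the function name and the FFMPEG override variable are hypothetical, and no scaling to 720p is applied.

```sh
# Hypothetical helper: re-encode any input to H.264 video with AAC audio.
# Setting FFMPEG (for example FFMPEG=echo) swaps the binary, which is handy
# for previewing the arguments without running an encode.
reencode_h264() {
  "${FFMPEG:-ffmpeg}" -i "$1" -c:v libx264 -pix_fmt yuv420p -c:a aac "$2"
}
```

    For example, reencode_h264 input.mov output.mp4 would write an H.264 file that meets the codec requirement; ffmpeg asks before overwriting an existing output.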
