Eachlabs | AI Workflows for app builders
pixverse-swap

PIXVERSE FEATURES

PixVerse Swap replaces a subject or object in an existing video with the one shown in a reference image. Provide a video and the new image, and Swap automatically targets the primary detected subject (face, body, or object). v1 caveat: the first detected subject (mask_info[0]) is auto-picked. Output is up to 720p; the source video codec must be H.264 or H.265.

Official Partner

Avg Run Time: 160 s

Model Slug: pixverse-swap

PixVerse Swap. Per-second pricing: 360p 9 cred/s, 540p 9, 720p 12. Mask selection adds ~2 credits. $1 = 200 credits.

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
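A minimal sketch of building such a request in Python with only the standard library. The endpoint URL, the `X-API-Key` header name, and the input field names (`video_url`, `image_url`) are placeholders assumed for illustration; none of them appear on this page, so check the API reference for the real values. The request is built but not sent.

```python
import json
import urllib.request

# Assumed placeholders: none of these values are confirmed by this page.
API_URL = "https://api.example.com/v1/prediction/"  # hypothetical endpoint
API_KEY_HEADER = "X-API-Key"                         # hypothetical header name


def build_create_request(api_key: str, video_url: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a prediction."""
    payload = {
        "model": "pixverse-swap",  # model slug listed on this page
        "input": {
            "video_url": video_url,  # hypothetical field name
            "image_url": image_url,  # hypothetical field name
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_create_request(
        "YOUR_API_KEY",
        "https://example.com/clip.mp4",
        "https://example.com/reference.png",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending it would be: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent.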

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
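The polling loop can be sketched as follows. This is an illustration under assumptions: the response is taken to be a JSON object with a `status` field whose values include `"success"` and `"error"`, which this page does not specify. The HTTP call itself is passed in as `fetch_status`, so the loop makes no claim about the actual endpoint.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch_status: Callable[[str], Dict[str, Any]],
    prediction_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the prediction succeeds, fails, or the timeout elapses.

    fetch_status is any callable that performs the GET request for the given
    prediction ID and returns the decoded JSON body. The "status" key and the
    values "success" / "error" / "failed" are assumed, not documented here.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status(prediction_id)
        status = result.get("status")
        if status == "success":
            return result
        if status in ("error", "failed"):
            raise RuntimeError(f"prediction {prediction_id} failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {prediction_id} not ready after {timeout_s} s")
        sleep(interval_s)
```

With an average run time of 160 s listed for this model, a timeout of several minutes and an interval of a few seconds are reasonable starting points.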

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

PixVerse | Swap | Object Swap in Video is a video-to-video AI model from PixVerse that replaces a subject or object in an existing video using a reference image. Users upload a source video and a new image, and the model automatically detects and swaps the primary subject (a face, body, or object) while preserving motion and scene coherence. In v1 it auto-targets the first detected subject (mask_info[0]), so no manual masking is needed.

Developed by PixVerse, known for models such as V6 and C1 with cinematic transitions and native audio, PixVerse | Swap | Object Swap in Video extends their video-to-video capabilities to object manipulation. Available via the PixVerse | Swap | Object Swap in Video API on platforms like each::labs, it supports resolutions up to 720p and requires an H.264 or H.265 source codec, which suits quick video repurposing in content creation workflows.

Technical Specifications

  • Resolution Support: Up to 720p for output videos
  • Input Formats: Source video in H.264 or H.265 codec; reference image for swap target
  • Output Format: Standard video file compatible with common editors
  • Subject Detection: Automatic targeting of the first detected subject (mask_info[0] in v1); supports faces, bodies, or objects
  • Processing Time: Average run time is listed as 160 s; actual time depends on clip length and complexity
  • Aspect Ratios: Matches source video; flexible for various formats like 16:9 or 9:16
  • Max Duration: No hard limit is stated; clips under 15 seconds are recommended for the best motion coherence

These specs draw from PixVerse's video-to-video lineage, emphasizing compatibility and speed for practical use on each::labs.

Key Considerations

Before using PixVerse | Swap | Object Swap in Video, ensure your source video uses H.264 or H.265 codecs to avoid compatibility issues. The model auto-selects the first detected subject, so complex scenes with multiple objects may require pre-editing the video to isolate the target. It's best for scenarios needing quick subject replacement over full scene generation, offering faster results than text-to-video alternatives.
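One way to check the codec before uploading is to inspect the file with `ffprobe` (part of FFmpeg, which must be installed separately). The sketch below is not part of the PixVerse or each::labs tooling; it relies only on the fact that ffprobe reports H.264 as `h264` and H.265 as `hevc`. The function names are illustrative.

```python
import json
import subprocess

# ffprobe's codec_name values for H.264 and H.265 respectively.
SUPPORTED_CODECS = {"h264", "hevc"}


def probe(path: str) -> str:
    """Run ffprobe on the first video stream and return its JSON output."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=codec_name,width,height",
        "-of", "json",
        path,
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_video_stream(ffprobe_json: str) -> dict:
    """Return the first video stream entry from ffprobe JSON output."""
    streams = json.loads(ffprobe_json).get("streams", [])
    if not streams:
        raise ValueError("no video stream found")
    return streams[0]


def is_supported_codec(ffprobe_json: str) -> bool:
    """True if the first video stream is H.264 or H.265."""
    return parse_video_stream(ffprobe_json).get("codec_name") in SUPPORTED_CODECS


if __name__ == "__main__":
    import sys
    info = probe(sys.argv[1])
    print(parse_video_stream(info), "supported:", is_supported_codec(info))
```

Parsing is kept separate from the subprocess call so the check can be exercised without a video file on hand.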

Processing time and cost scale with video length and resolution, so prefer clips under 15 seconds for the best quality. On each::labs, use the PixVerse | Swap | Object Swap in Video API for batch processing, balancing cost against swap fidelity in marketing or personal projects.

Tips & Tricks

For best results with PixVerse | Swap | Object Swap in Video, use high-contrast reference images with clear subject outlines to improve detection accuracy. Pre-crop your source video so the target subject is the most prominent one, since v1 swaps the first detected subject (mask_info[0]). If the endpoint accepts a text prompt (the model description mentions only a video and an image as inputs), a descriptive one such as "replace the car with a motorcycle, maintain speed and lighting" may help guide motion consistency.

Optimize workflows by testing at a lower resolution first, then rerunning at the target resolution once the swap looks right. Combine with PixVerse video-to-video enhancements for refined physics. Example prompts:

  • "Swap the person's face with reference image, keep walking motion natural."
  • "Replace the bottle on table with a vase, match camera pan."
  • "Object swap dog with cat in park scene, preserve fur dynamics."

These techniques, informed by PixVerse model behaviors, help improve swap precision on each::labs.

Capabilities

  • Automatically detects and swaps primary subjects (faces, bodies, objects) using reference images
  • Preserves original video motion, lighting, and physics for seamless integration
  • Supports up to 720p resolution with H.264/H.265 input compatibility
  • Targets the first detected mask (mask_info[0]) for editing without manual intervention
  • Handles diverse objects from people to props in dynamic scenes
  • Integrates with the PixVerse video-to-video pipeline for extended creative control
  • Accessible via PixVerse | Swap | Object Swap in Video API for developer workflows

What Can I Use It For?

Content Creators: Replace actors in footage for personalized videos. Example: Upload a walking clip and celebrity photo; prompt "swap face with reference, keep gait." Ideal for fan edits or demos.

Marketers: Swap products in promotional videos. Example: Change a bottle in a commercial with "replace soda can with energy drink, match pour motion," leveraging object detection for brand swaps.

Designers: Prototype visuals by swapping elements in mockups. Example: "Object swap chair with modern sofa in room tour," using motion preservation for client previews.

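Developers: Build apps with the PixVerse | Swap | Object Swap in Video API on each::labs. Example: Integrate asynchronous avatar swaps for recorded video clips, auto-targeting faces for consistency. With an average run time of about 160 s, the model is not suited to real-time use such as live video calls.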

Things to Be Aware Of

PixVerse | Swap | Object Swap in Video may struggle in crowded scenes where the intended subject is not the first detected one (mask_info[0]), leading to incorrect swaps; pre-edit videos to focus on the target. Complex motions like fast rotations can cause minor artifacts in swapped elements. Users often overlook the codec requirement, causing upload failures; always verify that the source is H.264 or H.265.

Edge cases include low-light videos or heavily occluded objects, reducing detection accuracy. Resource needs are modest, but longer clips increase processing time on each::labs.

Limitations

PixVerse | Swap | Object Swap in Video is capped at 720p and auto-picks the first detected subject in v1, limiting flexibility in multi-object scenes. It cannot handle non-H.264/H.265 inputs or generate new motions beyond source preservation. Quality drops in extreme angles or poor reference images, and no native audio swap is supported.
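For sources in another codec, one workaround is to transcode to H.264 locally before uploading. The sketch below builds an FFmpeg command line for that; it assumes an FFmpeg build with the `libx264` encoder is installed, and it is a general-purpose preparation step rather than anything prescribed by PixVerse. The optional trim reflects the earlier advice to keep clips short.

```python
import subprocess
from typing import List, Optional


def build_transcode_command(
    src: str,
    dst: str,
    max_seconds: Optional[float] = None,
) -> List[str]:
    """Build an ffmpeg command that re-encodes src to H.264 video and AAC audio.

    If max_seconds is given, the output is cut to that duration.
    """
    cmd = ["ffmpeg", "-y", "-i", src]
    if max_seconds is not None:
        cmd += ["-t", str(max_seconds)]  # limit output duration
    cmd += [
        "-c:v", "libx264",      # H.264 video encoder
        "-pix_fmt", "yuv420p",  # widely compatible pixel format
        "-c:a", "aac",          # re-encode audio to AAC
        dst,
    ]
    return cmd


if __name__ == "__main__":
    command = build_transcode_command("input.webm", "output.mp4", max_seconds=15)
    print(" ".join(command))
    # To actually run it: subprocess.run(command, check=True)
```

Returning the argument list rather than running it directly lets the command be logged or reviewed before a long re-encode starts.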

---

Pricing

Pricing Type: Dynamic

PixVerse Swap is billed per second of video:
  • 360p: 9 credits/s
  • 540p: 9 credits/s
  • 720p: 12 credits/s
Mask selection adds about 2 credits. $1 = 200 credits.
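The pricing above can be turned into a quick estimate. This sketch assumes that billing is linear in clip duration and that the mask-selection charge is a flat 2 credits per run; the page states "~2 credits" without saying how it is applied, so treat the result as approximate.

```python
from typing import Tuple

CREDITS_PER_SECOND = {"360p": 9, "540p": 9, "720p": 12}  # rates from this page
MASK_SELECTION_CREDITS = 2  # "~2 credits"; flat per-run charge is an assumption
CREDITS_PER_DOLLAR = 200    # $1 = 200 credits


def estimate_cost(
    duration_s: float,
    resolution: str,
    mask_selection: bool = False,
) -> Tuple[float, float]:
    """Return (credits, dollars) for one run at the given duration and resolution."""
    if resolution not in CREDITS_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    credits = CREDITS_PER_SECOND[resolution] * duration_s
    if mask_selection:
        credits += MASK_SELECTION_CREDITS
    return credits, credits / CREDITS_PER_DOLLAR


if __name__ == "__main__":
    credits, dollars = estimate_cost(10, "720p")
    print(f"{credits} credits, ${dollars:.2f}")
```

Under these assumptions, a 10-second clip at 720p comes to 120 credits, or $0.60. Since 360p and 540p share the same rate, there is no cost reason to test at 360p rather than 540p.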
