ByteDance Seedance 2.0 Mini · Reference to Video

Video · seedance-2.0 · by ByteDance

Seedance 2.0 Mini Reference-to-Video creates clips from reference images, keeping characters and scenes consistent across shots. AI video for branded content.

Runtime (p50): 4m
Estimated price: From $0.06
Call the API
prediction.sh
curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "bytedance-seedance-2-0-mini-reference-to-video",
    "version": "0.0.1",
    "input": {
        "prompt": "Ultra-realistic cinematic product video of a pink and white chunky sole sneaker pair. Fast dynamic cuts: front view dead-on, then quick spin to side profile showing sole detail, sharp cut to close-up of mesh texture and laces, fast low-angle shot from ground level looking up at the chunky sole, quick overhead flat lay with rose petals scattered around, dramatic 3/4 angle shot with soft studio light raking across the surface highlighting the leather and mesh texture, final slow push-in to the toe box. Clean white background throughout. Soft pink rim lighting. Each shot holds for 0.5 seconds. Sharp fast transitions, no fades. Cinematic 4K product commercial style.",
        "image_urls": [
            "https://storage.googleapis.com/magicpoint/inputs/bytedance-seedance-2-0-reference-to-video-fast-input.png"
        ],
        "resolution": "720p",
        "duration": "6",
        "generate_audio": true,
        "aspect_ratio": "16:9"
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
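
The request above can also be kept in a file and sent through a small wrapper. What follows is a minimal sketch, not an official each::labs client: the function name `submit_prediction`, the `DRY_RUN` switch, and the `payload.json` file name are hypothetical and introduced only for illustration. The endpoint, headers, and `EACHLABS_API_KEY` variable are taken from the example above.

```shell
# Sketch: submit a JSON request body stored in a file.
# submit_prediction and DRY_RUN are hypothetical names, not each::labs features.
submit_prediction() {
  payload_file="$1"

  # Fail early if the API key is missing from the environment.
  if [ -z "${EACHLABS_API_KEY:-}" ]; then
    echo "EACHLABS_API_KEY is not set" >&2
    return 1
  fi

  # Fail early if the request body file does not exist.
  if [ ! -f "$payload_file" ]; then
    echo "payload file not found: $payload_file" >&2
    return 2
  fi

  # With DRY_RUN=1, report the target without using the network or the key.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "POST https://api.eachlabs.ai/v1/prediction/ (body: $payload_file)"
    return 0
  fi

  # Same headers and endpoint as the inline example; the body is read from the file.
  curl -sS -X POST \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -H "Content-Type: application/json" \
    --data-binary @"$payload_file" \
    https://api.eachlabs.ai/v1/prediction/
}
```

With the JSON body saved as `payload.json`, running `DRY_RUN=1` first and then `submit_prediction payload.json` checks the setup before a real request is made. Keeping the body in a file also avoids shell-quoting problems with long prompts.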
Documentation (8 sections)
  • Overview

    ByteDance | Seedance 2.0 | Mini | Reference to Video Overview

    ByteDance | Seedance 2.0 | Mini | Reference to Video is a compact video editing and generation model that applies prompt-guided edits to an existing source video, optionally using additional visual or audio references to steer the result. Built on ByteDance’s Seedance 2.0 family, it focuses on efficient reference-to-video workflows rather than full-size, long-form generation, which suits creators and developers who value fast, controllable edits over raw generative output. On each::labs, you define the edit prompt, upload a source clip, and optionally add a reference image, video, or audio snippet, along with basic controls such as target resolution, clip duration, and audio handling. Its primary differentiator is the use of multiple reference modalities to drive consistent, high-fidelity edits while staying lightweight enough for iterative experimentation.

  • Capabilities

    Capabilities

    • Prompt-guided video editing: Apply semantic edits to an existing video using natural language instructions while preserving core motion and scene structure.
    • Visual reference conditioning: Use reference images or videos to drive style transfer, color grading, and design elements in the edited output.
    • Audio-aware workflows: Optionally keep, mute, or swap the original audio track, allowing music or dialogue to be retained while visuals change.
    • Short-form optimization: Tailored for brief clips, making it suitable for social posts, teasers, and rapid iterations rather than long-form productions.
    • Consistent character and motion preservation: Designed to maintain the core subject and motion trajectory while modifying appearance or environment.
    • Flexible aspect handling: Works with common landscape and portrait formats, fitting vertical video needs for mobile-first platforms.
    • Developer-friendly integration: Through the ByteDance | Seedance 2.0 | Mini | Reference to Video API on each::labs, developers can script batch edits and automated pipelines.
  • Use cases

    Use Cases for ByteDance | Seedance 2.0 | Mini | Reference to Video

    ByteDance | Seedance 2.0 | Mini | Reference to Video is valuable for creators who want to re-style existing footage without complex manual compositing. A video creator can take a simple studio dance clip and, using a fantasy concept art reference, transform it into a magical forest scene while keeping the choreography intact, with a prompt like: “keep the dancer’s motion, replace the stage with a glowing enchanted forest.” Marketers can rapidly localize campaign footage by changing backgrounds and color schemes to match different regions or brands, for example: “maintain the actor and product, change the room to a minimalist white studio matching this reference image.” Designers can prototype motion mockups by re-skinning placeholder footage into a specific art direction guided by a moodboard video. Developers can call the ByteDance | Seedance 2.0 | Mini | Reference to Video API from back-end services to auto-generate variant edits of user-uploaded clips for personalization.

  • Tips & tricks

    Tips and Tricks

    To get the most out of ByteDance | Seedance 2.0 | Mini | Reference to Video, write prompts that explicitly separate what must stay the same from what should change. Mention the camera angle, subject, and motion you want to preserve, then describe the visual style, lighting, or costume to alter. When using a reference clip or image, ensure it clearly exhibits the target color palette, texture, or vibe; subtle references can lead to weak transfer. Keep durations short for early iterations and increase length only after you are satisfied with the style. For the ByteDance | Seedance 2.0 | Mini | Reference to Video API, pass explicit resolution and duration parameters rather than relying on defaults so your pipeline is predictable. Helpful prompt patterns include phrases like “keep the original motion” or “only change the background.”

    Example prompts:

    • "Keep the person and their movement, but turn the scene into a neon cyberpunk city at night, inspired by this reference video."
    • "Preserve the dance choreography, re-style the dancer’s outfit as fantasy knight armor, using this image as the style reference."
    • "Maintain the original camera motion, replace the background with a sunset beach, and match colors to this reference clip."
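
The two habits recommended above, stating what to keep before what to change and always passing resolution and duration explicitly, can be sketched as shell helpers. This is an illustrative sketch only: `build_prompt`, `json_escape`, and `make_payload` are hypothetical helper names, the escaping handles only single-line text, and the field names and model identifier are copied from the request example on this page rather than from a full schema.

```shell
# Compose a prompt that names what to preserve before what to change.
build_prompt() {
  keep="$1"
  change="$2"
  printf 'Keep %s. Only change %s.' "$keep" "$change"
}

# Escape backslashes and double quotes so text can sit inside a JSON string.
# Assumes single-line input; newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Print a request body in which resolution and duration are always explicit.
# Refuses to run unless all four arguments are supplied.
make_payload() {
  if [ "$#" -ne 4 ]; then
    echo "usage: make_payload PROMPT IMAGE_URL RESOLUTION DURATION" >&2
    return 1
  fi
  prompt=$(json_escape "$1")
  cat <<EOF
{
  "model": "bytedance-seedance-2-0-mini-reference-to-video",
  "version": "0.0.1",
  "input": {
    "prompt": "$prompt",
    "image_urls": ["$2"],
    "resolution": "$3",
    "duration": "$4"
  }
}
EOF
}
```

For example, `make_payload "$(build_prompt "the original camera motion" "the background to a sunset beach")" "$IMAGE_URL" 720p 6` prints a body that can be saved to a file and sent with curl. Requiring every argument means a pipeline never falls back on defaults without anyone noticing.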
  • Technical spec

    Technical Specifications

    While ByteDance does not publicly document all internal details of Seedance 2.0 Mini, typical usage on each::labs follows these practical specifications:

    • Input type: Source video (short clip), plus a required text prompt; optional reference image, video, or audio track.
    • Output type: Edited video clip with visual changes guided by the prompt and references; optional preserved, replaced, or muted audio.
    • Resolution: Configurable within a “Mini” range (commonly up to HD resolutions such as 720p); higher resolutions may increase latency and are often downsampled internally.
    • Duration: Optimized for short-form edits; longer videos are typically truncated or processed as segments.
    • Aspect ratios: Supports standard landscape and portrait ratios; nonstandard ratios may be padded or cropped.
    • Formats: Standard web video formats for input and output (for example, MP4 containers with common codecs), as supported by each::labs.
    • Processing time: Designed for interactive editing; latency scales with resolution, duration, and number of reference inputs.
    • Architecture: Based on ByteDance’s Seedance 2.0 diffusion-style video generation/editing stack, tuned for reference-conditioned editing in a smaller “Mini” configuration.
  • Things to be aware of

    Things to Be Aware Of

    Complex scenes with many small moving objects can challenge the model, leading to artifacts or partial edits. If the prompt is vague about what should remain unchanged, ByteDance | Seedance 2.0 | Mini | Reference to Video may over-edit and alter subjects you intended to preserve. Very long videos may be truncated, processed in chunks, or downsampled, affecting temporal consistency. Nonstandard aspect ratios can introduce padding or cropping, so frame important content centrally. Using reference videos or images that conflict with the source motion (for example, very different camera angles) can reduce edit quality. When integrating via the ByteDance reference-to-video API on each::labs, monitor latency as you increase resolution, duration, and number of references.
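
One lightweight way to monitor latency is to wrap the blocking step of a pipeline in a timer. The helper below is a generic sketch: `timed` is a hypothetical name, and it measures whole seconds with `date +%s`. This page does not state whether the prediction request itself waits for the finished video, so wrap whichever command in your workflow actually blocks until the output is ready.

```shell
# Run a command, pass its stdout through untouched, and report
# wall-clock seconds plus the exit status on stderr.
timed() {
  label="$1"
  shift
  start=$(date +%s)
  status=0
  "$@" || status=$?
  end=$(date +%s)
  echo "$label took $((end - start))s (exit $status)" >&2
  return "$status"
}

# Example with a stand-in command; replace `sleep 1` with the step to measure.
timed "720p, 6s, 1 reference" sleep 1
```

Logging one line per run with the resolution, duration, and reference count in the label makes it easy to compare settings later.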

  • Key considerations

    Key Considerations

    ByteDance | Seedance 2.0 | Mini | Reference to Video is best used when you already have a base clip and want to change style, mood, or specific visual elements without reshooting. You should prepare a clearly visible source video and concise text instructions to help the model understand which aspects to modify and which to preserve. Shorter clips with stable framing generally yield more consistent edits than long or highly chaotic footage. The Mini variant prioritizes responsiveness over ultra-high resolution, making it ideal for social content, rapid A/B testing, and prototyping. For workflows demanding long-duration or 4K master outputs, it is usually better to treat this model as a sketch and then upscale or post-process downstream.

  • Limitations

    Limitations

    ByteDance | Seedance 2.0 | Mini | Reference to Video is not intended for frame-perfect VFX replacement or production-grade compositing; fine details may flicker between frames. Ultra-high-resolution or very long clips may require external upscaling or segmentation to achieve professional delivery specs. The model cannot infer complex story changes that contradict the underlying motion, so radical scene rewrites may look inconsistent. Highly text-heavy scenes, detailed logos, or UI elements may blur or distort during editing. As with most ByteDance reference-to-video tools, results depend heavily on the clarity of the source footage, the strength of the references, and precise prompting.

FAQ

About ByteDance Seedance 2.0 Mini · Reference to Video

What is Seedance 2.0 Mini Reference-to-Video?

Seedance 2.0 Mini Reference-to-Video is a model from ByteDance that builds a video from reference material rather than a single image. You can supply several reference images, plus reference video and audio, to guide the style, subject, and motion, and the model weaves them into one coherent clip.