VEED-FABRIC
Veed Fabric-1.0 is an image-to-video model that generates talking videos from a single face image and an audio input. The model synchronizes the mouth and facial movements with the provided speech, producing short lip-synced clips ideal for social media, quick presentations, and prototyping.
Avg Run Time: 170s
Model Slug: veed-fabric-1-0
Playground
Input
- Image: enter a URL or upload a file (max 50 MB)
- Audio: mp3, ogg, wav, m4a, aac (max 50 MB)
Output
Preview and download your result (mp4).
API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
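For example, here is a minimal sketch in Python using `requests`. The endpoint URL, header, and input field names (`image_url`, `audio_url`, `resolution`) are illustrative assumptions, not confirmed by this page; check the provider's API reference for the exact schema.

```python
import requests

API_KEY = "YOUR_API_KEY"
# Hypothetical endpoint; substitute the real prediction URL from the API reference.
CREATE_URL = "https://api.example.com/v1/predictions/veed-fabric-1-0"

def create_prediction(image_url: str, audio_url: str, resolution: str = "480p") -> str:
    """Submit a generation job and return its prediction ID (field names are assumed)."""
    resp = requests.post(
        CREATE_URL,
        json={"image_url": image_url, "audio_url": audio_url, "resolution": resolution},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["id"]  # save this ID to poll for the result

prediction_id = create_prediction(
    "https://example.com/face.png",
    "https://example.com/speech.mp3",
)
print("Prediction created:", prediction_id)
```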
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
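A matching polling sketch, under the same assumptions as above (the result path and response fields such as `status` and `output` are illustrative):

```python
import time
import requests

API_KEY = "YOUR_API_KEY"
# Hypothetical result endpoint; substitute the real one from the API reference.
RESULT_URL = "https://api.example.com/v1/predictions/{prediction_id}"

def wait_for_result(prediction_id: str, poll_interval: float = 5.0, timeout: float = 600.0) -> dict:
    """Poll until the prediction succeeds or fails, or until the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = requests.get(
            RESULT_URL.format(prediction_id=prediction_id),
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=30,
        )
        resp.raise_for_status()
        result = resp.json()
        status = result.get("status")  # assumed field name
        if status == "succeeded":
            return result  # e.g. result["output"] would hold the mp4 URL
        if status == "failed":
            raise RuntimeError(f"Prediction failed: {result}")
        time.sleep(poll_interval)
    raise TimeoutError("Prediction did not finish in time")
```

Given the ~170s average run time, a poll interval of a few seconds with a generous timeout is a sensible starting point.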
Readme
Overview
Veed Fabric-1.0 is an advanced image-to-video AI model developed by VEED, designed to generate talking videos from a single face image and an audio input. The model animates the mouth, facial features, and even body and head movements to synchronize with the provided speech, producing realistic, lip-synced video clips. Its primary use cases include creating short, shareable videos for social media, rapid prototyping, and automating video content production.
The core technology behind Fabric-1.0 is a state-of-the-art Diffusion Transformer (DiT) architecture. This model conditions on both the initial image (for style and identity) and the audio (for driving the animation sequence), enabling it to animate a wide range of input images, including photos, illustrations, mascots, and stylized characters. Unlike many talking head generators that limit users to preset avatars, Fabric-1.0 offers flexibility to animate any image while preserving its original style, making it suitable for both individual creators and enterprise automation.
Technical Specifications
- Architecture: Diffusion Transformer (DiT)
- Parameters: Not publicly disclosed
- Resolution: 480p and 720p for 16:9 aspect ratio; other aspect ratios (1:1, 4:3, 3:4, 9:16) are supported with scaled resolutions (e.g., 640x640 or 960x960 for 1:1)
- Input/Output formats:
  - Image input: jpg, jpeg, png, webp, gif, avif (under 10 MB)
  - Audio input: mp3, ogg, wav, m4a, aac (under 10 MB)
  - Video output: mp4
- Performance metrics:
  - Max video length: 1 minute
  - Frame rate: 25 FPS
  - Generation time: ~1.5 minutes for 10 seconds at 480p; ~5 minutes for 10 seconds at 720p
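Since a rejected upload wastes a round trip, the format and size limits above can be checked client-side before submission. A minimal sketch (the helper is illustrative, not part of the API):

```python
from pathlib import Path

# Accepted formats and the size cap, per the specification above.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif", ".avif"}
AUDIO_EXTS = {".mp3", ".ogg", ".wav", ".m4a", ".aac"}
MAX_BYTES = 10 * 1024 * 1024  # "under 10 MB"

def validate_input(path: str, kind: str) -> None:
    """Raise ValueError if a local file violates the documented input limits."""
    p = Path(path)
    allowed = IMAGE_EXTS if kind == "image" else AUDIO_EXTS
    if p.suffix.lower() not in allowed:
        raise ValueError(f"{kind} format {p.suffix!r} not supported; use one of {sorted(allowed)}")
    if p.stat().st_size >= MAX_BYTES:
        raise ValueError(f"{kind} file exceeds the 10 MB limit")

# Example with placeholder local files:
# validate_input("face.png", "image")
# validate_input("speech.mp3", "audio")
```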
Key Considerations
- The quality of the input image and audio significantly affects the realism and expressiveness of the output video.
- For best results, use clear, high-resolution images with a well-lit, unobstructed face.
- Audio should be clean, with minimal background noise, and closely match the intended lip movements.
- The model supports a wide range of aspect ratios, but output resolution may be scaled to fit the source image’s dimensions.
- Longer videos (up to 1 minute) are supported, but generation time increases with length and resolution.
- The model takes no text prompt; creative control comes from input preparation, such as stylizing or editing the source photo before animation.
- There is a trade-off between speed and quality: higher resolutions and longer clips require more processing time.
- Avoid images with extreme facial angles or heavy occlusions, as these may reduce animation accuracy.
Tips & Tricks
- Use high-quality, front-facing images for the most accurate lip sync and facial animation.
- For stylized or cartoon characters, ensure the mouth area is clearly defined to improve synchronization.
- Clean, well-paced audio with natural speech rhythm yields more expressive and realistic results.
- Experiment with different aspect ratios to optimize for various social media platforms (e.g., 9:16 for TikTok, 1:1 for Instagram).
- To create alternate character styles, combine Fabric-1.0 with image editing tools before animation.
- For iterative refinement, test with short clips before generating longer videos to fine-tune image and audio inputs.
- Use text-to-speech tools to generate consistent, high-quality narration if professional voice recordings are not available.
- For campaign A/B testing, quickly produce multiple video variants by swapping images or audio tracks (see the sketch after this list).
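As a sketch of the A/B-testing tip above, the loop below fans out over image/audio combinations, reusing the hypothetical `create_prediction` and `wait_for_result` helpers from the API section (all URLs and field names remain illustrative assumptions):

```python
# Variant matrix: each pair is one candidate video.
variants = [
    ("https://example.com/mascot_a.png", "https://example.com/message_1.mp3"),
    ("https://example.com/mascot_b.png", "https://example.com/message_1.mp3"),
    ("https://example.com/mascot_a.png", "https://example.com/message_2.mp3"),
]

# Submit all jobs first, then collect results, so the server can process them in parallel.
ids = [create_prediction(img, aud) for img, aud in variants]
results = [wait_for_result(pid) for pid in ids]
for (img, aud), res in zip(variants, results):
    print(img, aud, "->", res.get("output"))  # assumed output field
```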
Capabilities
- Generates highly realistic talking videos from a single image and audio input.
- Accurately synchronizes lip, facial, and head movements with speech, including expressive gestures.
- Supports a wide range of input images: real photos, illustrations, mascots, and stylized characters.
- Maintains the original style and identity of the input image in the animated output.
- Produces videos in multiple aspect ratios and resolutions suitable for various platforms.
- Enables programmatic generation via API for automated content workflows.
- Handles both human and non-human (e.g., pets, cartoon) characters for diverse creative applications.
What Can I Use It For?
- Creating social media content with custom avatars or brand mascots delivering personalized messages.
- Generating product explainer videos by pairing product images with avatar narration.
- Producing educational or tutorial videos with animated instructors or characters.
- Automating video content for marketing campaigns, including rapid A/B testing of different messages or styles.
- Developing digital avatars for virtual events, chatbots, or interactive experiences.
- Transforming podcast snippets or voice notes into engaging, lip-synced video clips.
- Enabling creative projects such as animated storytelling, character-driven campaigns, or meme generation.
- Supporting accessibility by generating sign language or expressive avatars for communication.
Things to Be Aware Of
- Some users report that the model excels at lip sync and expressive facial animation, especially with high-quality inputs.
- The model is praised for its flexibility in animating a wide range of images, not just preset avatars.
- Generation time can be significant for longer or higher-resolution videos; plan accordingly for batch processing.
- Users note that results may vary with stylized or heavily edited images, sometimes requiring multiple attempts for optimal output.
- The model’s ability to animate non-human characters (e.g., pets, cartoons) is seen as a unique strength, though mouth movement accuracy may depend on the clarity of the mouth in the image.
- Community feedback highlights the ease of use and the quality of outputs for social media and marketing.
- Some users mention that extreme facial angles, occlusions, or low-resolution images can reduce animation quality or cause artifacts.
- There is positive feedback on the model’s ability to maintain the original style and personality of the input image.
- Negative feedback patterns include occasional mismatches between audio and lip movement, especially with unclear audio or ambiguous mouth shapes.
Limitations
- The model may struggle with images featuring extreme facial angles, heavy occlusions, or very low resolution.
- Lip sync accuracy can decrease with unclear audio, non-standard speech, or stylized characters lacking defined mouth areas.
- Generation times are relatively long for high-resolution or extended video outputs, which may impact real-time or high-volume use cases.
