Eachlabs | AI Workflows for app builders

Google Veo 3

Sound on: Google’s flagship Veo 3 text-to-video model, with audio

Avg Run Time: 90 s

Model Slug: veo-3

Category: Text to Video

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
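
As a rough illustration, the request could be assembled as in the sketch below, which uses only Python's standard library. The endpoint URL, the `X-API-Key` header name, and the JSON field names (`model`, `input`, `prompt`) are assumptions made for this example and are not confirmed by this page; only the `veo-3` slug and the idea of sending model inputs together with an API key come from the text above. The function builds the request without sending it.

```python
import json
import urllib.request

# Placeholder endpoint: replace with the URL from the official API reference.
API_URL = "https://api.example.com/v1/prediction/"


def build_prediction_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a prediction."""
    body = {
        "model": "veo-3",             # model slug listed on this page
        "input": {"prompt": prompt},  # field names are assumed
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,     # header name is assumed
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_prediction_request(
        "YOUR_API_KEY", "A slow pan across a desert at golden hour"
    )
    print(req.get_method(), req.full_url)
    # To send it, pass req to urllib.request.urlopen and read the
    # prediction ID from the JSON response.
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending credits on a run.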

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The result is not returned immediately, so you'll need to check repeatedly until you receive a success status.
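
A polling loop along these lines might look like the sketch below. The HTTP call is passed in as a `fetch` function so the loop can be shown without guessing the real endpoint. The `status` field name and the `"error"` value are assumptions; the page only mentions a success status. The helper name `wait_for_result` is hypothetical.

```python
import time
from typing import Callable, Dict


def wait_for_result(
    fetch: Callable[[str], Dict],
    prediction_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
) -> Dict:
    """Call fetch(prediction_id) until it reports success, fails, or we give up."""
    for attempt in range(max_attempts):
        result = fetch(prediction_id)
        status = result.get("status")  # field name is assumed
        if status == "success":
            return result
        if status == "error":  # assumed failure value
            raise RuntimeError(f"prediction {prediction_id} failed: {result}")
        if attempt < max_attempts - 1:
            time.sleep(interval_s)  # wait before the next check
    raise TimeoutError(
        f"prediction {prediction_id} not ready after {max_attempts} checks"
    )


if __name__ == "__main__":
    # Stand-in for a real HTTP GET: two pending replies, then success.
    replies = iter([
        {"status": "processing"},
        {"status": "processing"},
        {"status": "success", "output": "video.mp4"},
    ])
    print(wait_for_result(lambda _id: next(replies), "demo-id", interval_s=0.0))
```

With the listed average run time of about 90 seconds, the default of 60 checks at 5-second intervals gives a window of roughly five minutes before the loop gives up.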

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

Google Veo 3 is designed for high-quality text-to-video generation. It takes a text prompt and, optionally, a seed value, and produces a short video with cinematic coherence. The model emphasizes realism, detail, and smooth camera motion. It supports a wide range of shot types, such as aerial shots, slow motion, and first-person perspective, with creative control exercised through natural-language descriptions.

Technical Specifications

Frame Rate: Up to 30 fps.

Resolution: Supports generation up to 1080p in preview modes.

Motion Awareness: Integrated camera and object motion tracking.

Semantic Understanding: Natural language input with multi-modal scene grounding.

Consistency: Improved temporal and spatial coherence across frames.

Key Considerations

Clips are short: the pricing conditions below list durations of 4, 6, and 8 seconds, and a run cannot be extended beyond that.

The text prompt, together with the optional seed, is the only control mechanism; no image, audio, or video input is supported.

Outputs may occasionally contain unnatural object deformations or flickering.

Explicit, graphic, or flagged terms may cause failure or result in blank output.

Abstract prompts may lead to hallucinated or visually ambiguous results.

Real names, brands, or sensitive entities should be avoided in prompts.

Tips & Tricks

Prompt (String):

  • Use clear and concrete descriptions:
    ✅ “A white horse galloping through a foggy forest at sunrise”
    ❌ “Freedom in nature”
  • For dynamic motion:
    "A drone shot of a bustling city at night with moving cars and bright lights"
  • To describe visual style:
    "In the style of a cinematic sci-fi film, with dramatic lighting" 
  • Avoid flagged or unsafe content such as:
    • Violence: "explosions", "war scenes"
    • NSFW: "nudity", "suggestive themes"
    • Identifiable individuals or real persons

Seed (Integer):

  • Controls variation for repeatable results.
  • Reuse a fixed value (for example 42 or 12345) with the same prompt to get consistent output across runs.
  • Changing the seed results in slightly different visuals with the same prompt.
  • Leaving seed empty allows random variation.
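
The prompt and seed rules above could be captured in a small helper like the following sketch. The lowercase key names `prompt` and `seed` and the helper name `build_inputs` are assumptions for illustration. The seed is omitted from the payload when it is not given, which matches "leaving seed empty".

```python
from typing import Dict, Optional, Union


def build_inputs(prompt: str, seed: Optional[int] = None) -> Dict[str, Union[str, int]]:
    """Return a model-input dict; the seed is included only when provided."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    inputs: Dict[str, Union[str, int]] = {"prompt": prompt}
    if seed is not None:  # compare with None so that a seed of 0 is kept
        inputs["seed"] = seed
    return inputs


if __name__ == "__main__":
    prompt = "A white horse galloping through a foggy forest at sunrise"
    print(build_inputs(prompt, seed=42))  # repeatable run
    print(build_inputs(prompt))           # random variation
```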

Capabilities

Generates short cinematic-style videos from natural language.

Supports descriptions of motion, objects, scenery, and atmosphere.

Handles a range of themes, such as nature, futuristic, urban, and fantasy.

Supports camera directions such as zoom, pan, dolly, and aerial views, expressed in the prompt.

What Can I Use It For?

Creating short visual concepts for storytelling or ideation

Generating teaser videos for creative projects

Producing aesthetic motion clips based on detailed prompts

Visualizing scene ideas for design, animation, or narrative planning

Things to Be Aware Of

Combine camera directions with settings:
“A slow pan across a desert at golden hour”

Mix motion and mood:
“A handheld shot following a child running through a sunflower field in slow motion”

Experiment with time of day and lighting:
“A mountain village at dusk, with lights flickering on and smoke rising from chimneys”

Add genre-based visual tones:
“Cyberpunk city with neon signs and rainy streets, drone footage”

Limitations

Realism may degrade with overly abstract prompts

May generate flickering or frame inconsistencies

No interactive editing or feedback loop: generation is one-shot

Prompts involving copyrighted characters or brands may fail

Output Type: MP4

Pricing Type: Dynamic

Dynamic pricing based on input conditions

Conditions

Sequence  Duration  Generate_audio  Price
1         "4s"      ""              $0.80
2         "4s"      ""              $1.60
3         "6s"      ""              $1.20
4         "6s"      ""              $2.40
5         "8s"      ""              $1.60
6         "8s"      ""              $3.20
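
As a quick arithmetic check, the six listed prices reduce to two per-second rates. The sketch below derives them from the durations and prices in the table. The table as extracted does not show which Generate_audio value each row belongs to, so the code does not assign either rate to an audio setting.

```python
# (duration in seconds, price in US cents), copied from the pricing table
ROWS = [(4, 80), (4, 160), (6, 120), (6, 240), (8, 160), (8, 320)]


def cents_per_second(rows):
    """Return the distinct per-second rates implied by the rows, in ascending order."""
    rates = set()
    for seconds, cents in rows:
        rates.add(cents / seconds)
    return sorted(rates)


print(cents_per_second(ROWS))  # [20.0, 40.0]
```

Each duration is therefore priced at either $0.20 or $0.40 per second of video.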