Example input

prompt: "Shot 1 (wide, 0–2s): From behind, a blonde braided woman and a chestnut horse walk slowly together across an open field. No reins between them — the horse stays at her shoulder by choice. Golden afternoon light stretches across the grass ahead, a gentle wind moving through the landscape. The horizon is wide and still.\nShot 2 (mid, 2–3.5s): The camera moves softly to the side, following them. Her steps and the horse's hooves fall at the same quiet pace. Her braid sways gently. The horse's mane lifts in the breeze. Neither looks at the other — they simply walk, side by side, completely at ease.\nShot 3 (wide, 3.5–5s): The camera holds wide as they continue forward — two small figures in a vast golden field, growing no closer to anything, needing nowhere to be. The light is warm and fading. The grass bends. They keep walking.\nCinematic, natural golden hour light, shallow depth of field, slow and unhurried — quiet, emotional, simple."
duration: 5
resolution: "720P"
first_frame: "https://cdn-us.eachlabs.ai/uploads/3771631f-ba57-426a-a159-b63f5ef6cda1.png"

HappyHorse 1.0 API

Video · happyhorse-1.0 · by Alibaba

Generates video from images while preserving key details like subject, style, and text elements with high visual consistency across dynamic transitions.

Runtime (p50): 1m
Estimated price: From $0.14

Call the API

prediction.sh

curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "alibaba-happyhorse-1-0-image-to-video",
    "version": "0.0.1",
    "input": {
        "prompt": "Shot 1 (wide, 0–2s): From behind, a blonde braided woman and a chestnut horse walk slowly together across an open field. No reins between them — the horse stays at her shoulder by choice. Golden afternoon light stretches across the grass ahead, a gentle wind moving through the landscape. The horizon is wide and still.\nShot 2 (mid, 2–3.5s): The camera moves softly to the side, following them. Her steps and the horse'\''s hooves fall at the same quiet pace. Her braid sways gently. The horse'\''s mane lifts in the breeze. Neither looks at the other — they simply walk, side by side, completely at ease.\nShot 3 (wide, 3.5–5s): The camera holds wide as they continue forward — two small figures in a vast golden field, growing no closer to anything, needing nowhere to be. The light is warm and fading. The grass bends. They keep walking.\nCinematic, natural golden hour light, shallow depth of field, slow and unhurried — quiet, emotional, simple.",
        "duration": 5,
        "resolution": "720P",
        "first_frame": "https://cdn-us.eachlabs.ai/uploads/3771631f-ba57-426a-a159-b63f5ef6cda1.png"
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
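
Long prompts often contain apostrophes and line breaks that are awkward to quote on the command line. One way to sidestep this, sketched here, is to write the JSON body to a file with a quoted heredoc and pass the file to curl with `--data @file`. The helper names `write_payload` and `submit_prediction` are hypothetical, and the prompt is shortened for readability; the endpoint, header, and field names are the ones used in the request above.

```shell
#!/bin/sh
# Sketch: keep the JSON body in a file so shell quoting never touches it.
# write_payload and submit_prediction are hypothetical helper names.

# Write the request body to the file named by $1.
# The quoted delimiter ('EOF') disables shell expansion inside the heredoc,
# so apostrophes and backslashes reach the file unchanged.
write_payload() {
  cat > "$1" <<'EOF'
{
  "model": "alibaba-happyhorse-1-0-image-to-video",
  "version": "0.0.1",
  "input": {
    "prompt": "Her steps and the horse's hooves fall at the same quiet pace.\nHer braid sways gently.",
    "duration": 5,
    "resolution": "720P",
    "first_frame": "https://cdn-us.eachlabs.ai/uploads/3771631f-ba57-426a-a159-b63f5ef6cda1.png"
  },
  "webhook_url": ""
}
EOF
}

# Send the body stored in file $1. Requires EACHLABS_API_KEY to be set.
# curl's --data strips raw newlines when reading from a file, which is
# harmless here because line breaks inside the prompt are \n escapes.
submit_prediction() {
  curl -X POST \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -H "Content-Type: application/json" \
    --data @"$1" \
    https://api.eachlabs.ai/v1/prediction/
}
```

A typical call would be `write_payload payload.json && submit_prediction payload.json`.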

Documentation

Alibaba | HappyHorse 1.0 | Image to Video Overview

Alibaba | HappyHorse 1.0 | Image to Video turns a single input image into a physically realistic video with smooth, natural motion, optionally guided by a text prompt. The model is part of Alibaba Cloud's video generation suite. Its main differentiator is "first-frame-to-video" generation: the provided image becomes the exact starting frame, and the output automatically keeps the image's aspect ratio. It supports 720P or 1080P resolution and durations of 3 to 15 seconds, which suits creators who need quick, high-fidelity animations without a complex setup. The model is available through Alibaba Cloud Model Studio.
Capabilities
- Generates physically realistic videos from a single input image as the first frame
- Supports optional text prompts to guide motion and scene dynamics
- Auto-matches output aspect ratio to input image for perfect fidelity
- Produces smooth, natural motion with accurate physics simulation
- Offers 720P and 1080P resolutions for versatile quality needs
- Creates 3-15 second videos ideal for social media and ads
- Integrates with Alibaba Cloud's multimodal ecosystem, including Qwen models
- Handles diverse styles from natural scenes to dynamic actions
Use Cases for Alibaba | HappyHorse 1.0 | Image to Video

Content Creators: Animate product photos into engaging demos. Upload a static image of a gadget and prompt: "Device rotating 360 degrees on a sleek table with soft lighting." Leverages first-frame fidelity for professional reveals.

Marketers: Turn static ad visuals into video assets. Use a brand logo image with "Logo pulsing with energy waves, transitioning to product shot". Automatic aspect-ratio matching makes the result a good fit for social reels.

Designers: Prototype motion graphics from sketches. Input a concept art frame and add "Elements floating upward in zero gravity, colors shifting gradually" for quick storyboards.

Developers: Build interactive apps via Alibaba | HappyHorse 1.0 | Image to Video API. Integrate user-uploaded images for personalized animations, like "Portrait smiling and waving naturally," enhancing AR experiences.
Tips and Tricks

For best results with Alibaba | HappyHorse 1.0 | Image to Video, write descriptive prompts that focus on motion and physics; for example, "gentle waving grass in wind" adds realism. Start with high-resolution input images (at least 720P) to match the output quality. Keep prompts concise (under 50 words) so the motion intent is not diluted. Experiment with duration: shorter clips of 3 to 5 seconds tend to yield smoother motion than the 15-second maximum.

Example prompts:
- "A horse galloping across a sunny field, dust kicking up realistically from hooves."
- "Waves crashing on rocky shore, foam spraying with natural physics."
- "Leaves rustling in breeze on a forest path, camera panning slowly right."
Combine with Alibaba's Qwen image analysis for refined inputs, iterating prompts based on preview frames.
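
The 50-word guideline above is easy to check before submitting a prompt. The following is a minimal sketch in POSIX shell; `count_words` and `check_prompt_length` are hypothetical helper names, and the default limit of 50 is simply the figure from the tip, not a limit enforced by the API.

```shell
#!/bin/sh
# Sketch: warn when a prompt reaches a word budget.
# count_words and check_prompt_length are hypothetical helper names.

# Print the number of whitespace-separated words in $1.
count_words() {
  # $(( )) normalizes the padded output some wc implementations produce.
  echo $(( $(printf '%s' "$1" | wc -w) ))
}

# check_prompt_length PROMPT [LIMIT]
# Succeeds when PROMPT has fewer than LIMIT words (default 50).
check_prompt_length() {
  limit="${2:-50}"
  n=$(count_words "$1")
  if [ "$n" -ge "$limit" ]; then
    echo "prompt has $n words; aim for fewer than $limit" >&2
    return 1
  fi
  echo "prompt has $n words (limit $limit)"
}
```

For example, `check_prompt_length "A horse galloping across a sunny field, dust kicking up realistically from hooves."` reports 13 words and succeeds.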
Technical Specifications
- Resolution Support: 720P or 1080P output videos
- Duration: 3-15 seconds
- Aspect Ratio: Automatically matches the input image
- Input: Single first-frame image (as starting frame) + optional text prompt
- Output Format: Video file with smooth motion and physically realistic dynamics
- Processing: Dynamically scheduled inference resources (global, excluding Chinese mainland in international mode)
- Deployment: Alibaba Cloud Model Studio API, supports integration with Qwen family multimodal capabilities
These specs enable efficient generation of high-quality videos from static inputs, leveraging Alibaba's infrastructure for scalable performance.
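
The limits listed above can be checked on the client before a request is sent. The sketch below shows one way to do that in POSIX shell. `validate_input` is a hypothetical helper, and treating the duration as a whole number of seconds is an assumption: the specification gives only the 3-15 second range, and the example request uses the integer 5.

```shell
#!/bin/sh
# Sketch: client-side check of duration and resolution.
# validate_input is a hypothetical helper name.

# validate_input DURATION RESOLUTION
# Returns 0 when both values fall inside the documented limits.
validate_input() {
  # Assumption: duration is a whole number of seconds.
  case "$1" in
    ''|*[!0-9]*)
      echo "duration must be a whole number of seconds" >&2
      return 1 ;;
  esac
  if [ "$1" -lt 3 ] || [ "$1" -gt 15 ]; then
    echo "duration must be between 3 and 15 seconds" >&2
    return 1
  fi
  case "$2" in
    720P|1080P) ;;
    *)
      echo "resolution must be 720P or 1080P" >&2
      return 1 ;;
  esac
}
```

Calling `validate_input 5 720P` succeeds, while `validate_input 20 720P` or `validate_input 5 480P` prints a message to standard error and returns a non-zero status.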
Things to Be Aware Of

Alibaba | HappyHorse 1.0 | Image to Video may struggle with overly complex input images, such as those containing fine details or crowds, which can lead to motion artifacts. A common mistake is a vague prompt with no motion specifics, which produces nearly static output. Requests near the 15-second maximum can introduce minor physics inconsistencies. Ensure a stable internet connection for API calls, and note that global scheduling excludes the Chinese mainland. Test with simple scenes first to gauge performance.
Key Considerations

Before using Alibaba | HappyHorse 1.0 | Image to Video, make sure your input image is high quality with clear subjects, since this drives motion realism. The model excels when precise first-frame fidelity is required, which text-only video models do not offer. Users need an Alibaba Cloud account for Model Studio access; international endpoints are hosted in Singapore for global data handling. For cost-effective multimodal workflows, it pairs well with the faster Qwen models. It is best suited to short clips where physics-accurate motion matters more than length, so evaluate API quotas before high-volume use.
Limitations
Alibaba | HappyHorse 1.0 | Image to Video is capped at 15 seconds, so it is unsuitable for longer narratives. It relies heavily on input image quality: low-resolution or blurry first frames yield suboptimal motion. It does not support audio output or advanced editing such as inpainting. International mode limits data to the Singapore region, which may affect latency. Complex multi-object interactions may not simulate perfectly.