# Wan | v2.6 | Text to Video

Wan 2.6 is a text-to-video model that generates high-quality videos with smooth motion and cinematic detail.

## API Information

- **Model Slug:** wan-v2-6-text-to-video
- **Branded URL:** https://www.eachlabs.ai/alibaba/wan-v2-6/wan-v2-6-text-to-video
- **Provider:** Alibaba
- **Category:** Text to Video
- **Output Type:** video
- **Status:** active
- **Version:** 0.0.1
- **Base Cost:** 1080p pricing (fallback): $0.15/second
- **Estimated Processing Time:** 270 seconds
- **Last Updated:** 2026-05-07
- **Interactive Demo:** https://www.eachlabs.ai/ai-models/wan-v2-6-text-to-video

## Pricing

- **Charge Type:** dynamic
- **Estimated Price (default example):** $2.25
- **Pricing Details:** 1080p pricing (fallback): $0.15/second

### Pricing Rules

| Condition | Pricing |
| --- | --- |
| `resolution eq_i "720p"` | 720p pricing: $0.10/second |
| Rule 2 (fallback) | 1080p pricing (fallback): $0.15/second |

## Input Schema

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| prompt | string | Yes | - | - | The text prompt for video generation. Supports Chinese and English, max 800 characters. For multi-shot videos, use the format: 'Overall description. First shot [0-3s] content. Second shot [3-5s] content.' |
| audio_url | string | No | - | - | URL of the audio to use as background music. Must be publicly accessible. If the audio is longer than the `duration` value (5, 10, or 15 seconds), it is truncated to the first N seconds and the rest is discarded. If the audio is shorter than the video, the remainder of the video is silent; for example, a 3-second audio with a 5-second video gives sound for the first 3 seconds and silence for the last 2. Format: WAV or MP3. Duration: 3 to 30 s. File size: up to 15 MB. |
| aspect_ratio | string | No | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4 | The aspect ratio of the generated video. Wan 2.6 supports additional ratios. |
| resolution | string | No | 1080p | 720p, 1080p | Video resolution tier. Wan 2.6 T2V only supports 720p and 1080p (no 480p). |
| duration | string | No | 5 | 5, 10, 15 | Duration of the generated video in seconds. |
| negative_prompt | string | No | - | - | Negative prompt describing content to avoid. Max 500 characters. |
| enable_prompt_expansion | boolean | No | true | - | Whether to rewrite the prompt with an LLM. Improves results for short prompts but increases processing time. |
| multi_shots | boolean | No | true | - | When true, enables intelligent multi-shot segmentation for coherent narrative videos. Only active when `enable_prompt_expansion` is true. Set to false for single-shot generation. |
| seed | integer | No | - | - | Random seed for reproducibility. If omitted, a random seed is chosen. |
| enable_safety_checker | boolean | No | true | - | If set to true, the safety checker is enabled. |

## Example Request

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction/ \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan-v2-6-text-to-video",
    "input": {
      "prompt": "Humorous but Premium Mini-Trailer\n\nConcept: A tiny porcelain robot curator controls reality like a film set.\n\nVisual style (global):\nExtreme photoreal 4K, cinematic lighting, shallow depth of field, subtle film grain, smooth stabilized camera, premium VFX realism.\n\nShot 1 — [0–3s]\n\nScene:\nMacro close-up of a small porcelain robot curator.\nIt gently taps a miniature tuning fork engraved with “eachlabs”.\nSoft reverb in the air.\n\nDialogue (natural, calm):\n\n“Begin.”\n\nShot 2 — [3–6s]\n\nHard cut:\nA vast Arctic ice plain under pale blue sky. Wind blows snow across the ground.\nWide cinematic shot.\nThe robot stands in the foreground, tiny against the scale.\n\nIt slightly turns its head.\n\nDialogue:\n\n“More space.”\n\n(The horizon stretches wider, camera pulls back.)\n\nShot 3 — [6–10s]\n\nHard cut:\nA glowing underwater coral canyon. Sun rays penetrate the water, particles floating.\nThe robot calmly walks along the seabed, unaffected by water.\nCamera slowly tracks forward between corals and fish.\n\nDialogue (soft, curious):\n\n“Let it breathe.”\n\nShot 4 — [10–15s]\n\nHard cut:\nA silent lunar surface at dawn. Earth rising in the background.\nSlow orbital camera move around the robot as it looks at the horizon.\n\nIt nods once.\n\nDialogue:\n\n“Ready for the next reality.”"
    }
  }'
```

## Output Schema

Response returned by `GET /v1/prediction/{id}` when the job completes:

```json
{
  "status": "success",
  "predictionID": "string",
  "output": "string (URL of generated video)",
  "metrics": {
    "predict_time": "number (seconds)"
  }
}
```

## Polling

```bash
curl https://api.eachlabs.ai/v1/prediction/{PREDICTION_ID} \
  -H "X-API-Key: YOUR_API_KEY"
```

| Status | Meaning |
|--------|---------|
| `processing` | Still running; poll again |
| `success` | Done; read `output` |
| `error` | Failed; read `message` / `details` |

## Webhook (alternative to polling)

Pass `"webhook_url": "https://your.host/path"` in the create request. Eachlabs POSTs this payload when the job ends:

```json
{
  "exec_id": "prediction-uuid",
  "status": "succeeded",
  "output": "https://...",
  "error": ""
}
```

`status` is `"succeeded"` or `"failed"`. `exec_id` equals the `predictionID` from create. Return 2xx within 30 seconds.

## Errors

Error body: `{ "status": "error", "message": "...", "details": "..." }`

| Code | Meaning |
|------|---------|
| `400` | Invalid input |
| `401` | Missing / invalid `X-API-Key` |
| `404` | Unknown model or prediction id |
| `429` | Rate limit: 100 creates / min, 10 concurrent per key |
| `5xx` | Retry with backoff |

## Overview

**wan-v2.6-text-to-video: Text to Video AI Model**

Developed by Alibaba as part of the **wan-v2.6** family, **wan-v2.6-text-to-video** transforms text prompts into cinematic multi-shot videos up to 15 seconds long with synchronized audio. It is built to generate coherent narratives with smooth transitions, character stability, and professional camera control, so that high-quality short-form video can be produced without extensive editing. For developers who need a **text-to-video AI model** with multi-shot capabilities, it supports 720p and 1080p resolutions at 30 fps in MP4 format, with output intended for commercial use.

## Usage Notes

- API Base URL: `https://api.eachlabs.ai/v1`
- Authentication: send `X-API-Key: YOUR_API_KEY`. Generate a key from the Eachlabs dashboard at https://www.eachlabs.ai/dashboard/api-keys.
- File-typed parameters (`*_url`, `image_url`, `video_url`, `audio_url`, etc.) accept publicly reachable HTTPS URLs only. Upload your asset first (GCS / S3 / your CDN) and pass the resulting URL. Data URIs and localhost URLs are rejected.
- For structured parameters (arrays / objects), send real JSON values, not stringified payloads.
- Monetary values are reported in USD; per-token / per-megapixel rates may be billed in micro-cents internally.
- Prefer `webhook_url` over polling for long-running predictions; see the Webhook section.
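
The pricing rules amount to a per-second calculation. The following is a minimal sketch, assuming the charge is the per-second rate multiplied by the requested `duration` and that `eq_i` denotes a case-insensitive match; neither is stated explicitly on this page. `estimate_price` is a hypothetical helper, not part of the Eachlabs API.

```python
from decimal import Decimal

# Per-second rates in USD, taken from the Pricing Rules table.
RATE_720P = Decimal("0.10")
RATE_FALLBACK = Decimal("0.15")  # 1080p pricing (fallback)

# Allowed values from the Input Schema table (duration is a string there).
ALLOWED_DURATIONS = {"5", "10", "15"}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}


def estimate_price(duration: str = "5", resolution: str = "1080p") -> Decimal:
    """Estimate the cost in USD for one generation.

    Assumption: cost = per-second rate x requested duration.
    """
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    # Assumption: `eq_i` is a case-insensitive comparison.
    normalized = resolution.lower()
    if normalized not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    rate = RATE_720P if normalized == "720p" else RATE_FALLBACK
    return rate * int(duration)
```

Under these assumptions, `estimate_price("15", "1080p")` gives 2.25, the same figure as the estimated price listed under Pricing, and the schema defaults (5 seconds, 1080p) give 0.75.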
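
The polling flow (repeat `GET /v1/prediction/{id}` until the status is `success` or `error`) can be sketched in Python with only the standard library. The function names, the 5-second interval and the poll limit are illustrative choices, not values documented by Eachlabs. Treating any status other than `success` or `error` as "still running" is also an assumption.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.eachlabs.ai/v1"


def fetch_prediction(prediction_id: str, api_key: str) -> dict:
    """GET /v1/prediction/{id} and decode the JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/prediction/{prediction_id}",
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_prediction(prediction_id, api_key, fetch=fetch_prediction,
                        sleep=time.sleep, interval=5.0, max_polls=120):
    """Poll until the job finishes and return the output video URL.

    `fetch` and `sleep` are injectable so the loop can be exercised
    without network access.
    """
    for _ in range(max_polls):
        body = fetch(prediction_id, api_key)
        status = body.get("status")
        if status == "success":
            return body["output"]
        if status == "error":
            raise RuntimeError(f"{body.get('message')}: {body.get('details')}")
        # Assumption: anything else (e.g. "processing") means keep waiting.
        sleep(interval)
    raise TimeoutError(f"prediction {prediction_id} did not finish in time")
```

With an estimated processing time of 270 seconds, the defaults here allow roughly ten minutes before giving up. Adjust `interval` and `max_polls` to stay within the rate limits listed under Errors.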
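
The webhook payload uses different status strings (`succeeded` / `failed`) from the polling response (`success` / `error`), so a receiver should branch on the webhook values. The following is a small sketch of the payload handling only. `handle_webhook` is a hypothetical helper, and the HTTP server that receives the POST and returns 2xx is left out.

```python
import json


def handle_webhook(raw_body: bytes) -> str:
    """Parse a webhook POST body and return the output URL on success.

    Raises RuntimeError for a failed job and ValueError for a status
    not described in the Webhook section.
    """
    payload = json.loads(raw_body)
    status = payload.get("status")
    if status == "succeeded":
        return payload["output"]
    if status == "failed":
        raise RuntimeError(
            f"prediction {payload.get('exec_id')} failed: {payload.get('error')}"
        )
    raise ValueError(f"unexpected webhook status: {status!r}")
```

Because the endpoint is expected to answer within 30 seconds, any slow work such as downloading the video is better done after the response has been sent.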