Kling v1.6 Standard Text to Video

Model Information

Response Time: ~230 sec

Status: Active

Version: 0.0.1

Updated: 17 days ago

kling-v1-6-standard-text-to-video

Live Demo

Average runtime: ~230 seconds

Input

Configure model parameters

Prompt

A lone traveler in a flowing cloak steps onto the edge of a windswept cliff at dawn, the camera slowly pulls back to reveal a vast, mist-filled valley below. Soft golden light breaks through the fog, illuminating floating embers drifting across the scene as orchestral strings swell in the background.

Duration

The duration of the generated video in seconds

Each execution costs $0.28. With $1, you can run this model about 3 times.
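
As a rough worked example of the pricing above, the number of complete runs a budget covers is the budget divided by the per-run cost, rounded down. The sketch below uses the $0.28 figure quoted on this page; the helper name runs_for_budget is illustrative and not part of any Eachlabs SDK.

```python
from decimal import Decimal

COST_PER_RUN = Decimal("0.28")  # per-execution price quoted on this page

def runs_for_budget(budget_usd: str) -> int:
    """Return how many complete runs fit in a budget (hypothetical helper)."""
    # Decimal avoids binary floating-point rounding on currency amounts.
    return int(Decimal(budget_usd) // COST_PER_RUN)

print(runs_for_budget("1"))   # 3 (3 x $0.28 = $0.84, leaving $0.16)
print(runs_for_budget("10"))  # 35
```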

API Reference

Prerequisites

Create an API Key from the Eachlabs Console
Install the required dependencies for your chosen language (e.g., requests for Python)

API Integration Steps

1. Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.

import requests
import time

API_KEY = "YOUR_API_KEY"  # Replace with your API key
HEADERS = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def create_prediction():
    """Submit a text-to-video job and return its prediction ID."""
    response = requests.post(
        "https://api.eachlabs.ai/v1/prediction/",
        headers=HEADERS,
        json={
            "model": "kling-v1-6-standard-text-to-video",
            "version": "0.0.1",
            "input": {
                "cfg_scale": 0.5,
                "negative_prompt": "blur, distort, and low quality",
                "aspect_ratio": "16:9",
                "duration": "5",
                "prompt": "your prompt here"
            },
            "webhook_url": ""
        },
        timeout=30  # Fail instead of hanging on network problems
    )
    prediction = response.json()

    # A "success" status here means the prediction was created
    if prediction["status"] != "success":
        raise Exception(f"Prediction failed: {prediction}")

    return prediction["predictionID"]

2. Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. Repeat the request, waiting between attempts, until you receive a success or error status.

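def get_prediction(prediction_id):
    """Poll until the prediction succeeds; raise if it reports an error."""
    while True:
        result = requests.get(
            f"https://api.eachlabs.ai/v1/prediction/{prediction_id}",
            headers=HEADERS,
            timeout=30  # Fail instead of hanging on network problems
        ).json()

        if result["status"] == "success":
            return result
        elif result["status"] == "error":
            raise Exception(f"Prediction failed: {result}")

        time.sleep(5)  # Wait before polling again; keeps well under 60 requests/minute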
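
The loop above runs until the API reports success or error, so a stuck job would block forever. A minimal sketch of a bounded variant follows, assuming the same status values (success, error) shown above. The names wait_for_result and fetch, the timeout defaults, and the "processing" status in the usage lines are illustrative assumptions, not part of the Eachlabs API; in practice fetch would wrap the GET request from this step.

```python
import time

def wait_for_result(fetch, timeout_s=600.0, interval_s=5.0, sleep=time.sleep):
    """Poll fetch() until it reports success, or raise on error or timeout.

    fetch is any zero-argument callable returning the prediction dict
    (hypothetical; e.g. a wrapper around the GET request in step 2).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(f"Prediction failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result after {timeout_s} seconds")
        sleep(interval_s)  # wait before polling again

# Usage with a stand-in for the real request.
# "processing" is a placeholder for any non-terminal status.
responses = iter([
    {"status": "processing"},
    {"status": "success", "output": "https://example.com/video.mp4"},
])
result = wait_for_result(lambda: next(responses), sleep=lambda s: None)
print(result["output"])
```

Passing the request and the sleep function in as arguments keeps the loop testable without network access or real waiting.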

3. Complete Example

Here's a complete example that puts it all together, with basic error handling and result processing. It creates a prediction, waits for the result, and prints the output URL and processing time.

try:
    # Create prediction
    prediction_id = create_prediction()
    print(f"Prediction created: {prediction_id}")
    
    # Get result
    result = get_prediction(prediction_id)
    print(f"Output URL: {result['output']}")
    print(f"Processing time: {result['metrics']['predict_time']}s")
except Exception as e:
    print(f"Error: {e}")

Additional Information

The API uses a two-step process: create prediction and poll for results
Response time: ~230 seconds
Rate limit: 60 requests/minute
Concurrent requests: 10 maximum
Poll the prediction status until completion
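
The limits above imply a lower bound on how often each prediction can be polled. As a rough calculation, if N predictions are polled in parallel, each one can be polled at most once every N seconds to stay within 60 requests/minute. This sketch assumes that polling requests count toward the rate limit and that the concurrency limit refers to predictions in flight, neither of which this page states explicitly; min_poll_interval is an illustrative name.

```python
RATE_LIMIT_PER_MIN = 60   # from the limits listed above
MAX_CONCURRENT = 10       # from the limits listed above

def min_poll_interval(active_predictions: int) -> float:
    """Smallest per-prediction polling interval (seconds) that keeps the
    combined request rate at or below RATE_LIMIT_PER_MIN."""
    if not 1 <= active_predictions <= MAX_CONCURRENT:
        raise ValueError(f"active_predictions must be 1..{MAX_CONCURRENT}")
    return 60.0 * active_predictions / RATE_LIMIT_PER_MIN

print(min_poll_interval(1))   # 1.0
print(min_poll_interval(10))  # 10.0
```

Requests that create predictions would also draw on the same budget, so a somewhat longer interval leaves headroom.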

Overview

Kling v1.6 Standard Text to Video is a prompt-based video generation model that turns short, descriptive text into cinematic video sequences. It accepts natural-language prompts and outputs short-form videos with coherent motion, structure, and visual consistency. It supports 5- and 10-second durations and 16:9, 9:16, and 1:1 aspect ratios, and its output can be steered with the guidance scale (cfg_scale) and negative prompts to suppress unwanted elements.

Technical Specifications

Video output is produced in a temporally consistent format, with stable subject and camera movement.

Frame quality is designed for standard-definition and social-friendly use cases.

Supports real-world lighting, physics-aware animations, and realistic environments.

Includes built-in prompt interpretation for understanding spatial and narrative cues.

Supports latent motion planning and coherence across frames without frame duplication.

Processes text input without relying on reference imagery.

Key Considerations

Kling v1.6 is not designed for photorealistic close-ups of faces or detailed typography.

Prompt length affects output; overly long prompts may confuse motion generation.

Using both prompt and negative_prompt together leads to more precise control.

Video resolution and quality are internally managed and not user-configurable.

Scene complexity should be balanced; single or dual subjects work best.

Repetitive or looping elements may occur if the prompt is ambiguous or too abstract.

Legal Information for Kling v1.6 Standard Text to Video

By using Kling v1.6 Standard Text to Video, you agree to:

Kling Privacy
Kling Service Agreement

Tips & Tricks

prompt

Be specific: "A futuristic city with flying cars at sunset" is better than "futuristic scene."
Include scene descriptors (time of day, environment, movement).
Include a main subject and its action, e.g., "A robot walking through a desert storm."

negative_prompt

Use to avoid styles, elements, or actions. Example: "blurry, distorted, low-quality, cartoonish."
Helps refine output by removing undesirable content or visual artifacts.
Best used when Kling v1.6 Standard Text to Video repeatedly generates unwanted elements.

cfg_scale (0–1)

Controls how strongly Kling v1.6 Standard Text to Video follows your prompt.
- 0.2–0.4: More creative freedom, unexpected results.
- 0.5–0.7: Balanced output, coherent motion, flexible interpretation.
- 0.8–1.0: Strict adherence to the prompt, useful for structured scenes.
Recommended: Start with 0.6, adjust based on prompt complexity.

aspect_ratio

16:9 – Landscape view, best for cinematic or environmental scenes.
9:16 – Vertical framing, ideal for mobile-first platforms and portrait compositions.
1:1 – Square format, good for minimal motion or central subject scenes.
Choose aspect ratio to match platform and subject positioning.

duration

5 seconds – Best for quick visuals, close-ups, and simple motions.
10 seconds – Allows more camera movement and storytelling.
For detailed motion or subject transformation, prefer 10 seconds.
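
The parameter guidance above can be checked before a request is sent. The following is a minimal sketch that builds the input object used in step 1 and rejects values outside the ranges listed in this section. build_input is a hypothetical helper, and the allowed sets are taken only from this page; the API itself may accept other values.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_DURATIONS = {"5", "10"}  # seconds, as strings like the request example

def build_input(prompt, negative_prompt="", cfg_scale=0.6,
                aspect_ratio="16:9", duration="5"):
    """Validate parameters and return the request's "input" dict (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 0.0 <= cfg_scale <= 1.0:
        raise ValueError("cfg_scale must be between 0 and 1")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    duration = str(duration)  # accept 5 or "5"
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "cfg_scale": cfg_scale,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }

payload = build_input(
    "A robot walking through a desert storm.",
    negative_prompt="blurry, distorted, low-quality, cartoonish",
    duration=10,
)
print(payload["duration"])  # 10
```

The default cfg_scale of 0.6 follows the recommended starting value above.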

Capabilities

Generates coherent short videos from descriptive text.

Creates motion, camera panning, zoom, and environment depth automatically.

Adapts to different aspect ratios and durations for flexible use.

Supports negative prompts for better control over unwanted outputs.

Maintains stable subject appearance across frames.

What can I use it for?

Generating visual content for creative projects using descriptive narration.

Creating animated concept visuals for environments, characters, or scenes.

Producing short narrative sequences for social platforms.

Exploring motion design ideas from a written idea without needing a reference image.

Things to be aware of

Create a dynamic cityscape with motion:
"A neon-lit cyberpunk street with people walking in the rain, night time."
Test stylized visual storytelling:
"A giant bird flying over a canyon during sunrise, magical atmosphere."
Explore cinematic language:
"Camera slowly zooms into an old lighthouse by the stormy sea, dark clouds moving."
Add realism by specifying physics:
"Wind blowing through wheat fields, golden hour lighting."

Limitations

Fine-grained facial expressions, text overlays, or logos are not reliably rendered.

May hallucinate details if prompts are too vague or overloaded.

Not designed for lip-sync, audio alignment, or speech-based output.

Repetitive patterns may occur if subject movement is not clearly defined.

Does not support input images; prompt-only model.

Output Format: MP4

Related AI Models

Kling v1 Pro Text to Video

Kling v1 Pro Text to Video converts written text into high-quality videos with stable and consistent results.

Vimmerse Story

Vimmerse AI transforms static images into dynamic, animated videos using advanced motion and visual effects. It's designed to create engaging content for social media, marketing, and digital storytelling in just seconds.

Minimax Hailuo V1 Director | Text to Video

Hailuo T2V Director is an AI video model that supports a wide range of artistic styles and is designed to revolutionize how 2D illustrations come to life.

Kling v1.5 Pro Text-to-Video

Text transforms into well-structured, high-quality videos using Kling v1.5 Pro Text-to-Video, optimized for professional results.