Kling v1 Pro Text to Video
kling-v1-pro-text-to-video
Kling v1 Pro Text to Video converts written text into high-quality videos with stable and consistent results.
Prerequisites
- Create an API Key from the Eachlabs Console
- Install the required dependencies for your chosen language (e.g., requests for Python)
API Integration Steps
1. Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
import requests
import time

API_KEY = "YOUR_API_KEY"  # Replace with your API key
HEADERS = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def create_prediction():
    response = requests.post(
        "https://api.eachlabs.ai/v1/prediction/",
        headers=HEADERS,
        json={
            "model": "kling-v1-pro-text-to-video",
            "version": "0.0.1",
            "input": {
                "cfg_scale": 0.5,
                "negative_prompt": "blur, distort, and low quality",
                "aspect_ratio": "16:9",
                "duration": 5,
                "prompt": "your prompt here"
            },
            "webhook_url": ""
        }
    )
    prediction = response.json()
    if prediction["status"] != "success":
        raise Exception(f"Prediction failed: {prediction}")
    return prediction["predictionID"]
2. Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
def get_prediction(prediction_id):
    while True:
        result = requests.get(
            f"https://api.eachlabs.ai/v1/prediction/{prediction_id}",
            headers=HEADERS
        ).json()
        if result["status"] == "success":
            return result
        elif result["status"] == "error":
            raise Exception(f"Prediction failed: {result}")
        time.sleep(1)  # Wait before polling again
3. Complete Example
Here's a complete example that puts it all together, including error handling and result processing. This shows how to create a prediction and wait for the result in a production environment.
try:
    # Create prediction
    prediction_id = create_prediction()
    print(f"Prediction created: {prediction_id}")

    # Get result
    result = get_prediction(prediction_id)
    print(f"Output URL: {result['output']}")
    print(f"Processing time: {result['metrics']['predict_time']}s")
except Exception as e:
    print(f"Error: {e}")
Additional Information
- The API uses a two-step process: create prediction and poll for results
- Response time: ~220 seconds
- Rate limit: 60 requests/minute
- Concurrent requests: 10 maximum
- Use long-polling to check prediction status until completion
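Given the figures above (responses around 220 seconds, 60 requests per minute), a polling loop benefits from a fixed interval and an upper time bound. The following is a minimal sketch, not part of the Eachlabs SDK: the function name wait_for_prediction, the injected fetch_status callable, the 2-second interval, and the 600-second timeout are all assumptions chosen for illustration. As in the example above, any status other than "success" or "error" is treated as "not ready yet".

```python
import time


def wait_for_prediction(fetch_status, prediction_id, interval=2.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the prediction succeeds, fails, or the timeout elapses.

    fetch_status is any callable that takes a prediction ID and returns the
    decoded JSON response as a dict (hypothetical helper, supplied by the caller).
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status(prediction_id)
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(f"Prediction failed: {result}")
        if clock() >= deadline:
            raise TimeoutError(
                f"Prediction {prediction_id} not ready after {timeout} seconds"
            )
        # A 2-second interval is 30 requests per minute, below the stated limit.
        sleep(interval)
```

With the HEADERS dict from the earlier example, fetch_status could be a small wrapper around requests.get on the prediction URL that returns .json(). Passing the fetcher in as an argument also makes the loop easy to test without network access.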
Overview
Kling v1 Pro Text to Video is a generative video model designed to convert natural language descriptions into coherent short video clips. It allows users to define the duration, aspect ratio, and visual elements of the resulting video using a prompt-based interface. The model focuses on temporal coherence, smooth motion, and accurate representation of described scenes.
Technical Specifications
Kling v1 Pro Text to Video uses a diffusion-based video generation framework optimized for short-form synthesis.
Video generation maintains temporal consistency with keyframe stabilization over multiple frames.
The model is optimized for rendering fluid motion, camera stability, and visual fidelity in short sequences of 5 or 10 seconds.
Kling v1 Pro Text to Video supports horizontal (16:9), vertical (9:16), and square (1:1) outputs, with internal frame interpolation to keep motion smooth.
The model accepts natural-language prompts in English and can recognize various object classes, environments, and actions.
Key Considerations
Prompts must be concise and direct. Overly long or poetic descriptions may lead to abstract or distorted results.
Video outputs are limited to predefined durations (5 or 10 seconds) and cannot be extended beyond this range.
Kling v1 Pro Text to Video is not intended for use cases requiring facial accuracy, lip synchronization, or dialogue.
Adding a negative prompt can improve results by suppressing distortions or unwanted objects.
Output resolution and frame rate are fixed and cannot be customized at this stage.
Legal Information for Kling v1 Pro Text to Video
By using Kling v1 Pro Text to Video, you agree to:
- Kling Privacy
- Kling Service Agreement
Tips & Tricks
- Prompt: Use visually rich but concise language.
  - Example: “A futuristic city skyline at sunset with flying cars”
  - Avoid: “The most amazing futuristic scene ever imagined”
  - ✔️ Include lighting conditions, objects, actions, and style (e.g., realistic, cinematic).
  - ✖️ Avoid vague adjectives without context.
- CFG Scale (0–1):
  - Values around 0.7–0.9 are optimal for balancing prompt fidelity with creativity.
  - Lower values (0.3–0.6) may yield more abstract or loosely interpreted results.
  - Higher values (close to 1.0) generate literal interpretations but may reduce visual diversity.
- Negative Prompt: Use this to suppress unwanted elements.
  - Example: “blurry, distorted, out of frame” can help refine output.
- Aspect Ratio:
  - 16:9: Ideal for web or desktop use.
  - 9:16: Best for mobile or social media visuals.
  - 1:1: Suitable for avatars or square-format content.
- Duration:
  - 5: Quick preview or short scene. Faster rendering.
  - 10: Longer scene with more motion; may contain more content variation.
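The parameter ranges listed above can be checked locally before a request is sent, which avoids spending a prediction on an input that is obviously out of range. This is a minimal sketch: the helper name build_input is hypothetical, the defaults are copied from the request example in this document, and whether the API enforces exactly the same limits server-side is an assumption.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_DURATIONS = {5, 10}


def build_input(prompt, negative_prompt="", cfg_scale=0.5,
                aspect_ratio="16:9", duration=5):
    """Validate parameters and return the "input" dict for the request body."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 0.0 <= cfg_scale <= 1.0:
        raise ValueError("cfg_scale must be between 0 and 1")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    return {
        "prompt": prompt.strip(),
        "negative_prompt": negative_prompt,
        "cfg_scale": cfg_scale,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
```

The returned dict can be used as the "input" field of the JSON body sent to the prediction endpoint.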
Capabilities
Generates short-form video clips from English-language text prompts.
Supports basic scene animation such as object motion, environment panning, and atmospheric changes.
Maintains temporal consistency for subjects in motion across frames.
Compatible with various prompt styles, including cinematic, realistic, abstract, or stylized.
Allows suppression of unwanted visual elements through negative prompts.
What can I use it for?
Creating visual concepts or mood boards from text.
Visualizing creative ideas for short video formats.
Designing social media visuals or visual references for design and storytelling.
Rapid prototyping of motion scenes for creative projects or pitch decks.
Things to be aware of
Try describing an action paired with an environment:
"A robot walking through a neon-lit alley at night"
Experiment with negative prompts to reduce common issues like blur:
"blurry, low contrast, disfigured"
Test different aspect ratios for different publishing formats.
"16:9" for widescreen, "9:16" for vertical video.
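The suggestions above (pair an action with an environment, add a negative prompt, try several aspect ratios) can be combined into a small batch of inputs. The sketch below is illustrative only: compose_prompt and aspect_ratio_variants are hypothetical helper names, and the cfg_scale and duration defaults are taken from the request example in this document.

```python
def compose_prompt(subject_action, environment, style=None):
    """Join an action and an environment into one concise prompt."""
    prompt = f"{subject_action.strip()} {environment.strip()}"
    if style:
        prompt = f"{prompt}, {style.strip()}"
    return prompt


def aspect_ratio_variants(prompt, negative_prompt, aspect_ratios=("16:9", "9:16")):
    """Return one input dict per aspect ratio, sharing the same prompts."""
    return [
        {
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "aspect_ratio": ratio,
            "cfg_scale": 0.5,
            "duration": 5,
        }
        for ratio in aspect_ratios
    ]


prompt = compose_prompt("A robot walking", "through a neon-lit alley at night")
variants = aspect_ratio_variants(prompt, "blurry, low contrast, disfigured")
```

Each entry in variants can be submitted as the "input" of a separate prediction, which makes it straightforward to compare the widescreen and vertical results for the same scene.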
Limitations
Does not support text overlays or subtitles within the generated video.
Faces, fine object details, or small text elements may appear distorted.
No direct control over background music, audio, or frame rate.
Cannot depict complex multi-shot storytelling or scene transitions.
Lighting and color rendering may vary across outputs.
Output Format: MP4