Kling v2 Text to Video
Prerequisites
- Create an API Key from the Eachlabs Console
- Install the required dependencies for your chosen language (e.g., requests for Python)
API Integration Steps
1. Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
import requests
import time

API_KEY = "YOUR_API_KEY"  # Replace with your API key
HEADERS = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def create_prediction():
    response = requests.post(
        "https://api.eachlabs.ai/v1/prediction/",
        headers=HEADERS,
        json={
            "model": "kling-v2-text-to-video",
            "version": "0.0.1",
            "input": {
                "cfg_scale": 0.5,
                "negative_prompt": "your negative prompt here",
                "aspect_ratio": "16:9",
                "duration": 5,
                "prompt": "your prompt here"
            },
            "webhook_url": ""
        }
    )
    prediction = response.json()
    if prediction["status"] != "success":
        raise Exception(f"Prediction failed: {prediction}")
    return prediction["predictionID"]
2. Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
def get_prediction(prediction_id):
    while True:
        result = requests.get(
            f"https://api.eachlabs.ai/v1/prediction/{prediction_id}",
            headers=HEADERS
        ).json()
        if result["status"] == "success":
            return result
        elif result["status"] == "error":
            raise Exception(f"Prediction failed: {result}")
        time.sleep(1)  # Wait before polling again
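The loop above polls indefinitely at a fixed one-second interval. The following is a minimal sketch of a bounded variant with a timeout and exponential backoff, assuming the same `status` values (`success`, `error`) as the example. The names `poll_until_done`, `fetch`, `timeout_s`, and the delay parameters are hypothetical helpers introduced here, not part of the Eachlabs API.

```python
import time


def poll_until_done(fetch, timeout_s=600.0, initial_delay_s=1.0,
                    max_delay_s=10.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it reports success or error, or the timeout elapses.

    fetch is any zero-argument callable that returns the decoded JSON dict
    of the prediction (hypothetical interface, chosen so it can be tested
    without network access).
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        result = fetch()
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(f"Prediction failed: {result}")
        # Give up if the next wait would run past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError(f"No result after {timeout_s} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # back off, capped at max_delay_s
```

With the definitions from step 1, `fetch` could be `lambda: requests.get(url, headers=HEADERS).json()`, where `url` is the prediction endpoint for a given ID. Given the stated response time of roughly 340 seconds, a timeout well above that value seems a reasonable starting point.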
3. Complete Example
Here's a complete example that puts it all together, including error handling and result processing. This shows how to create a prediction and wait for the result in a production environment.
try:
    # Create prediction
    prediction_id = create_prediction()
    print(f"Prediction created: {prediction_id}")

    # Get result
    result = get_prediction(prediction_id)
    print(f"Output URL: {result['output']}")
    print(f"Processing time: {result['metrics']['predict_time']}s")
except Exception as e:
    print(f"Error: {e}")
Additional Information
- The API uses a two-step process: create prediction and poll for results
- Response time: ~340 seconds
- Rate limit: 60 requests/minute
- Concurrent requests: 10 maximum
- Use long-polling to check prediction status until completion
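The rate and concurrency limits listed above can also be respected on the client side. The sketch below is one possible approach, not an official client: it spaces request starts evenly (one per second for 60 requests/minute) and caps in-flight calls with a semaphore. The class `RequestGate` and its parameters are hypothetical, and it assumes that polling requests count toward the same limits.

```python
import threading
import time


class RequestGate:
    """Client-side guard: at most max_concurrent calls in flight and
    roughly per_minute call starts per minute, evenly spaced."""

    def __init__(self, per_minute=60, max_concurrent=10,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_interval = 60.0 / per_minute
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._next_start = 0.0
        self._clock = clock
        self._sleep = sleep

    def run(self, func, *args, **kwargs):
        """Run func(*args, **kwargs) once a slot and a start time are free."""
        with self._slots:
            # Reserve the next free start time under the lock.
            with self._lock:
                now = self._clock()
                start = max(now, self._next_start)
                self._next_start = start + self._min_interval
            wait = start - now
            if wait > 0:
                self._sleep(wait)
            return func(*args, **kwargs)
```

A single shared instance would wrap every HTTP call, for example `gate.run(requests.get, url, headers=HEADERS)`.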
Overview
Kling v2 Text to Video is a video generation model that converts text descriptions into short, high-quality video clips. It interprets descriptive prompts to produce realistic or stylized motion visuals according to the user's settings. Designed for versatility, it supports aspect ratio customization, motion scaling, and prompt control options for targeted video outcomes.
Technical Specifications
- Always craft clear and descriptive prompts. Avoid ambiguous language.
- Use short, action-based phrases for better motion interpretation.
- Limit duration values to 5 or 10 seconds for consistent video quality.
- Balance CFG Scale values between 0.5 and 0.8 for natural prompt adherence without losing creativity.
- When possible, pair prompts with Negative Prompts to suppress unwanted details.
- The Aspect Ratio setting directly influences video framing and should match the intended display platform.
- Complex scenes may require simplified phrasing for smoother video generation.
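The guidance above can be checked before a request is sent. The following is a minimal sketch, assuming the field names from the request example (`prompt`, `duration`, `aspect_ratio`, `cfg_scale`) and assuming that the three aspect ratios named in the Tips & Tricks section are the ones of interest; the API may accept other values. The function `check_input` is a hypothetical helper, and it reports issues without rejecting the request.

```python
ALLOWED_DURATIONS = (5, 10)
# Assumption: only the ratios mentioned in this document are listed here.
KNOWN_ASPECT_RATIOS = ("16:9", "9:16", "1:1")
RECOMMENDED_CFG_RANGE = (0.5, 0.8)


def check_input(params):
    """Return a list of human-readable issues; an empty list means none found."""
    issues = []
    if not str(params.get("prompt", "")).strip():
        issues.append("prompt is empty")
    if params.get("duration") not in ALLOWED_DURATIONS:
        issues.append("duration should be 5 or 10 seconds")
    if params.get("aspect_ratio") not in KNOWN_ASPECT_RATIOS:
        issues.append("aspect_ratio is not one of 16:9, 9:16, 1:1")
    cfg = params.get("cfg_scale")
    low, high = RECOMMENDED_CFG_RANGE
    if not isinstance(cfg, (int, float)) or not low <= cfg <= high:
        issues.append("cfg_scale is outside the recommended 0.5 to 0.8 range")
    return issues
```

For example, `check_input({"prompt": "A cat jumping on a table", "duration": 7, "aspect_ratio": "16:9", "cfg_scale": 0.5})` reports only the duration.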
Key Considerations
Kling v2 Text to Video does not support uploading images or videos as input sources.
Kling v2 Text to Video requires well-defined prompts for coherent motion sequences.
Overly complex or abstract prompts may result in less predictable outputs.
Video duration is strictly limited to either 5 or 10 seconds.
Aspect Ratio changes significantly affect composition; test different ratios for best framing.
CFG Scale trades creative freedom against strict prompt fidelity; values above 0.8 can overly restrict motion diversity.
Legal Information
By using the Kling v2 Text to Video model, you agree to:
- Kling Privacy
- Kling SERVICE AGREEMENT
Tips & Tricks
- Prompt: Keep language simple and direct. Use action verbs (e.g. "A cat jumping on a table"). Avoid vague terms.
- Duration:
  - Set to 5 seconds for quick, sharp motions.
  - Set to 10 seconds for sequences needing room to develop visually.
- Aspect Ratio:
  - Use 16:9 for wide scenes like landscapes or multi-subject action.
  - Use 9:16 for portrait or vertical video formats suitable for mobile content.
  - Use 1:1 for social media square posts or focused subject shots.
- CFG Scale:
  - Recommended values: 0.5 to 0.8
  - Lower values (0.5) allow more creative freedom and abstract interpretation.
  - Higher values (0.8) enforce stricter alignment with the prompt description.
- Negative Prompt: Always fill this when specific unwanted elements are to be avoided (e.g., “blurry, distorted, low quality”).
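The tips above map naturally onto a small helper that assembles the `input` object used in the request example. This is an illustrative sketch only: `build_input`, the `target` labels (`wide`, `vertical`, `square`), and the `long_sequence` flag are hypothetical names introduced here, while the dictionary keys follow the request example in step 1.

```python
# Hypothetical labels mapped to the aspect ratios described in the tips.
ASPECT_BY_TARGET = {"wide": "16:9", "vertical": "9:16", "square": "1:1"}


def build_input(prompt, target="wide", long_sequence=False, cfg_scale=0.5,
                negative_prompt="blurry, distorted, low quality"):
    """Assemble the model input dict from the choices described in the tips."""
    return {
        "prompt": prompt.strip(),
        "negative_prompt": negative_prompt,
        "aspect_ratio": ASPECT_BY_TARGET[target],
        # 5 seconds for quick motions, 10 for sequences that need room.
        "duration": 10 if long_sequence else 5,
        "cfg_scale": cfg_scale,
    }
```

The returned dictionary can be passed as the `"input"` value of the JSON body when creating a prediction.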
Capabilities
Generates animated video content from text instructions.
Supports dynamic motion rendering based on descriptive language.
Handles multiple scene types: nature, objects, actions, characters.
Adaptable aspect ratios for different display needs.
Can exclude unwanted elements via negative prompts.
Balances prompt faithfulness and creative output with CFG scaling.
What can I use it for?
Short promotional videos.
Concept visualization clips.
Quick content creation for social media.
Prototype video generation for design previews.
Visual storytelling based on text descriptions.
Character or scene animation based solely on narrative cues.
Things to be aware of
Test the same prompt across different Aspect Ratios to see framing impact.
Adjust CFG Scale incrementally to find the optimal creativity-control balance.
Use Negative Prompts to block artifacts like “blurry faces” or “oversaturated colors.”
Create action-based prompts (e.g. “a dog chasing a ball through a park”) for best motion results.
Combine abstract and literal terms (e.g. “a dreamy floating city at sunset”) for cinematic outputs.
Compare 5-second vs 10-second durations for pacing differences.
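The comparisons suggested above amount to a small parameter sweep over one prompt. A minimal sketch follows, assuming the input field names from the request example; `sweep_inputs` and its default value lists are hypothetical and chosen only to mirror the ranges mentioned in this document.

```python
from itertools import product


def sweep_inputs(prompt, aspect_ratios=("16:9", "9:16", "1:1"),
                 durations=(5, 10), cfg_scales=(0.5, 0.6, 0.7, 0.8),
                 negative_prompt=""):
    """Return one input dict per combination of the given settings."""
    return [
        {
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "aspect_ratio": aspect_ratio,
            "duration": duration,
            "cfg_scale": cfg_scale,
        }
        for aspect_ratio, duration, cfg_scale
        in product(aspect_ratios, durations, cfg_scales)
    ]
```

The defaults produce 24 combinations. At roughly 340 seconds per prediction, narrowing the lists to the settings under comparison keeps a sweep manageable.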
Limitations
No support for image or video input conditioning.
Maximum video duration is capped at 10 seconds.
Excessively detailed or long prompts might not translate well into coherent motion.
Limited control over fine-grain frame-by-frame content.
Higher CFG values may reduce creative variation.
Outputs may occasionally differ in style or detail intensity based on prompt phrasing.
Output Format: MP4