Pyramid Flow

pyramid-flow

Code of Pyramidal Flow Matching for Efficient Video Generative Modeling

A100 80GB
Fast Inference
REST API

Model Information

Response Time: ~276 sec
Status: Active
Version: 0.0.1
Updated: 13 days ago

Prerequisites

  • Create an API Key from the Eachlabs Console
  • Install the required dependencies for your chosen language (e.g., requests for Python)

API Integration Steps

1. Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.

import requests
import time

API_KEY = "YOUR_API_KEY"  # Replace with your API key
HEADERS = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def create_prediction():
    # Submit the model inputs; the response body carries the prediction ID
    response = requests.post(
        "https://api.eachlabs.ai/v1/prediction/",
        headers=HEADERS,
        json={
            "model": "pyramid-flow",
            "version": "0.0.1",
            "input": {
                "image": "your_file.image/jpeg",
                "prompt": "your prompt here",
                "duration": "5",
                "guidance_scale": "9",
                "frames_per_second": "8",
                "video_guidance_scale": "5"
            }
        }
    )
    prediction = response.json()
    # Anything other than "success" means the prediction was not created
    if prediction["status"] != "success":
        raise Exception(f"Prediction failed: {prediction}")
    return prediction["predictionID"]
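Hardcoding the key in source makes it easy to leak. A minimal sketch of building the same headers from an environment variable follows. The variable name EACHLABS_API_KEY and the helper build_headers are assumptions chosen for illustration, not names defined by the API.

```python
import os
from typing import Dict


def build_headers(env_var: str = "EACHLABS_API_KEY") -> Dict[str, str]:
    """Build the request headers from an environment variable.

    The variable name is a hypothetical choice; use whatever name
    your deployment already uses for the key.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    # Same two headers as the HEADERS dictionary in the example above
    return {"X-API-Key": api_key, "Content-Type": "application/json"}
```

With this in place, HEADERS = build_headers() can replace the literal dictionary, and a missing key fails early instead of producing a rejected request.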

2. Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to check repeatedly until you receive a success status; an error status means the prediction failed.

def get_prediction(prediction_id):
    while True:
        # Fetch the current state of the prediction
        result = requests.get(
            f"https://api.eachlabs.ai/v1/prediction/{prediction_id}",
            headers=HEADERS
        ).json()
        if result["status"] == "success":
            return result
        elif result["status"] == "error":
            raise Exception(f"Prediction failed: {result}")
        time.sleep(1)  # Wait before polling again
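The loop above never gives up, and at one request per second a single prediction uses the whole 60 requests/minute limit during a roughly 276 second run. The following is a sketch of a bounded variant, not the official client. The name wait_for_prediction, the 5 second interval, and the 600 second timeout are assumptions. The status fetch is passed in as a callable so the loop does not depend on a particular HTTP library.

```python
import time
from typing import Any, Callable, Dict


def wait_for_prediction(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_s: float = 600.0,   # assumed upper bound, about twice the typical run
    interval_s: float = 5.0,    # assumed interval, well under the rate limit
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status until it reports success, error, or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(f"Prediction failed: {result}")
        # Any other status is treated as "still running"
        if clock() >= deadline:
            raise TimeoutError(f"No result after {timeout_s} seconds")
        sleep(interval_s)
```

With the helpers from this page it could be called as wait_for_prediction(lambda: requests.get(f"https://api.eachlabs.ai/v1/prediction/{prediction_id}", headers=HEADERS).json()).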

3. Complete Example

Here's a complete example that puts it all together, including error handling and result processing. This shows how to create a prediction and wait for the result in a production environment.

try:
    # Create prediction
    prediction_id = create_prediction()
    print(f"Prediction created: {prediction_id}")
    # Get result
    result = get_prediction(prediction_id)
    print(f"Output URL: {result['output']}")
    print(f"Processing time: {result['metrics']['predict_time']}s")
except Exception as e:
    print(f"Error: {e}")
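The example prints the output URL but does not fetch the video. The sketch below saves it to disk with the standard library only. It assumes that result['output'] is a directly downloadable URL for the MP4 file; the function name save_output and the default file name are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path


def save_output(output_url: str, dest: str = "pyramid_flow_output.mp4") -> Path:
    """Stream the file at output_url to dest and return the destination path."""
    dest_path = Path(dest)
    # copyfileobj copies in chunks, so the whole video is never held in memory
    with urllib.request.urlopen(output_url, timeout=60) as response, dest_path.open("wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest_path
```

After get_prediction returns, save_output(result["output"]) would write the video next to the script, provided the assumption about the output field holds.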

Additional Information

  • The API uses a two-step process: create prediction and poll for results
  • Response time: ~276 seconds
  • Rate limit: 60 requests/minute
  • Concurrent requests: 10 maximum
  • Use long-polling to check prediction status until completion
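The limits above interact: every poll counts as a request, so the more predictions are in flight, the less often each one can be polled. The sketch below works through that arithmetic. It assumes that create and poll requests share one per-key budget, which the list does not state, and the number of requests reserved for create calls is an arbitrary illustrative choice.

```python
import math

RATE_LIMIT_PER_MIN = 60   # from the list above
MAX_CONCURRENT = 10       # from the list above
RESPONSE_TIME_S = 276     # approximate, from the list above


def min_poll_interval(in_flight: int, reserved_per_min: int = 10) -> float:
    """Smallest per-prediction polling interval (seconds) that stays under the limit.

    reserved_per_min is an assumed allowance kept free for create requests.
    """
    if not 1 <= in_flight <= MAX_CONCURRENT:
        raise ValueError(f"in_flight must be between 1 and {MAX_CONCURRENT}")
    usable = RATE_LIMIT_PER_MIN - reserved_per_min
    if usable <= 0:
        raise ValueError("reserved_per_min leaves no budget for polling")
    return in_flight * 60.0 / usable


def expected_polls(interval_s: float) -> int:
    """Approximate number of polls one prediction needs at the given interval."""
    return math.ceil(RESPONSE_TIME_S / interval_s)
```

With 10 predictions in flight and 10 requests per minute held back, each prediction can be polled every 12 seconds, which comes to about 23 polls over a 276 second run.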

Overview

The Pyramid Flow model is designed for efficient video generation, enabling both text-to-video and image-to-video synthesis. By leveraging pyramidal flow matching techniques, it captures temporal dynamics effectively, producing coherent and high-quality video outputs.

Technical Specifications

Pyramidal Flow Matching: Utilizes a hierarchical approach to model temporal dependencies efficiently.

Text-to-Video and Image-to-Video Generation: Supports both modalities for versatile content creation.

Temporal Dynamics Capture: Effectively models motion and scene transitions for realistic video outputs.

Key Considerations

The quality of the generated video is highly dependent on the clarity and relevance of the input prompts and images.

Longer durations may require more computational resources and could affect the coherence of the video.

Balancing the guidance scales is crucial to achieve the desired influence of text and image inputs on the final output.

Tips & Tricks

Prompts for Pyramid Flow: Craft detailed and specific descriptions to guide the video content effectively.

Image: Use high-quality images that closely relate to the desired video theme to enhance visual coherence.

Duration: For concise content, set durations between 1 and 5 seconds; for more elaborate scenes, consider 6 to 10 seconds.

Guidance Scale: A value between 5 and 10 is recommended to keep the output faithful to the prompt without overwhelming the model's creativity.

Video Guidance Scale: Setting this between 5 and 10 helps maintain consistency with the provided image while allowing for dynamic content generation.

Frames Per Second: 24 fps is the standard film frame rate and gives smooth motion; 8 fps, as used in the example request, renders fewer frames and gives choppier, stop-motion-like movement.
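The suggested ranges above can be checked before a request is sent. The sketch below builds the input dictionary and collects a note for each value outside those ranges. The helper build_input and the RECOMMENDED table are hypothetical, the ranges are the suggestions from these tips rather than limits enforced by the API, and treating image as optional is an assumption based on the model supporting text-to-video. Values are sent as strings to mirror the request example on this page.

```python
from typing import Dict, List, Optional, Tuple

# Suggested ranges taken from the tips above (not enforced API limits)
RECOMMENDED = {
    "duration": (1, 10),
    "guidance_scale": (5, 10),
    "video_guidance_scale": (5, 10),
}


def build_input(
    prompt: str,
    image: Optional[str] = None,
    duration: int = 5,
    guidance_scale: int = 9,
    frames_per_second: int = 8,
    video_guidance_scale: int = 5,
) -> Tuple[Dict[str, str], List[str]]:
    """Return the input payload plus notes about values outside the suggested ranges."""
    values = {
        "duration": duration,
        "guidance_scale": guidance_scale,
        "video_guidance_scale": video_guidance_scale,
    }
    notes = []
    for name, (low, high) in RECOMMENDED.items():
        if not low <= values[name] <= high:
            notes.append(f"{name}={values[name]} is outside the suggested {low}-{high} range")
    payload = {"prompt": prompt}
    if image is not None:
        payload["image"] = image  # assumed optional for text-to-video
    payload.update({name: str(value) for name, value in values.items()})
    payload["frames_per_second"] = str(frames_per_second)
    return payload, notes
```

For example, build_input("a boat at sunset", duration=12) returns a payload with "duration": "12" and a single note flagging the duration.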

Capabilities

Text-to-Video Generation with Pyramid Flow: Converts textual descriptions into dynamic video content.

Image-to-Video Generation: Transforms static images into animated sequences, guided by the provided image and optional text prompts.

Temporal Consistency: Maintains coherent motion and scene transitions across frames.

What can I use it for?

Content Creation with Pyramid Flow: Generate short videos for social media, marketing, or educational purposes based on textual or visual inputs.

Creative Projects: Explore artistic expressions by transforming images or text into animated visuals.

Prototyping: Quickly visualize concepts or storyboards without the need for extensive video production resources.

Things to be aware of

Experiment with different combinations of text prompts and images to discover unique video outputs.

Adjust the guidance scales to see how the influence of text and image inputs affects the generated content.

Vary the duration and frames per second to create videos with different pacing and styles.

Limitations

Pyramid Flow may struggle with highly complex scenes or prompts that require intricate temporal dynamics.

There is a possibility of artifacts or inconsistencies in longer videos due to the challenges in maintaining coherence over extended durations.

The generated videos are limited by the diversity and quality of the data Pyramid Flow was trained on.

Output Format: MP4

Related AI Models

  • LCM animation Time Lapse Generator (lcm-animation): Text to Video
  • LTX-Video (ltx-video): Text to Video
  • Wan 2.1-1.3B (wan-2-1-1-3b): Text to Video
  • Mochi-1 (mochi-1): Text to Video