SANA
Sana is a text-to-image framework that can efficiently generate images at resolutions up to 4096 × 4096.
Avg Run Time: 1.000s
Model Slug: sana

API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
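The polling step can be sketched as a small loop. This is a minimal illustration rather than the platform's SDK: `fetch_status` is a hypothetical caller-supplied function that performs the GET request for a prediction ID and returns the parsed JSON, and the `status` field name and the `error`/`failed` values are assumptions, since this page only mentions a success status.

```python
import time
from typing import Any, Callable, Dict


def wait_for_prediction(
    fetch_status: Callable[[str], Dict[str, Any]],
    prediction_id: str,
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status repeatedly until the prediction succeeds, fails, or times out."""
    waited = 0.0
    while True:
        result = fetch_status(prediction_id)
        status = result.get("status")  # assumed field name
        if status == "success":
            return result
        if status in ("error", "failed"):  # assumed failure values
            raise RuntimeError(
                f"prediction {prediction_id} ended with status {status!r}"
            )
        if waited >= timeout_s:
            raise TimeoutError(
                f"prediction {prediction_id} not ready after {timeout_s}s"
            )
        sleep(interval_s)
        waited += interval_s
```

Passing the HTTP call in as a function keeps the loop independent of any particular client library, and the `sleep` parameter makes it easy to exercise without real delays.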
Readme
Overview
Sana by NVIDIA is designed for generating high-quality images based on detailed textual prompts. With a focus on flexibility and precision, it supports advanced customization through adjustable parameters. Whether you're creating artistic visuals, concept art, or professional imagery, this model provides the tools to bring your ideas to life.
Technical Specifications
- Text-to-Image Capability: Generates realistic and artistic images from detailed textual descriptions.
- Negative Prompting: Allows precise control over unwanted elements in the output.
- Configurable Parameters: Provides extensive options to fine-tune outputs based on user preferences.
Key Considerations
- Resolution and Performance: Higher resolutions (width and height) increase processing time; balance quality with performance needs.
- Prompt Length: Overly long prompts may dilute the model’s focus. Stick to succinct, targeted descriptions.
- Guidance Scale Balance: Excessive values for guidance_scale or pag_guidance_scale might lead to unnatural or overemphasized elements.
- Seed for Reproducibility: Use the same seed value to regenerate identical results.
Tips & Tricks
- Refine Your Prompt: Test variations of your description to discover the best phrasing for your desired output.
- Negative Prompt Efficiency: Use negative_prompt to filter out undesired elements and focus on key details.
- Guidance Scale: Start with moderate values of 8–12 for balanced outputs. For more creative or artistic results, experiment with higher values such as 15–18. Use lower values (e.g., 5–7) for a subtler influence on the output. Avoid extreme values unless you want a specific effect, as they may lead to unnatural results.
- Inference Steps: For quick previews, use 10–20 steps to get a sense of the output without long processing times. For detailed, high-quality outputs, use 30–50 steps. Avoid going beyond 60, as the improvements often diminish while processing time increases significantly.
- Seed Control: Reuse specific seed values to reproduce consistent results for iterative projects.
- PAG Guidance Scale: Use values of 10–14 to subtly enhance the structure or style of the output. For stronger stylistic influence, increase to 15–18, and for a lighter touch, experiment with 6–9. Avoid values below 5, as they may not have a noticeable impact on the results. Combining guidance_scale at 10–12 and pag_guidance_scale at 12–15 often provides a harmonious balance between adherence to the prompt and artistic styling.
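The ranges in the tips above can be turned into a quick pre-flight check. The sketch below is only an illustration: the parameter names come from this page, but the helper `range_warnings` and the idea of treating the suggested ranges as soft bounds are assumptions, not limits enforced by the API.

```python
from typing import Any, Dict, List, Tuple

# Soft bounds taken from the tips above (guidance only, not API limits).
SUGGESTED_RANGES: Dict[str, Tuple[float, float]] = {
    "guidance_scale": (5, 18),
    "pag_guidance_scale": (5, 18),
    "num_inference_steps": (10, 60),
}


def range_warnings(inputs: Dict[str, Any]) -> List[str]:
    """Return one message per parameter that falls outside its suggested range."""
    warnings = []
    for name, (low, high) in SUGGESTED_RANGES.items():
        if name not in inputs:
            continue  # parameter not set; nothing to check
        value = inputs[name]
        if value < low or value > high:
            warnings.append(
                f"{name}={value} is outside the suggested {low}-{high} range"
            )
    return warnings


# A "balanced" configuration following the combined recommendation.
balanced = {
    "prompt": "a bustling city at night with neon signs and rain-soaked streets",
    "guidance_scale": 10,
    "pag_guidance_scale": 13,
    "num_inference_steps": 30,
    "seed": 42,
}
print(range_warnings(balanced))
```

Returning messages instead of raising keeps extreme values possible for the cases where a specific effect is wanted.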
Capabilities
- Creates stunning, high-resolution images from textual descriptions.
- Supports detailed customization through multiple adjustable parameters.
- Enables repeatable results using the seed parameter.
What Can I Use It For?
- Artistic Creations: Generate concept art, illustrations, or unique designs.
- Professional Projects: Design marketing visuals, product mockups, or presentation materials.
- Creative Exploration: Experiment with prompts to explore new artistic styles and ideas.
Things to Be Aware Of
- Detailed Scenes: Describe intricate settings (e.g., "a bustling city at night with neon signs and rain-soaked streets").
- Negative Refinements: Use negative_prompt to avoid unwanted elements (e.g., "no haze, no people").
- High-Quality Outputs: Increase num_inference_steps for sharper, more polished images.
- Consistent Themes: Reuse seed values to maintain a consistent style across multiple outputs.
- Creative Styles: Experiment with guidance_scale to explore different levels of prompt adherence and artistic influence.
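One way to combine the seed and guidance_scale points above is to hold the seed fixed and vary only guidance_scale, so that differences between outputs come from that one setting. The helper name `guidance_sweep` is hypothetical, and the input dictionary shape is an assumption based on the parameter names used on this page.

```python
from typing import Any, Dict, List


def guidance_sweep(
    base_inputs: Dict[str, Any], scales: List[float]
) -> List[Dict[str, Any]]:
    """Build one input dict per guidance_scale value, leaving everything else unchanged."""
    return [{**base_inputs, "guidance_scale": scale} for scale in scales]


base = {
    "prompt": "a bustling city at night with neon signs and rain-soaked streets",
    "negative_prompt": "haze, people",
    "num_inference_steps": 30,
    "seed": 1234,  # same seed in every variant
}

variants = guidance_sweep(base, [6, 10, 16])
for variant in variants:
    print(variant["seed"], variant["guidance_scale"])
```

Each resulting dictionary can then be submitted as a separate prediction.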
Limitations
- Abstract Concepts: May struggle to interpret highly abstract or ambiguous prompts.
- Processing Time: High-resolution images or extensive steps can lead to longer generation times.
- Prompt Sensitivity: Minor changes in wording can significantly impact results.
Output Format: PNG
Pricing
Pricing Detail
This model runs at a cost of $0.001677 per second.
The average execution time is 1 second, but this may vary depending on your input data.
The average cost per run is $0.001677.
Pricing Type: Execution Time
Cost Per Second means the total cost is calculated based on how long the model runs. Instead of paying a fixed fee per run, you are charged for every second the model is actively processing. This pricing method provides flexibility, especially for models with variable execution times, because you only pay for the actual time used.
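As a worked example of execution-time pricing, the cost of a batch is the per-second rate multiplied by the run time and the number of runs. The rate below is the one quoted on this page; the function name `estimate_cost` and the example durations are illustrative assumptions.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.001677")  # USD per second, as quoted above


def estimate_cost(seconds_per_run: float, runs: int = 1) -> Decimal:
    """Estimate total cost in USD for `runs` predictions of a given duration."""
    return RATE_PER_SECOND * Decimal(str(seconds_per_run)) * runs


print(estimate_cost(1))         # one run at the average 1-second execution time
print(estimate_cost(2.5, 100))  # 100 runs that each take 2.5 seconds
```

`Decimal` is used so that small per-second amounts do not pick up binary floating-point rounding noise.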
