Mureka · Create Speech

Object·mureka·by Mureka

Mureka Create Speech is a text-to-speech model that converts written input into natural-sounding spoken audio.

Runtime (p50): 10s
Estimated price: $0.001361 / sec
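As a rough illustration of the listed rate, the sketch below multiplies a number of seconds by the per-second price. It assumes billing is linear in billed seconds; the page does not say whether those seconds are runtime or audio length, and `estimate_cost` is a hypothetical helper name.

```sh
# Estimated price per second, as listed on this page.
PRICE_PER_SEC=0.001361

# estimate_cost SECONDS: print the estimated cost in USD with six decimals.
# Assumption: cost scales linearly with the number of billed seconds.
estimate_cost() {
  awk -v secs="$1" -v price="$PRICE_PER_SEC" \
    'BEGIN { printf "%.6f\n", secs * price }'
}
```

For example, `estimate_cost 10` covers the p50 runtime of 10s listed above.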
Call the API
prediction.sh
curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "mureka-create-speech",
    "version": "0.0.1",
    "input": {
        "voice": "Victoria",
        "text": "Today is about progress, not perfection. Every small step we take, every idea we try, brings us closer to something better. What matters most is staying curious, staying brave, and continuing to build, even when the path isn’t clear yet."
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
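When the voice or text comes from a variable, hand-editing the JSON body becomes error-prone because quotes in the text must be escaped. The sketch below wraps the request above in small helpers. The function names `json_escape`, `build_payload`, and `create_speech` are hypothetical, and the escaping covers only backslashes and double quotes.

```sh
# json_escape STRING: escape backslashes and double quotes for a JSON string.
# Limitation: newlines and other control characters are not handled here.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload VOICE TEXT: print the same request body as the example above,
# with the voice and text filled in from the arguments.
build_payload() {
  printf '{"model":"mureka-create-speech","version":"0.0.1","input":{"voice":"%s","text":"%s"},"webhook_url":""}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# create_speech VOICE TEXT: send the request (needs EACHLABS_API_KEY to be set).
create_speech() {
  curl -sS -X POST \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -H "Content-Type: application/json" \
    --data "$(build_payload "$1" "$2")" \
    https://api.eachlabs.ai/v1/prediction/
}
```

A call such as `create_speech Victoria "Hello there"` would then send the request; the response format is not described on this page, so it is left unparsed here.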
Documentation (8 sections)
  • Overview

    mureka-create-speech — Music Generation AI Model

    Developed by Mureka as part of the mureka family, mureka-create-speech is a text-to-speech model that turns written text into natural-sounding, expressive spoken audio for music production and content creation. Within Mureka's music-generation lineup, it stands out by bringing speech synthesis into musical compositions, so vocal tracks can be produced from simple prompts. It generates realistic vocals in 10+ languages, which suits creators who need professional audio without a recording studio.

  • Capabilities
    • Generates natural-sounding speech from text inputs with high fidelity
    • Supports voice cloning from user-uploaded MP3 samples for custom voices
    • Handles multi-speaker conversations for dynamic dialogue synthesis
    • Lists and manages voices (GET speech/voices), including cloned ones
    • Retrieves (GET speech) and deletes (DELETE speech) generated recordings for workflow efficiency
    • Has no rate limits, which allows unlimited parallel speech generations
    • Integrates with broader audio tools, supporting extended character limits for complex prompts
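Because the list above states that parallel generations are not rate limited, several requests can be issued at once. The following is a minimal sketch of a fan-out in POSIX sh using background jobs; `run_parallel` is a hypothetical helper, and the command it runs is supplied by the caller.

```sh
# run_parallel CMD ARG...: run "CMD ARG" once per ARG as a background job,
# wait for every job, and return the number of jobs that failed
# (bounded by the shell's 0-255 exit status range).
run_parallel() {
  cmd=$1
  shift
  pids=""
  for arg in "$@"; do
    "$cmd" "$arg" &
    pids="$pids $!"
  done
  failed=0
  for pid in $pids; do
    # wait PID returns that job's exit status.
    wait "$pid" || failed=$((failed + 1))
  done
  return "$failed"
}
```

In practice `CMD` would be a function that sends one speech request per text line, for example a wrapper around the curl call shown under "Call the API".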
  • Use cases

    Use Cases for mureka-create-speech

    Music producers building tracks with custom vocals can input lyrics tagged like "[Verse] Whispered intro in French about city nights" to generate expressive spoken elements that blend into indie electronica, saving hours on vocal recording.

    Content creators targeting global audiences use voice cloning to dub podcasts into Japanese or Spanish, maintaining the original speaker's timbre while adding background melodies via Mureka's platform.

    Developers integrating the Mureka music-generation API into apps can feed in user text prompts for dynamic audio narration in games, such as "Energetic English voiceover for level-up achievement with upbeat synths," and receive editable stems for real-time customization.

    Marketers crafting ads leverage multi-voice support for duets, such as generating a dialogue between two cloned voices in a 30-second jingle, ensuring natural flow across languages for international campaigns.

  • Tips & tricks

    How to Use mureka-create-speech on Eachlabs

    Access mureka-create-speech through the Eachlabs Playground for instant testing, the API for scalable integrations, or the SDK for app development. Provide text prompts, lyrics, a language selection, or voice cloning references to generate high-fidelity WAV/MP3 audio with stems in seconds. Eachlabs delivers professional-grade outputs optimized for music workflows.

  • Technical spec

    What Sets mureka-create-speech Apart

    mureka-create-speech differentiates itself in the music-generation landscape through its advanced vocal synthesis tied to Mureka's MusiCoT architecture, which ensures coherent speech integration into songs rather than isolated audio clips. This enables producers to craft full tracks where spoken elements align perfectly with melodies and rhythms, a precision many competitors lack.

    • Voice Cloning and Expressive Modeling: Clone any voice with consent verification to create personalized singers, producing natural performances beyond robotic outputs; this allows consistent branding for music projects or custom AI vocalists in tracks.
    • Multi-Language Vocal Support: Handles 10+ languages including English, Chinese, Japanese, and more with high fidelity; users can generate global content like multilingual song intros without quality loss.
    • Stem Export Integration: Outputs separated vocal stems compatible with DAWs for professional mixing; this streamlines workflows for Mureka music-generation API users who edit speech in complex arrangements.

    Technical specs include support for MP3/WAV formats, generation times under a minute, and integration with multi-modal inputs like lyrics or humming for enhanced speech-to-music flows.

  • Things to be aware of
    • Experimental features: Initial TTS endpoints added in August 2025, with rapid updates like custom voice uploads in January 2025
    • Known quirks: Related music endpoints now default to V7.5, suggesting similar model evolution for speech consistency
    • Performance considerations: Unlimited parallel generations make it suitable for high-volume use without throttling
    • Resource requirements: MP3 uploads for voices; standard API handling for generations
    • Consistency factors: Improved character limits (e.g., 1000 for prompts) reduce truncation in longer texts
    • Positive user feedback themes: Appreciation for no rate limits and voice cloning ease in API changelogs and docs
    • Common concerns: Ensure proper error handling (e.g., 429 in music endpoints, though not for speech)
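Since the list above cites a character limit (for example, 1000 for prompts), a client can check text length before sending it. This is a minimal sketch; whether the 1000-character figure applies to the `text` input of this model is an assumption, and `text_length` and `check_text` are hypothetical helper names.

```sh
# Assumed limit, taken from the "1000 for prompts" example in the list above.
MAX_CHARS=1000

# text_length TEXT: print the number of characters in TEXT.
# wc -m counts characters according to the current locale.
text_length() {
  echo $(( $(printf '%s' "$1" | wc -m) ))
}

# check_text TEXT: succeed if TEXT fits within MAX_CHARS,
# otherwise print a warning on stderr and fail.
check_text() {
  len=$(text_length "$1")
  if [ "$len" -gt "$MAX_CHARS" ]; then
    echo "text is $len characters; limit is $MAX_CHARS" >&2
    return 1
  fi
}
```

Running `check_text "$text"` before a request lets the caller shorten or split the text rather than risk truncation.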
  • Key considerations
    • Ensure audio samples for voice cloning are high-quality MP3 files to achieve optimal voice fidelity
    • Best practices include using the POST speech endpoint for generation and GET speech/voices to manage cloned voices effectively
    • Common pitfalls: Overlooking the documented 429 error handling in related endpoints; speech generation itself has no rate limits
    • Quality vs speed trade-offs: Model supports parallel generations without limits, prioritizing accessibility over strict throttling
    • Prompt engineering tips: Leverage multi-speaker conversation prompts for natural dialogues; specify voice details via cloned voices list
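For the related endpoints that can return 429, one common approach is to retry with a growing delay. The sketch below is not taken from Mureka or Eachlabs documentation: `retry_on_429` and the `RETRY_BASE_DELAY` variable are hypothetical, and the wrapped command is assumed to print only an HTTP status code.

```sh
# retry_on_429 MAX_TRIES CMD [ARG...]: run CMD, which must print an HTTP
# status code. Retry only when the status is 429, doubling the delay each
# time (RETRY_BASE_DELAY seconds first, default 1). Print the final status
# and succeed only if it is 2xx.
retry_on_429() {
  max_tries=$1
  shift
  try=1
  delay=${RETRY_BASE_DELAY:-1}
  while :; do
    status=$("$@") || true
    if [ "$status" != "429" ] || [ "$try" -ge "$max_tries" ]; then
      break
    fi
    sleep "$delay"
    delay=$((delay * 2))
    try=$((try + 1))
  done
  echo "$status"
  case $status in
    2??) return 0 ;;
    *) return 1 ;;
  esac
}
```

A command that prints only the status code can be built with curl's `-o /dev/null -w '%{http_code}'` options.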
  • Limitations
    • Specific architectural details and parameter counts not publicly available, limiting deep technical analysis
    • Primarily API-focused, with capabilities tied to endpoint usage rather than standalone model access
    • Lacks detailed benchmarks or comparisons in available sources, focusing instead on feature updates

FAQ

About Mureka · Create Speech

What is Mureka Create Speech?

Mureka Create Speech is an AI voice synthesis model by Mureka that converts text into expressive, natural-sounding speech. It offers voice customization options including tone, pace, and style, making it suitable for narration, voiceovers, and audio content production.