Eachlabs | AI Workflows for app builders
mureka-generate-track-stem-generation

MUREKA

Generates individual audio stems from any track, including vocals, instrumentals, and specific instruments such as drums, bass, guitar, strings, and more.

Avg Run Time: 60.000s

Model Slug: mureka-generate-track-stem-generation

Track generation: $0.09 per track (Mureka V8 list price). Applies to all 12 generate_type variants. Cost per execution: $0.09

API & SDK

Snippets reference the EACHLABS_API_KEY environment variable. Copy your real API key from /api-keys and set it locally before running.

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
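A minimal sketch of the create step in Python, using only the standard library. The endpoint URL, the `X-API-Key` header name, and the `model`/`input` body layout are assumptions for illustration and are not confirmed by this page; take the real values from the generated snippets for this model.

```python
import json
import os
import urllib.request

# Assumed endpoint: replace with the URL shown in the model page's snippet.
PREDICTION_URL = "https://api.eachlabs.ai/v1/prediction/"
MODEL_SLUG = "mureka-generate-track-stem-generation"


def build_create_request(inputs: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a prediction."""
    # Assumed body layout: model slug plus a dict of model inputs.
    body = json.dumps({"model": MODEL_SLUG, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        PREDICTION_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # assumed header name
        },
    )


def create_prediction(inputs: dict) -> dict:
    """Send the request and return the decoded JSON response."""
    api_key = os.environ["EACHLABS_API_KEY"]
    request = build_create_request(inputs, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating `build_create_request` from `create_prediction` lets you inspect the payload before anything is sent over the network.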

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. Predictions run asynchronously, so check repeatedly until you receive a success status.
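The polling loop could look like the sketch below. It is deliberately transport-agnostic: `fetch` is a hypothetical callable that you supply to perform the GET request and return the decoded JSON. The `status` field name and the failure values `error` and `failed` are assumptions; only the `success` status comes from the description above.

```python
import time
from typing import Callable


def wait_for_result(
    fetch: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() repeatedly until the prediction reaches a final status."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        status = result.get("status")  # assumed field name
        if status == "success":
            return result
        if status in ("error", "failed"):  # assumed failure values
            raise RuntimeError(f"prediction failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not finish in time")
        sleep(interval_s)
```

With an average run time of about 60 seconds, an interval of a few seconds and a timeout of several minutes is a reasonable starting point.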

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

Mureka | Generate Track | Stem Generation is a specialized AI model that isolates and generates individual musical stems (vocals, instrumentals, or specific instrument tracks) from existing audio sources or complete songs. Built on Mureka's V8 architecture, it addresses a common problem for music producers, content creators, and sound engineers: extracting, remixing, or regenerating a specific instrumental or vocal component without access to the original multitrack files. Unlike general-purpose music generation tools, it works at the stem level, producing Vocals, Instrumental, Drums, Bass, Guitar, Keyboard, Percussion, Strings, Synth, FX, Brass, or Woodwinds as isolated tracks. This suits remixing, music production, podcast audio enhancement, and adaptive soundtrack creation, where control over individual tracks matters.

Technical Specifications
  • Input formats: Audio file upload (via upload_audio_id) or reference song ID (via song_id from Mureka's song-generate endpoint)
  • Output: Individual stem files in standard audio formats
  • Processing time: 30–90 seconds per stem generation call
  • Stem types supported: Vocals, Instrumental, Drums, Bass, Guitar, Keyboard, Percussion, Strings, Synth, FX, Brass, Woodwinds
  • Required parameters: Exactly one of song_id or upload_audio_id (never both, never neither); a style prompt; lyrics (for Vocals only)
  • Optional parameters: vocal_gender; generate_start and generate_end (milliseconds) to narrow the musical context window
  • Model version: V8 architecture
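The parameter rules above can be checked client-side before a request is sent. The sketch below is one way to do that. The field names `prompt` and `lyrics`, the capitalized stem values passed as `generate_type`, and the helper `build_stem_input` itself are hypothetical; the other parameter names are taken from the list above.

```python
from typing import Optional

# Stem types listed in the specifications above.
STEM_TYPES = frozenset({
    "Vocals", "Instrumental", "Drums", "Bass", "Guitar", "Keyboard",
    "Percussion", "Strings", "Synth", "FX", "Brass", "Woodwinds",
})


def build_stem_input(
    generate_type: str,
    prompt: str,
    *,
    song_id: Optional[str] = None,
    upload_audio_id: Optional[str] = None,
    lyrics: Optional[str] = None,
    vocal_gender: Optional[str] = None,
    generate_start: Optional[int] = None,
    generate_end: Optional[int] = None,
) -> dict:
    """Validate the documented parameter rules and return an input dict."""
    if generate_type not in STEM_TYPES:
        raise ValueError(f"unsupported stem type: {generate_type!r}")
    # Exactly one source: song_id XOR upload_audio_id.
    if (song_id is None) == (upload_audio_id is None):
        raise ValueError("provide exactly one of song_id or upload_audio_id")
    if not prompt.strip():
        raise ValueError("a style prompt is required")
    if generate_type == "Vocals" and not (lyrics and lyrics.strip()):
        raise ValueError("lyrics are required for Vocals")
    # Sanity check added for this sketch; not a documented rule.
    if (generate_start is not None and generate_end is not None
            and generate_start >= generate_end):
        raise ValueError("generate_start must be less than generate_end")
    fields = {
        "generate_type": generate_type,
        "prompt": prompt,  # hypothetical field name
        "song_id": song_id,
        "upload_audio_id": upload_audio_id,
        "lyrics": lyrics,  # hypothetical field name
        "vocal_gender": vocal_gender,
        "generate_start": generate_start,
        "generate_end": generate_end,
    }
    # Drop unset optional fields so they are not sent as null.
    return {key: value for key, value in fields.items() if value is not None}
```

Failing fast on the client avoids spending a 30–90 second call on a request that would be rejected anyway.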

Key Considerations

Mureka | Generate Track | Stem Generation needs a source: either an uploaded audio file or a song previously generated in Mureka's ecosystem. It cannot generate stems from scratch. Processing time scales with audio length and stem complexity, typically 30–90 seconds per call, so batch operations should budget for cumulative latency. The model performs best with clear style prompts that match the musical context of the source material. For vocal stems, lyrics are mandatory and should match the vocal line you want generated. The optional time-window parameters (generate_start/generate_end) focus the model on a specific song section, which can reduce processing overhead and improve precision on targeted passages.
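As a worked example of cumulative latency, the sketch below bounds the total wait for a batch, assuming the calls run one after another and each falls within the stated 30–90 second range. The helper name is hypothetical, and real timings will vary with audio length.

```python
def batch_latency_bounds(num_calls: int, min_s: int = 30, max_s: int = 90) -> tuple:
    """Return (best_case, worst_case) total seconds for sequential calls."""
    if num_calls < 0:
        raise ValueError("num_calls must be non-negative")
    return num_calls * min_s, num_calls * max_s


# All 12 stem types from one source, generated sequentially:
low, high = batch_latency_bounds(12)
print(f"{low / 60:.0f} to {high / 60:.0f} minutes")  # prints "6 to 18 minutes"
```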

Tips & Tricks

To get the best output from Mureka | Generate Track | Stem Generation, write style prompts that describe both the instrumental character and the emotional tone: "bright, energetic pop drums with tight snare" rather than just "drums." When generating vocals, provide lyrics that match the melodic phrasing of your reference track; misaligned lyrics often produce timing artifacts. Use generate_start and generate_end deliberately: isolate 8–16 bar sections for high-precision stem extraction, then stitch the results together for full-track output. For instrumental stems, specify instrumentation details: "warm, vintage bass with slight saturation" yields more consistent results than a generic request. Test with short audio clips (15–30 seconds) first to validate style prompts before committing to full-track processing. Example prompts: "isolated vocal stem, intimate and breathy, indie-pop style," "punchy kick drum with sub-bass presence, electronic production," and "fingerpicked acoustic guitar, warm and resonant."
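To turn "8–16 bar sections" into generate_start/generate_end values, the bar length has to be converted to milliseconds. The sketch below does this under simplifying assumptions that are not part of the source: a constant tempo, a fixed time signature, and a track whose first downbeat is at 0 ms. The helper name is hypothetical.

```python
def bar_windows_ms(total_ms: int, bpm: float,
                   bars_per_window: int = 8, beats_per_bar: int = 4) -> list:
    """Split [0, total_ms) into consecutive (start, end) windows in milliseconds."""
    if min(total_ms, bpm, bars_per_window, beats_per_bar) <= 0:
        raise ValueError("all arguments must be positive")
    # One beat lasts 60000 / bpm milliseconds.
    window_ms = max(1, round(bars_per_window * beats_per_bar * 60_000 / bpm))
    windows = []
    start = 0
    while start < total_ms:
        end = min(start + window_ms, total_ms)  # last window may be shorter
        windows.append((start, end))
        start = end
    return windows


# A 40-second track at 120 BPM in 4/4, split into 8-bar windows:
print(bar_windows_ms(40_000, 120))
# prints [(0, 16000), (16000, 32000), (32000, 40000)]
```

Each tuple can be passed as generate_start and generate_end for one call; the results are then stitched back together in order.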

Capabilities
  • Generate isolated vocal stems with customizable vocal gender and lyrical content
  • Extract or regenerate individual instrument stems (Drums, Bass, Guitar, Keyboard, Percussion, Strings, Synth, FX, Brass, Woodwinds) from reference audio
  • Apply style prompts to shape the sonic character of generated stems
  • Narrow processing focus using millisecond-precision time windows (generate_start/generate_end)
  • Accept audio input via direct file upload or reference to previously generated Mureka songs
  • Maintain commercial rights to all generated stems for production and distribution
  • Support multitrack workflows by generating multiple stems sequentially from a single source

What Can I Use It For?

Use Cases for Mureka | Generate Track | Stem Generation

Music Producer Remixing: A producer receives a finished master but needs individual stems for remixing. Using Mureka | Generate Track | Stem Generation, they upload the master and regenerate isolated drum, bass, and vocal stems with custom style adjustments—"tight, modern trap drums" or "warm, vintage bass"—enabling creative reinterpretation without original session files.

Podcast and Content Creator Audio Enhancement: A podcaster wants to isolate dialogue from background music in an intro segment. They use Mureka | Generate Track | Stem Generation to extract the vocal stem, then layer it with fresh background music or apply targeted EQ, improving audio clarity and professional polish.

Adaptive Soundtrack Production: A game developer needs dynamic music that responds to gameplay intensity. They generate multiple stem variations—energetic vs. ambient versions of drums, bass, and strings—from a single reference track using different style prompts, then layer them conditionally during runtime.

Independent Artist Collaboration: An independent artist collaborates with a remote producer who only has access to a rough mix. Using Mureka | Generate Track | Stem Generation with specific lyrics and vocal gender parameters, they regenerate a professional vocal stem that the producer can integrate into their arrangement, accelerating the production timeline.

Things to Be Aware Of

Mureka | Generate Track | Stem Generation requires a valid source, either an uploaded audio file or a song_id from a prior Mureka generation. The request fails if neither is provided or if both are specified. Processing time varies significantly with audio length and stem complexity; plan for 30–90 seconds of latency per call in production workflows. Output quality depends heavily on the clarity and specificity of your style prompt; vague or contradictory descriptions may produce inconsistent results. Vocal stem generation requires lyrics; omitting them causes the request to fail. The time-window parameters (generate_start/generate_end) are optional but recommended for precision work. A window that is too narrow may exclude musical context the model needs to generate a coherent stem.

Limitations

Mureka | Generate Track | Stem Generation cannot generate stems without a source reference—it is not a generative model for creating music from text alone. The model may struggle with heavily compressed or low-fidelity source audio, as stem isolation relies on sufficient tonal separation in the input. Vocal stems require explicit lyrics; the model cannot infer lyrics from instrumental-only sources. Processing time scales with audio duration, making real-time stem generation impractical for very long tracks. The model's stem separation quality depends on the distinctiveness of the target instrument in the source mix; heavily layered or heavily processed sources may yield less precise isolation.

Pricing

Pricing Type: Dynamic

Track generation: $0.09 per track (Mureka V8 list). Applies to all 12 generate_type variants.
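For budgeting, the flat rate can be multiplied out directly. The sketch below assumes each stem generation call is billed as one track at the listed price; the helper name is hypothetical. `Decimal` is used so that currency arithmetic stays exact.

```python
from decimal import Decimal

PRICE_PER_TRACK = Decimal("0.09")  # USD, from the price listed above


def estimate_cost(num_tracks: int) -> Decimal:
    """Return the estimated total in USD for num_tracks executions."""
    if num_tracks < 0:
        raise ValueError("num_tracks must be non-negative")
    return PRICE_PER_TRACK * num_tracks


# One call for each of the 12 generate_type variants:
print(estimate_cost(12))  # prints 1.08
```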