How do I use MiniMax Music v2 via API?

MiniMax Music v2 is accessible via the eachlabs unified API. Provide a text prompt describing the genre, mood, and vocal style; the model returns a generated music track. Billing is pay-as-you-go through eachlabs; no MiniMax account is required.

What is MiniMax Music v2 best suited for?

MiniMax Music v2 is best suited for content creators, social media producers, and developers needing original, royalty-free music generation with vocal elements. It works particularly well for short-form video soundtracks, brand audio identity creation, and personalized music applications.

Example input

prompt: "indie acoustic, wistful, reflective mood, quiet neighborhood streets, autumn chill, soft guitar echoes, warm lamplight through windows"
lyrics_prompt: "[verse] Fallen leaves gather by my worn-out shoes The dusk hums low, like it is humming the blues Breath in the cold, memories stirring slow Footsteps echo places I used to know [pre-chorus] A quiet heart, but the world still turns Every window glows, while my longing burns [chorus] I push open the café door, warmth meets my skin Soft chatter, soft music, a world within Steam rising gently, like thoughts I can not phrase I sit by the window, lost in a foggy haze"
audio_setting:
  bitrate: 32000
  format: "mp3"
  sample_rate: 16000
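
To give a feel for what the bitrate setting implies, here is a minimal sketch that estimates output file size. It assumes the `bitrate` value is in bits per second and that the encoding is constant-bitrate; neither is confirmed on this page, and container overhead is ignored. The function name `estimate_audio_size_bytes` is hypothetical.

```python
def estimate_audio_size_bytes(bitrate_bps, duration_seconds):
    """Rough size of an audio file, assuming a constant bitrate.

    Assumption: bitrate_bps is in bits per second (not confirmed by the
    page). Container/metadata overhead is ignored.
    """
    if bitrate_bps <= 0:
        raise ValueError("bitrate_bps must be positive")
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    # Bits per second times seconds gives total bits; divide by 8 for bytes.
    return bitrate_bps * duration_seconds / 8
```

Under these assumptions, a 5-minute track (300 seconds) at 32000 bit/s comes to about 1.2 MB.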

MiniMax Music v2

Audio · minimax-music · by MiniMax

MiniMax Music 2.0 transforms text prompts into high-fidelity, diverse musical compositions, blending advanced AI composition, sound design, and arrangement to deliver studio-quality tracks in about a minute.

Runtime (p50): 1m
Estimated price: $0.03

Call the API

prediction.sh

curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "minimax-music-v2",
    "version": "0.0.1",
    "input": {
        "prompt": "indie acoustic, wistful, reflective mood, quiet neighborhood streets, autumn chill, soft guitar echoes, warm lamplight through windows",
        "lyrics_prompt": "[verse]\nFallen leaves gather by my worn-out shoes\nThe dusk hums low, like it is humming the blues\nBreath in the cold, memories stirring slow\nFootsteps echo places I used to know\n\n[pre-chorus]\nA quiet heart, but the world still turns\nEvery window glows, while my longing burns\n\n[chorus]\nI push open the café door, warmth meets my skin\nSoft chatter, soft music, a world within\nSteam rising gently, like thoughts I can not phrase\nI sit by the window, lost in a foggy haze",
        "audio_setting": {
            "bitrate": 32000,
            "format": "mp3",
            "sample_rate": 16000
        }
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
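
The same request can be sketched in Python using only the standard library. The endpoint, headers, and payload fields mirror the curl call above; the function name `build_prediction_request` and its default arguments are illustrative choices, not part of any Eachlabs SDK. The sketch only builds the request object, since the response schema is not described on this page.

```python
import json
import urllib.request

API_URL = "https://api.eachlabs.ai/v1/prediction/"


def build_prediction_request(prompt, lyrics_prompt, api_key,
                             bitrate=32000, audio_format="mp3",
                             sample_rate=16000, webhook_url=""):
    """Build (but do not send) the POST request from the curl example.

    Hypothetical helper: the defaults simply copy the values shown in
    the curl example.
    """
    payload = {
        "model": "minimax-music-v2",
        "version": "0.0.1",
        "input": {
            "prompt": prompt,
            "lyrics_prompt": lyrics_prompt,
            "audio_setting": {
                "bitrate": bitrate,
                "format": audio_format,
                "sample_rate": sample_rate,
            },
        },
        "webhook_url": webhook_url,
    }
    return urllib.request.Request(
        API_URL,
        # json.dumps escapes newlines in the lyrics, as the curl body does.
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(...)` with the key read from the `EACHLABS_API_KEY` environment variable, then inspect the raw response body before relying on any particular field.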

Documentation

Overview
minimax-music-v2 — Text-to-Audio AI Model

MiniMax Music 2.0, accessible as minimax-music-v2, turns text prompts into studio-quality tracks with vocals, lyrics, and full instrumentation in roughly one to two minutes, without the need for expensive studios or years of production expertise. Developed by MiniMax as part of the minimax-music family, this text-to-audio AI model generates professional-grade songs across genres such as pop, indie, electronic, and folk, and supports durations of up to 5 minutes. It produces complete compositions from simple prompts or detailed lyrics, with particular strength in vocal nuance and structural logic.
Capabilities
- Generates full-length, studio-quality music tracks from text prompts, including both instrumental and vocal compositions
- Supports a wide range of genres, from pop, rock, and jazz to electronic, classical, and traditional music
- Can synthesize natural-sounding vocals in multiple languages, aligning melody and rhythm to provided lyrics
- Offers advanced voice controls, including emotion, pitch, speed, and vocal effects (e.g., echo, robotic, lo-fi)
- Delivers rapid generation (typically 1-2 minutes per track), suitable for iterative creative workflows
- Adapts to diverse creative needs, from background music to complete songs with custom lyrics
- Maintains high audio fidelity and professional arrangement quality across outputs
Use cases
Use Cases for minimax-music-v2

Songwriters use minimax-music-v2 to prototype demos quickly: they supply lyrics tagged with section markers such as [Intro], [Verse 1], and [Chorus], plus a style prompt like "upbeat indie folk with acoustic guitar and harmonious vocals," and get back a polished 3-minute track ready for refinement. Content creators producing videos or podcasts rely on its separation of vocals from accompaniment to generate custom background music, either instrumental or full songs, without post-production muddiness.
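
As a rough illustration of the tagged-lyrics layout described above, the following sketch joins sections into a single `lyrics_prompt` string, with each tag on its own line and a blank line between sections (the layout used in the request example on this page). The helper name `build_lyrics_prompt` is hypothetical, and the set of tags the model actually accepts is not listed here, so the tags are passed through unchanged.

```python
def build_lyrics_prompt(sections):
    """Join (tag, lines) pairs into a tagged lyrics string.

    Each section becomes "[tag]" followed by its lyric lines; sections
    are separated by one blank line. Tags are not validated, because
    the accepted tag set is not documented on this page.
    """
    blocks = []
    for tag, lines in sections:
        # Drop empty lines and stray whitespace inside a section.
        cleaned = [line.strip() for line in lines if line.strip()]
        blocks.append("\n".join([f"[{tag}]"] + cleaned))
    return "\n\n".join(blocks)


lyrics = build_lyrics_prompt([
    ("verse", ["Fallen leaves gather by my worn-out shoes",
               "The dusk hums low, like it is humming the blues"]),
    ("chorus", ["I push open the café door, warmth meets my skin"]),
])
```

The resulting string can be used as the `lyrics_prompt` value in the request body.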

Marketers crafting brand jingles input scenario prompts such as "energetic electronic theme for tech startup with synth leads and driving bass," creating original sonic identities in under 2 minutes via the minimax-music-v2 API. Developers building AI music generator apps use its structure controls to give musicians quick, high-quality outputs in diverse styles.
Tips & tricks
How to Use minimax-music-v2 on Eachlabs

Access minimax-music-v2 seamlessly on Eachlabs via the Playground for instant testing with text prompts, lyrics, and style tags, or through the API and SDK for scalable integrations. Provide a music description, optional structured lyrics, and parameters like duration up to 5 minutes to receive high-fidelity audio files in supported formats, with outputs featuring crisp vocals and instrumentation in 1-2 minutes.
Technical spec
What Sets minimax-music-v2 Apart

minimax-music-v2 stands out in the text-to-audio AI model landscape through its paragraph-level precision control, enabling detailed song structures with tags like [Verse], [Chorus], and [Bridge] for coherent, professional arrangements that most competitors lack. This capability allows creators to craft full songs with exact pacing and transitions, generating tracks in 1-2 minutes that rival human productions. Unlike generic music AIs, it supports over 100 instruments with studio-grade mixing that separates vocals from accompaniment, reducing muddiness in complex arrangements for crisp, high-fidelity output in multiple audio formats.
- Advanced vocal synthesis: Delivers smooth pitch transitions, natural vibrato, and resonance shifts for expressive, lifelike singing in 40+ languages, enabling authentic global tracks without manual editing.
- Style-aware mixing: Automatically adapts to genres like rock or jazz, reproducing power, distortion, or warm tones with professional spatiality and dynamic range, perfect for "Minimax music API" integrations.
- Customizable duration and structure: Handles prompts from brief ideas to full lyrics up to 5 minutes, with rapid generation times ideal for high-volume "AI song generator from text" workflows.
Things to be aware of
- Some users report that highly complex or ambiguous prompts may produce less coherent or musically focused results
- The model’s vocal synthesis is generally praised for naturalness, but may occasionally sound synthetic or lack emotional nuance in certain languages or genres
- Performance benchmarks indicate fast generation times, but resource requirements may increase with longer or higher-quality tracks
- Consistency across multiple generations can vary; iterative refinement is often necessary for optimal results
- Positive feedback highlights the model’s versatility, ease of use, and ability to quickly generate professional-sounding music
- Common concerns include occasional artifacts in vocal tracks, limited fine-grained control over arrangement details, and the need for post-processing in some cases
- Experimental features, such as advanced voice cloning or multi-language support, may be subject to ongoing updates and improvements
Key considerations
- The quality of the generated music is highly dependent on the specificity and clarity of the input prompt; detailed prompts yield more targeted results
- For best results, provide both a descriptive prompt and lyrics if vocal output is desired
- Adjusting parameters such as sample rate and bitrate can impact both quality and generation speed
- Overly vague or conflicting prompts may result in less coherent or generic outputs
- Iterative refinement—regenerating with adjusted prompts—can significantly improve final results
- Prompt engineering is crucial: specifying genre, mood, tempo, and instrumentation leads to more predictable outcomes
- There is a trade-off between generation speed and output complexity; higher quality or longer tracks may take slightly longer to generate
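
The advice above about specifying genre, mood, tempo, and instrumentation can be sketched as a small prompt builder. This is only one way to keep prompts consistent between regenerations; the helper name `build_style_prompt` and the comma-separated layout (borrowed from the example prompt on this page) are assumptions, not a format the model is documented to require.

```python
def build_style_prompt(genre, mood=None, tempo=None,
                       instrumentation=(), extras=()):
    """Compose a comma-separated style prompt from labeled parts.

    Empty or missing parts are skipped so the prompt never contains
    dangling commas.
    """
    parts = [genre, mood, tempo, *instrumentation, *extras]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_style_prompt(
    "indie acoustic",
    mood="wistful, reflective mood",
    tempo="slow tempo",
    instrumentation=["soft guitar echoes"],
    extras=["quiet neighborhood streets", "autumn chill"],
)
```

Keeping the parts separate makes iterative refinement easier: change one field, regenerate, and compare.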
Limitations
- The model may struggle with highly intricate musical structures or unconventional genres not well represented in its training data
- Fine control over specific arrangement elements (e.g., precise instrument placement, advanced mixing) is limited compared to manual production
- Not optimal for scenarios requiring human-level emotional depth or nuanced vocal performance in all languages and styles

Related models

4 models

- Mureka · Generate Remix (Mureka)
- Mureka Generate Track · Stem Generation (Mureka)
- Google · Text to Speech (Google)
- Kling V1 · Text to Speech (Kling)

FAQ

About Minimax Music v2

What is MiniMax Music v2?

MiniMax Music v2 is an AI music generation model by MiniMax that creates high-quality audio compositions from text prompts. It generates complete music tracks with vocal and instrumental elements across multiple genres, with improved audio fidelity and stylistic range over the previous version.
