# Deepgram | Nova-3 | Speech to Text

Deepgram Nova-3 accurately transcribes pre-recorded audio with word-level timestamps, speaker diarization, and automatic language detection.

## API Information

- **Model Slug:** deepgram-nova-3-speech-to-text
- **Branded URL:** https://www.eachlabs.ai/deepgram/nova-3/deepgram-nova-3-speech-to-text
- **Provider:** Deepgram
- **Category:** Voice to Text
- **Output Type:** object
- **Status:** active
- **Version:** 0.0.1
- **Estimated Processing Time:** 10 seconds
- **Last Updated:** 2026-06-08
- **Interactive Demo:** https://www.eachlabs.ai/ai-models/deepgram-nova-3-speech-to-text

## Pricing

Pricing information not available.

## Input Schema

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| media_url | string | Yes | - | - | Audio file URL to transcribe. Supports mp3, wav, m4a, flac, ogg, webm, mp4, and 100+ audio formats. The file must be publicly accessible or an EachLabs-uploaded file. |
| language_code | string | No | auto | auto,multi,tr,ur,en,nl,uk,es,ar,de,fr,it,ja,ko,pt,ru,zh,hi,bn,cs,da,fi,el,he,hu,id,ms,no,pl,ro,sk,sv,ta,te,th,vi | Language of the audio (BCP-47 code). auto: automatic detection (recommended). multi: multilingual audio with up to 10 languages. tr: Turkish, ur: Urdu, en: English, nl: Dutch, uk: Ukrainian, es: Spanish, ar: Arabic, de: German, fr: French, it: Italian, ja: Japanese, ko: Korean, pt: Portuguese, ru: Russian, zh: Chinese, hi: Hindi. Nova-3 supports 47+ languages. |
| diarize | boolean | No | true | - | Identify different speakers in the audio. When enabled, each word includes a speaker ID (integer) and speaker_confidence score. Essential for multi-speaker audio like meetings, interviews, and phone calls. |
| multichannel | boolean | No | false | - | Transcribe each audio channel independently. Enable when each channel contains a single speaker (e.g., stereo call recordings with one speaker per channel). Max 5 channels. Each word includes a channel index. |
| smart_format | boolean | No | true | - | Auto-format currency amounts, phone numbers, email addresses, dates, and other entities for enhanced readability. Recommended for most use cases. |
| punctuate | boolean | No | true | - | Add punctuation marks and capitalization to the transcript. Produces more readable output. Recommended for most use cases. |
| model | string | No | nova-3 | nova-3,nova-2 | Deepgram speech recognition model. nova-3: latest generation, best accuracy, 47+ languages, recommended for all use cases. nova-2: previous generation, still available for backward compatibility. |

## Example Request

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction/ \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepgram-nova-3-speech-to-text",
    "input": {
      "media_url": "https://storage.googleapis.com/magicpoint/inputs/deepgram-nova-3-stt-input.mp3"
    }
  }'
```

## Output Schema

Response returned by `GET /v1/prediction/{id}` when the job completes:

```json
{
  "status": "success",
  "predictionID": "string",
  "output": "object",
  "metrics": {
    "predict_time": "number (seconds)"
  }
}
```

## Polling

```bash
curl https://api.eachlabs.ai/v1/prediction/{PREDICTION_ID} \
  -H "X-API-Key: YOUR_API_KEY"
```

| Status | Meaning |
|--------|---------|
| `processing` | Still running; poll again |
| `success` | Done; read `output` |
| `error` | Failed; read `message` / `details` |

## Webhook (alternative to polling)

Pass `"webhook_url": "https://your.host/path"` in the create request. Eachlabs POSTs this payload when the job ends:

```json
{
  "exec_id": "prediction-uuid",
  "status": "succeeded",
  "output": "https://...",
  "error": ""
}
```

`status` is `"succeeded"` or `"failed"`. Note that these values differ from the polling statuses (`success` / `error`). `exec_id` equals the `predictionID` from create. Return 2xx within 30 seconds.

## Errors

Error body:

`{ "status": "error", "message": "...", "details": "..." }`

| Code | Meaning |
|------|---------|
| `400` | Invalid input |
| `401` | Missing / invalid `X-API-Key` |
| `404` | Unknown model or prediction id |
| `429` | Rate limit: 100 creates / min, 10 concurrent per key |
| `5xx` | Retry with backoff |

## Overview

Deepgram | Nova-3 | Speech to Text is a high-performance speech recognition API that converts audio into accurate text transcriptions with word-level timestamps and speaker identification. Built by Deepgram, Nova-3 represents the latest generation of their speech-to-text technology, designed specifically for real-time and streaming applications where latency matters.

The model's primary differentiator is its exceptional speed: Nova-3 achieves 441.6x real-time processing with sub-300ms streaming latency, making it the fastest commercial speech-to-text solution available. This combination of accuracy and speed makes Deepgram | Nova-3 | Speech to Text ideal for voice agents, IVR systems, and any application requiring conversational responsiveness.

## Usage Notes

- API Base URL: `https://api.eachlabs.ai/v1`
- Authentication: send `X-API-Key: YOUR_API_KEY`. Generate a key from the Eachlabs dashboard at https://www.eachlabs.ai/dashboard/api-keys.
- File-typed parameters (`*_url`, `image_url`, `video_url`, `audio_url`, etc.) accept publicly-reachable HTTPS URLs only. Upload your asset first (GCS / S3 / your CDN) and pass the resulting URL. Data-URIs and localhost URLs are rejected.
- For structured parameters (arrays / objects) send real JSON values, not stringified payloads.
- Monetary values are reported in USD; per-token / per-megapixel rates may be billed in micro-cents internally.
- Prefer `webhook_url` over polling for long-running predictions; see the "Webhook (alternative to polling)" section.
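
## Python Sketch: Create and Poll

The create-then-poll flow from the Example Request and Polling sections might look roughly like this in Python. This is a minimal sketch, not an official client: the function names (`build_create_payload`, `http_json`, `poll_until_done`, `transcribe`) are hypothetical, the 2-second interval and 60-attempt cap are arbitrary choices, and treating any status other than `success` or `error` as "still running" is an assumption. Only the endpoint paths, the `X-API-Key` header, the status values, and the `predictionID` field come from this page.

```python
import json
import time
import urllib.request

API_BASE = "https://api.eachlabs.ai/v1"  # base URL from the Usage Notes
MODEL_SLUG = "deepgram-nova-3-speech-to-text"


def build_create_payload(media_url, **options):
    """Build the JSON body for POST /v1/prediction/.

    `options` carries optional Input Schema parameters such as
    diarize=False or language_code="en".
    """
    return {"model": MODEL_SLUG, "input": {"media_url": media_url, **options}}


def http_json(method, url, api_key, body=None):
    """Send a request with the X-API-Key header and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("X-API-Key", api_key)
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_until_done(fetch, interval=2.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until the prediction reports `success` or `error`.

    `fetch` is any zero-argument callable returning the decoded response
    of GET /v1/prediction/{id}; injecting it keeps the loop testable.
    """
    for _ in range(max_attempts):
        result = fetch()
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(f"{result.get('message')}: {result.get('details')}")
        # ASSUMPTION: `processing` (or any other status) means keep waiting.
        sleep(interval)
    raise TimeoutError("prediction did not finish within max_attempts polls")


def transcribe(media_url, api_key, **options):
    """Create a prediction, then poll it and return the final response."""
    created = http_json(
        "POST", f"{API_BASE}/prediction/", api_key,
        build_create_payload(media_url, **options),
    )
    prediction_id = created["predictionID"]
    return poll_until_done(
        lambda: http_json("GET", f"{API_BASE}/prediction/{prediction_id}", api_key)
    )
```

A call such as `transcribe("https://example.com/call.mp3", "YOUR_API_KEY", diarize=True)` would return the completed response, whose `output` field holds the transcription object. A production client would likely also retry `429` and `5xx` responses with backoff, as the Errors table suggests; that is omitted here for brevity.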