# Incredibly Fast Whisper

Transcribe 150 minutes of audio in 100 seconds with Incredibly Fast Whisper

## API Information

- **Model Slug:** incredibly-fast-whisper
- **Branded URL:** https://www.eachlabs.ai/openai/whisper/incredibly-fast-whisper
- **Provider:** OpenAI
- **Category:** Voice to Text
- **Output Type:** text
- **Status:** active
- **Version:** 0.0.1
- **Base Cost:** Per-second pricing based on provider predict_time. Rate: $0.00108/sec from GPU tier.
- **Estimated Processing Time:** 11 seconds
- **Last Updated:** 2026-04-06
- **Interactive Demo:** https://www.eachlabs.ai/ai-models/incredibly-fast-whisper

## Pricing

- **Charge Type:** dynamic
- **Pricing Details:** Per-second pricing based on provider predict_time. Rate: $0.00108/sec from GPU tier.

### Pricing Rules

| Condition | Pricing |
| --- | --- |
| Rule 1 | Per-second pricing based on provider predict_time. Rate: $0.00108/sec from GPU tier. |

## Input Schema

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| audio | string | Yes | https://storage.googleapis.com/magicpoint/inputs/fast-whisper-input.wav | audio/mp3, audio/wav, audio/m4a | The audio recording to be processed. |
| task | string | No | transcribe | transcribe | The operation the model performs on the input audio. |
| language | string | No | None | None,afrikaans,amharic,arabic,azerbaijani,belarusian,bosnian,breton,bulgarian,cantonese,catalan,chinese,croatian,czech,danish,dutch,english,estonian,finnish,french,german,hebrew,hindi,hungarian,italian,japanese,korean,lithuanian,macedonian,mongolian,myanmar,nepali,norwegian,polish,portuguese,romanian,russian,serbian,slovak,slovenian,spanish,swahili,swedish,tatar,telugu,turkish,ukrainian,welsh | The language spoken in the input audio. |
| batch_size | integer | No | 24 | - | The number of samples processed together in one iteration. |
| timestamp | string | No | chunk | chunk,word | The granularity at which timestamps are reported for the audio. |
| diarise_audio | boolean | No | false | - | Whether to split the conversation into segments based on who is speaking. |
| hf_token | string | No | - | - | A key used to authenticate and access resources on the Hugging Face platform. |

## Example Request

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction/ \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "incredibly-fast-whisper",
    "input": {
      "audio": "https://storage.googleapis.com/magicpoint/inputs/fast-whisper-input.wav"
    }
  }'
```

## Output Schema

Response returned by `GET /v1/prediction/{id}` when the job completes:

```json
{
  "status": "success",
  "predictionID": "string",
  "output": "text",
  "metrics": {
    "predict_time": "number (seconds)"
  }
}
```

## Polling

```bash
curl https://api.eachlabs.ai/v1/prediction/{PREDICTION_ID} \
  -H "X-API-Key: YOUR_API_KEY"
```

| Status | Meaning |
|--------|---------|
| `processing` | Still running; poll again |
| `success` | Done; read `output` |
| `error` | Failed; read `message` / `details` |

## Webhook (alternative to polling)

Pass `"webhook_url": "https://your.host/path"` in the create request. Eachlabs POSTs this payload when the job ends:

```json
{
  "exec_id": "prediction-uuid",
  "status": "succeeded",
  "output": "https://...",
  "error": ""
}
```

`status` is `"succeeded"` or `"failed"`. `exec_id` equals the `predictionID` from create. Return 2xx within 30 seconds.

## Errors

Error body: `{ "status": "error", "message": "...", "details": "..." }`

| Code | Meaning |
|------|---------|
| `400` | Invalid input |
| `401` | Missing / invalid `X-API-Key` |
| `404` | Unknown model or prediction id |
| `429` | Rate limit: 100 creates / min, 10 concurrent per key |
| `5xx` | Retry with backoff |

## Overview

**incredibly-fast-whisper: Voice-to-Text AI Model**

incredibly-fast-whisper is a **voice-to-text** model that transcribes 150 minutes of audio in 100 seconds, which makes it a good fit for developers and creators handling large audio files. Developed as an optimized fork of OpenAI's Whisper family, this **voice-to-text AI model** targets the common bottleneck of slow ASR processing without sacrificing accuracy on accents, noise, or long-form content. Whether you are building **OpenAI voice-to-text** pipelines or looking for a **fast audio transcription API**, its main strength is throughput on extended recordings.

## Usage Notes

- API Base URL: `https://api.eachlabs.ai/v1`
- Authentication: send `X-API-Key: YOUR_API_KEY`. Generate a key from the Eachlabs dashboard at https://www.eachlabs.ai/dashboard/api-keys.
- File-typed parameters (`*_url`, `image_url`, `video_url`, `audio_url`, etc.) accept publicly reachable HTTPS URLs only. Upload your asset first (GCS / S3 / your CDN) and pass the resulting URL. Data URIs and localhost URLs are rejected.
- For structured parameters (arrays / objects), send real JSON values, not stringified payloads.
- Monetary values are reported in USD; per-token / per-megapixel rates may be billed in micro-cents internally.
- Prefer `webhook_url` over polling for long-running predictions; see the Webhook section.
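
The create request, the polling loop, and the per-second pricing described above can be tied together in a short Python sketch. This is a minimal illustration under stated assumptions, not an official client: the helper names (`build_create_payload`, `estimate_cost`, `is_finished`, `_call`, `transcribe`) are hypothetical, the poll interval is arbitrary, and it assumes the create response carries a `predictionID` field, as the Webhook section implies.

```python
import json
import time
import urllib.request

API_BASE = "https://api.eachlabs.ai/v1"
RATE_USD_PER_SEC = 0.00108  # GPU-tier rate quoted in the Pricing section


def build_create_payload(audio_url, **options):
    """Build the JSON body for the create request (shape from Example Request)."""
    return {
        "model": "incredibly-fast-whisper",
        "input": {"audio": audio_url, **options},
    }


def estimate_cost(predict_time_sec):
    """Dynamic pricing: cost in USD = predict_time (seconds) * per-second rate."""
    return round(predict_time_sec * RATE_USD_PER_SEC, 6)


def is_finished(prediction):
    """Return True on 'success', False while running; raise on 'error'."""
    status = prediction.get("status")
    if status == "error":
        raise RuntimeError(f"{prediction.get('message')}: {prediction.get('details')}")
    return status == "success"


def _call(method, url, api_key, body=None):
    """Hypothetical helper: send one JSON request with the X-API-Key header.

    urllib raises urllib.error.HTTPError for non-2xx responses (400, 401, 429, ...).
    """
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())


def transcribe(audio_url, api_key, poll_interval=2.0, **options):
    """Create a prediction, then poll until it reaches a terminal status."""
    created = _call(
        "POST", f"{API_BASE}/prediction/", api_key,
        build_create_payload(audio_url, **options),
    )
    prediction_id = created["predictionID"]  # assumed field name in the create response
    while True:
        prediction = _call("GET", f"{API_BASE}/prediction/{prediction_id}", api_key)
        if is_finished(prediction):
            return prediction
        time.sleep(poll_interval)
```

A call such as `transcribe(audio_url, api_key, timestamp="word")` would return the completed prediction, and `estimate_cost(result["metrics"]["predict_time"])` would give the charge for that run. At the quoted rate, a run with a `predict_time` of 100 seconds works out to 100 × $0.00108 = $0.108.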