
ElevenLabs Speech-to-Text Model: Revolutionizing Transcriptions
ElevenLabs, an AI startup known for its audio generation tools, is moving into speech-to-text with the launch of its new model, Scribe. Fresh from a $180 million funding round, the company, valued at $3.3 billion, is positioning itself against established players such as OpenAI and AssemblyAI. Scribe supports over 99 languages at launch, and more than 25 of them fall into the company's excellent accuracy tier, meaning a word error rate below 5%. For now the model handles pre-recorded audio only, with a low-latency real-time version planned.
Feature | Details |
---|---|
Company Name | ElevenLabs |
Funding Raised | $180 million |
Valuation | $3.3 billion |
Product Name | Scribe (Speech-to-Text Model) |
Languages Supported | Over 99 languages |
Excellent Accuracy Tier | Word error rate below 5% (more than 25 languages) |
High Accuracy Tier | 5% to 10% word error rate |
Good Accuracy Tier | 10% to 20% word error rate |
Moderate Accuracy Tier | 25% to 50% word error rate |
Model Performance | Outperformed Google Gemini 2.0 Flash and OpenAI's Whisper Large V3 in benchmark tests |
Key Features | Smart speaker diarization, word-level timestamps, auto-tagging of sound events |
Transcription Type | Pre-recorded audio only |
Pricing | $0.40 per hour of transcribed audio |
Future Updates | Low-latency real-time version to be released |
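
The accuracy tiers in the table are expressed as word error rate (WER), which is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the model's transcript into a reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard calculation, not ElevenLabs' evaluation code. The function names are hypothetical, text normalization is reduced to lowercasing and whitespace splitting, and the handling of tier boundaries is an assumption, since the table does not say which side of each boundary is inclusive and leaves the range between 20% and 25% undefined.

```python
from typing import Optional


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    # Simplified normalization (assumption): lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def accuracy_tier(wer: float) -> Optional[str]:
    """Map a WER (as a fraction) to the tiers in the table.

    Boundary inclusivity is an assumption. Values outside the published
    ranges, including the 20% to 25% gap, return None.
    """
    if wer < 0.05:
        return "excellent"
    if wer <= 0.10:
        return "high"
    if wer <= 0.20:
        return "good"
    if 0.25 <= wer <= 0.50:
        return "moderate"
    return None


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog"
    hypothesis = "the quick brown fox jumped over the lazy dog"
    wer = word_error_rate(reference, hypothesis)
    # One substituted word out of nine reference words.
    print(f"WER: {wer:.1%}, tier: {accuracy_tier(wer)}")  # WER: 11.1%, tier: good
```

Under this definition, a claimed accuracy of 97% corresponds to a WER of roughly 3%, which is why English lands in the excellent tier.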
Overview of ElevenLabs’ Scribe Model
ElevenLabs has launched a new speech-to-text model called Scribe. The release marks a shift for the AI startup, which has so far been known mainly for audio generation. With a valuation of $3.3 billion and $180 million in new funding, ElevenLabs is now entering the speech-to-text market, where it will compete with established players like OpenAI and Deepgram.
Scribe supports over 99 languages. More than 25 of them fall into the excellent accuracy tier, defined as a word error rate below 5%, while the remaining languages are grouped into high, good, and moderate tiers with progressively higher error rates. The launch expands ElevenLabs' portfolio from generating audio to transcribing it.
Frequently Asked Questions
What is ElevenLabs’ new speech-to-text model called?
ElevenLabs has launched its new speech-to-text model named Scribe, marking its first standalone product in this category.
How many languages does the Scribe model support?
Scribe supports over 99 languages at launch, with excellent accuracy for more than 25 of them.
What is the accuracy rate for English in the Scribe model?
ElevenLabs claims a 97% accuracy rate for English. That corresponds to a word error rate of roughly 3%, which places English in the excellent tier (below 5%).
How does Scribe compare to other models like Google Gemini?
According to benchmark tests, Scribe outperformed Google Gemini 2.0 Flash and OpenAI's Whisper Large V3 in multiple languages.
What features does the Scribe model include?
Scribe offers smart speaker diarization, word-level timestamps, and auto-tagging of sound events to enhance transcription quality.
Is Scribe available for real-time transcription?
Not yet. Scribe currently transcribes pre-recorded audio only. A low-latency real-time version for live transcription is planned.
How much does it cost to use Scribe for transcriptions?
Scribe is priced at $0.40 per hour of transcribed audio, which works out to less than one cent per minute and is positioned as competitive with other transcription services.
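
As a rough illustration of the per-hour rate, the sketch below estimates the cost of a transcription job from the audio duration. It assumes the charge is prorated linearly by duration, with no minimum charge, free tier, or volume discount, none of which the article specifies. The function name and the rounding are likewise hypothetical.

```python
PRICE_PER_HOUR_USD = 0.40  # launch price quoted above


def estimate_cost_usd(audio_seconds: float,
                      price_per_hour: float = PRICE_PER_HOUR_USD) -> float:
    """Estimate transcription cost, assuming simple linear proration."""
    if audio_seconds < 0:
        raise ValueError("audio duration cannot be negative")
    hours = audio_seconds / 3600
    # Round to four decimal places to hide floating-point noise.
    return round(hours * price_per_hour, 4)


if __name__ == "__main__":
    # A 90-minute recording is 1.5 hours of audio.
    print(f"${estimate_cost_usd(90 * 60):.2f}")      # $0.60
    # A 1,000-hour archive.
    print(f"${estimate_cost_usd(1000 * 3600):.2f}")  # $400.00
```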
Summary
ElevenLabs, an AI startup valued at $3.3 billion, has launched its first speech-to-text model, Scribe, following a $180 million funding round. The model supports over 99 languages, and more than 25 of them, including English, French, and Spanish, reach the excellent tier with a word error rate below 5%. In benchmark tests, Scribe outperformed Google Gemini 2.0 Flash and OpenAI's Whisper Large V3. It includes speaker diarization, word-level timestamps that can be used for subtitles, and auto-tagging of sound events. Priced at $0.40 per hour of audio, Scribe extends ElevenLabs' business from generating audio to transcribing it.