Best-in-class speech recognition and text-to-speech models for African accents
Beats OpenAI, Google, AWS, and Azure across multiple benchmarks
Today, we’re launching Sahara — a breakthrough family of speech recognition models trained on thousands of hours of proprietary audio from 18,000+ speakers, across 300+ non-native English accents from 30+ African countries. Powered by our proprietary AccentMix™ algorithm, Sahara doesn’t just keep up — it outperforms OpenAI’s Whisper, GPT-4o Transcribe, Nvidia Canary, Google Speech-to-Text, AWS Transcribe, and Azure Speech across the board.
That holds for scripted, conversational, and real-world, in-the-wild speech across high-impact domains like healthcare, finance, legal, African named entities, and voice commands. No accent left behind.
Here’s the kicker: we’re a tiny, seed-stage startup. We don’t have the luxury of bottomless compute or internet-scale data. So we had to do things differently — leaner, smarter, and relentlessly focused on real-world performance. That’s how AccentMix was born: a patented algorithm purpose-built to handle the rich diversity of English accents across the African continent.
We’re incredibly proud of what we’ve built.
See the benchmarks. Hear the difference. Welcome to Sahara.
Word Error Rate (WER) is a common way to measure how accurate speech recognition systems are. It compares what the system heard to what was actually said by dividing the number of word-level errors by the total number of words actually spoken. Because it measures the model’s error, lower is better.
Why it matters: WER tells us how reliable a speech-to-text system is. A lower WER means fewer mistakes and better performance—critical for areas like healthcare, legal, and customer service.
Strengths: simple to calculate; easy to compare across different systems; works across languages.
Weaknesses: treats all errors equally, even when some are more harmful (e.g., “don’t take” vs “take”); gives even single-character errors like “carrot” vs “carot” the full penalty, so it can be overly harsh; doesn’t consider punctuation or context; may not reflect user satisfaction or usefulness.
WER = (Substitutions + Insertions + Deletions) ÷ Total Words
Substitutions: wrong words
Insertions: extra words
Deletions: missing words
Spoken: “Take your medicine daily”
Transcript: “Take your message daily”
WER = 1 error ÷ 4 words = 25%
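The formula and worked example above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring code behind the benchmarks on this page. The function name `word_error_rate`, the lowercasing, and the whitespace tokenization are all assumptions made for the example. It counts substitutions, insertions, and deletions as a word-level edit distance and divides by the number of words actually spoken.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption for this sketch: text is lowercased and split on whitespace;
    real evaluations usually apply a more careful text normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)  # wrong word
            deletion = prev[j] + 1        # spoken word missing from transcript
            insertion = curr[j - 1] + 1   # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# The example from the text: one substitution in four spoken words.
print(word_error_rate("Take your medicine daily", "Take your message daily"))  # 0.25
```

Because each word either matches or does not, a one-letter misspelling costs exactly as much as a completely different word, which is the harshness noted under the weaknesses above.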
Punching well above its weight against models 2-3x its size, Sahara demonstrates superior performance on accented English speech in a pan-African context across multiple industries (health, finance, legal, academia, etc.) and domains, with impressive robustness to background noise, intonation, and domain-specific vocabulary.
Sahara demonstrates superior performance in accented voice recognition in healthcare, leading several open and closed models in the recognition of complex medical terminology across specialties, including diagnoses, measurements, imaging and lab results, and medications, in over 300 African accents under diverse ambient clinical settings.
a 200+hr public benchmark dataset of scripted (read) clinical speech in 120 African accents from 2,463 speakers in 13 countries
a public pan-African conversational speech dataset of 49 spontaneous medical and non-medical conversations with 14 African accents across 3 countries
a multi-institution, multi-specialty dataset of real-world medical speech in clinical settings across 6 countries, 200+ speakers and >50 accents
a multi-country dataset of doctors testing out voice transcription in real-world clinical settings with significant ambient hospital noise
an unreleased medical multispecialty dataset of 25 simulated long-form doctor-patient conversations between male and female doctor and patient actors across Nigeria
an unreleased dataset of 30+ minute-long multispeaker clinical research interviews from East Africa
an unreleased dataset of real-world telephone call center conversations between various agents and customers sampled at 8kHz.
a 2hr subset with voice commands for multiple scenarios
a 2hr subset of the Afri-Names dataset rich in numbers, fractions, measurements, decimals, currency, etc
our most challenging dataset ever, a novel 20+ hour open pan-African accented read speech dataset rich with African named entities, proper nouns, numbers, fractions, currency, simulated IDs, and voice-assistant commands for evaluating ASR models on various tasks and domains like finance, healthcare, and speech commands, with 500+ unique speakers from 20+ countries.
a 10hr subset rich in African names
a 5hr subset with voice commands for multiple scenarios
a 5hr subset rich in numbers, fractions, measurements, decimals, currency, etc
a 35+ hour open pan-African transcribed dataset of legislative proceedings with ambient noise, multiple speakers, African names and locations, with over 1000 speakers from 4 countries
an unreleased African-accented dataset of court hearings rich in legal terminology, proper nouns and Latin words
a multi-country, multi-accent dataset with 2+ hrs of read/scripted and conversational speech from Nigeria, South Africa, Kenya, Ghana, Rwanda, and North Africa (Egypt, Morocco, Algeria, etc.)
general purpose cross-domain speech recognition model
streaming model optimised for medical conversations
biometric voice-based authentication tuned for African accents and languages to combat fraud
first production pan-African accented speech synthesis model supporting 30 African accents spoken across 10+ countries
SOTA automatic speech translation models on 20 African languages
The first production pan-African accented speech synthesis model with 54 personas from 13 countries, representing 34 African accents with female and male voices.
Spoof-aware Voice authentication and security, tuned for African voices, accents and languages to combat fraud and deepfakes
Legal & Parliamentary