Africa speaks. Sahara v2 listens—accurately, reliably, at scale.
For years, voice has been called the most natural interface. Yet for hundreds of millions of people across Africa, digital systems still struggle to understand how they speak—their accents, their names, their numbers, their languages, their silence.
Most speech recognition systems were never designed for Africa. They were trained on Western speech, evaluated on Western benchmarks, and optimized for clean audio and high-resource languages. The result? Models that perform well on global leaderboards—but break down in the real world across African healthcare, finance, government, and customer support.
Today, we’re changing that.
We’re proud to introduce Intron Sahara v2: a production-ready suite of speech recognition models built specifically for African languages, African accents, and African realities.
Across hospitals, call centers, banks, and public services, we consistently hear the same feedback:
These are not edge cases in Africa—they are the default conditions:
Sahara v2 was built for this reality—not adapted to it.
Africa’s first production-grade bilingual and multilingual speech recognition models, supporting:
It delivers state-of-the-art performance across:
High-quality text-to-speech with:
Designed for IVR, voice bots, education, and public-facing services—without the “imported accent” problem.
Across 20+ African languages, Sahara v2 consistently delivers the lowest Word Error Rates, often outperforming strong multilingual baselines by 2×–7×.
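For readers less familiar with the metric, Word Error Rate is the word-level edit distance (substitutions, insertions, and deletions) between a reference transcript and the model output, divided by the number of reference words. The sketch below implements that standard definition only; the function name and sample sentences are illustrative assumptions, and it is not Intron's evaluation code, which may apply text normalization not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one misrecognized name in a six-word command.
reference = "send five thousand naira to Chidi"
hypothesis = "send five thousand naira to city"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

A single misrecognized name in a short command already moves the score noticeably, which is one reason entity-heavy speech is hard to transcribe well.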
Languages where Sahara v2 is state-of-the-art include:
Ga, Twi, Igbo, Yoruba, Sesotho, Pedi, Tswana, Zulu, Hausa, Shona, Swahili, and more.
On accented English across medical, parliamentary, and conversational speech, Sahara v2 achieves:
Unlike many general-purpose models, Sahara v2 is explicitly trained to handle:
Sahara v2 is powered by millions of audio clips contributed by speakers across Africa—spanning languages, accents, professions, and environments.
We want to go further.
We’re inviting developers, data scientists, and startups to build the next generation of African voice applications using Sahara.
📢 Africa’s voice ecosystem grows when we build together.
Developers can integrate Sahara-v2 using the new streamlined widget or deploy with full offline support. Proven in real-world deployments with partners including Penda Health, Data.FI, ARM, and State High Courts across Nigeria, Kenya, South Africa, and Eswatini, Sahara-v2 is transforming how organizations serve their customers.
Sahara-v2 functions without internet connectivity. By processing all data locally, the system ensures privacy and security for sensitive environments such as healthcare, legal, and finance. This on-device approach protects confidential records, supports regulatory compliance, and enables reliable deployment in remote or low-connectivity locations.
Sahara-v2 delivers state-of-the-art performance for African speech understanding, supporting African French and 23 new African languages. It outperforms competing models, achieving 25% better overall performance compared with Meta Omni-language ASR and Gemini-3.
On the AfriVox Transcribe Benchmark, Sahara-v2 demonstrates exceptional accuracy where precision is critical. It performs over 64% better on African names, locations, and organizations (AFRINAMES) compared to models such as Gemini-3 and Azure, and over 35% better on numbers, IDs, decimals, and currency. It also proves reliable across real-world use cases, performing over 25% better across key verticals, including health, legal, finance, and call center audio.
Beyond single-language recognition, Sahara-v2 advances multilingual understanding with the world’s first bilingual Swahili-English ASR model. It also demonstrates strong robustness in challenging audio conditions, testing over 20% better on background noise, overlapping speakers, and silence compared to competitors.
Sahara-v2 performs strongly when evaluated against leading speech models, including Gemini-3, Azure, ElevenLabs, GPT-4-audio, and Whisper, consistently outperforming them across accuracy, robustness, and domain reliability.
On African names, locations, and organizations, Sahara-v2 performs over 64% better, reflecting its ability to handle culturally specific entities that are frequently misrecognized by global models. This is measured using the AFRINAMES evaluation.
Sahara-v2 achieves over 35% better performance on numbers, IDs, decimals, currency, monetary values, and fractions, supporting precision-critical use cases in finance, healthcare, and legal documentation. These results are benchmarked using the NUMBERS evaluation.
In noisy, real-world conditions, Sahara-v2 performs over 20% better, handling background noise, overlapping speakers, and silence more effectively than competing systems. Benchmarks compare Sahara-v2 with Gemini-3, Azure, Deepgram, GPT-4-audio, and Whisper.
Across domain-specific audio in health, legal, finance, and call center environments, Sahara-v2 performs over 25% better, ensuring reliable transcription where domain terminology and structured speech are essential.
Sahara-v2 now supports African French and 23 new African languages, bringing the total to 57. It delivers 25% better overall performance compared with Meta Omni-language ASR and Gemini-3.
Sahara-v2 pushes the boundaries of speech understanding with improved robustness, delivering enhanced acoustic modeling to help you handle even the most challenging audio scenarios.
In testing, Sahara-v2 performs over 20% better than competitors such as Gemini-3 and Azure on robustness, specifically background noise, overlapping speakers, and silence. It is also reliable in specialized fields, performing over 25% better across verticals such as health, legal, finance, and call center audio. These results are backed by real-world deployments with partners such as Penda Health and the Ogun State High Court.
Sahara-v2 is built from the ground up for the complexities of African speech, supporting over 500 accents in real-world acoustic environments. It delivers production-ready transcription across the medical, legal, and call center sectors, outperforming competing models by over 25% in these critical verticals. It supports 57 languages, including 23 new African languages and African French, and uses offline and parallel processing to provide high-performance transcription at enterprise scale.
More than just processing audio, Sahara-v2 understands how people actually speak. It introduces the world’s first bilingual Swahili–English ASR, enabling seamless code-switching and natural conversation capture, a breakthrough validated by Penda Health. Sahara-v2 is built for high-stakes enterprise settings, delivering precise transcription of African names, numbers, and citations. Even with heavy background noise or overlapping speech, it maintains superior robustness. Partners such as the Ogun and Yobe State High Courts, ARM, and Data.FI already rely on this capability.
Sahara-v2 delivers the lowest Word Error Rate (WER) in every African language evaluated. Its lowest error rates come in Pidgin (5%), Kinyarwanda (10%), and Swahili (11%). The gap is clearest in head-to-head comparisons: in Kinyarwanda, it reaches a 10% WER against Gemini-3.0-flash’s 40%, while in Tswana it holds a 22% WER where Gemini reaches 77%.
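To make the size of those gaps concrete, the Kinyarwanda and Tswana figures quoted above can be expressed as an error ratio and as a relative WER reduction. This is a small arithmetic sketch that uses only the percentages stated in this post; the helper name is an assumption for illustration.

```python
def wer_comparison(sahara_wer: float, baseline_wer: float) -> tuple[float, float]:
    """Return (error ratio, relative WER reduction) of a baseline versus Sahara-v2."""
    if sahara_wer <= 0 or baseline_wer <= 0:
        raise ValueError("WER values must be positive")
    ratio = baseline_wer / sahara_wer
    reduction = (baseline_wer - sahara_wer) / baseline_wer
    return ratio, reduction


# (Sahara-v2 WER %, Gemini WER %) as quoted in the text above.
figures = {"Kinyarwanda": (10.0, 40.0), "Tswana": (22.0, 77.0)}

for language, (sahara, gemini) in figures.items():
    ratio, reduction = wer_comparison(sahara, gemini)
    print(f"{language}: {ratio:.1f}x fewer errors, {reduction:.0%} relative WER reduction")
# Kinyarwanda: 4.0x fewer errors, 75% relative WER reduction
# Tswana: 3.5x fewer errors, 71% relative WER reduction
```

Both ratios fall inside the 2×–7× range cited earlier for multilingual baselines.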
Across the other evaluated languages, Sahara-v2 also posts the lowest WER: Twi (11%), Zulu (16%), Hausa (18%), Shona (18%), Yoruba (19%), Luganda (19%), Igbo (20%), and Pedi (23%). Its most significant result is in Pidgin. While Meta-Omni-ASR and Gemini-3.0-flash fail to produce any measurable output for the language, Sahara-v2 transcribes it at a 5% WER. These findings position Sahara-v2 as the most accurate option evaluated for low-resource African language transcription.
Higher accuracy on African names, locations, and organizations
Higher accuracy on numbers, IDs, decimals, and currency
Better performance across healthcare, legal, finance, and call‑center audio
Stronger robustness in noisy, real‑world environments
Sahara-v2 was built to process African speech across languages, accents, and acoustic conditions. This includes the world’s first bilingual Swahili-English ASR model, support for 500+ accents through its new Accented English model, and resilience in challenging acoustic environments. By combining state-of-the-art acoustic modeling, multilingual performance across 57 languages, and full offline capability, Sahara-v2 pushes the frontier of African speech understanding.
Legal professionals, such as those at the Ogun State High Court, can rely on Sahara-v2 to capture African names, locations, and organizations with precision — a task where the model performs over 64% better than competitors. Healthcare providers like Penda Health and Data.FI can also record patient consultations in bilingual Swahili-English, as Sahara-v2 performs over 25% better across verticals, including health.
Sahara-v2 also processes audio from busy call center environments, such as deployments with partner ARM, filtering through background noise and overlapping speakers to deliver accurate transcripts. In these conditions, the model performs over 20% better in robustness than other major speech models.
Sahara-v2 makes any voice application possible for African contexts. It handles complex acoustic variations to enable reliable voice interfaces, testing over 20% better in robustness against background noise, overlapping speakers, and silence compared to competitors.
Sahara-v2 is our most capable voice AI model built for real-world use so far. It makes applications more accessible with specialized capabilities for Voice Bots, Voice Autofill for KYC, application, and admission forms, and Voice Banking. The model supports accurate capture of structured data like names, IDs, and numbers, performing over 35% better where precision matters. It also handles safety-critical voice interactions, performing over 25% better across verticals such as health, legal, and finance.
You can now build with Sahara-v2 using the new streamlined widget for integrations and participate in the Build with Sahara Developer challenge. It is also available with offline support to enable deployment in diverse environments.
Accelerating African voice AI development with Sahara-v2
As voice AI accelerates with Sahara-v2, we are enhancing the developer experience for African voice applications. Today, we are introducing a new streamlined widget for integrations and launching the Build with Sahara Developer challenge to help developers build and deploy solutions faster.
Sahara-v2’s advanced speech understanding and offline support give developers the tools to build robust voice applications. They can now tap into capabilities such as Voice Bots (available in 7 languages and accented English), Voice Autofill for KYC, application, and admission forms, and Voice Banking for command-driven fintech interactions. These features are built to handle complex African contexts, backed by Sahara-v2’s performance across 23 new African languages.
Sahara-v2 marks a significant step forward in speech recognition, particularly in how it handles challenging audio environments. It tests over 20% better than competitors like Gemini-3 and Azure when it comes to background noise, overlapping speakers, and silence. It handles natural language complexity through the world’s first bilingual Swahili-English ASR model and voice bot support across seven languages, and enables you to connect any application or workflow through reliable voice interaction.
The model is built for critical domains, performing over 25% better across verticals, including health, legal, and finance. To support these sectors, we are introducing Voice Banking for command-driven fintech interactions and Voice Autofill for processing sensitive data in KYC, application, and admission forms.
Sahara-v2 has been validated by partners in high-stakes, real-world environments. This includes legal applications with the Ogun State High Court and Yobe State High Court, as well as healthcare implementations with Penda Health (Kenya) and Data.FI (Eswatini & Nigeria).
The Sahara-v2 era starts today. Here is what is rolling out:
We are inviting developers across Africa to build the next generation of voice-enabled applications with Sahara-v2. The new streamlined widget for integrations and our offline support are now available to help you build robust solutions.
Handle complex medical terminology and bilingual Swahili-English interactions
Execute voice commands with precision for numbers, currency, and IDs
Transcribe legal proceedings with high accuracy on African names
Deploy bots in 7 languages plus accented English
ARM is leveraging Sahara-v2 to optimize customer service operations within its call centers. The system is specifically validated for high-traffic environments, designed to maintain clarity even when dealing with background noise and overlapping speakers. By utilizing specialized accented English models, ARM ensures that diverse customer voices are understood accurately, providing a more reliable automated support experience than standard global alternatives.
Penda Health relies on the world’s first bilingual Swahili-English model to capture patient interactions, a critical feature for environments where speakers naturally switch between languages (code-switching). These healthcare deployments are supported by Voice Autofill, which helps streamline admission forms and ensures that domain-specific medical terminology is captured with high precision.
The Ogun State High Court and Yobe State High Court in Nigeria have adopted Sahara-v2 to support legal transcription. To meet strict requirements for privacy and reliability, these courts utilize the system’s offline capabilities, allowing sensitive proceedings to be processed locally without depending on internet connectivity. The technology is specifically optimized to recognize African names, locations, and organizations, ensuring that the judicial record remains accurate and preserves local context.
Sahara-v2 is available now for developers, enterprises, and organizations across Africa.
Access the new streamlined widget for integrations
Empower your organization with specialized voice technology.
Gain deep insights into the evolving technological landscape with our upcoming 2026 Africa Voice AI Report. The report explores the trends, challenges, and opportunities shaping the future of speech interfaces across the continent.