Sahara v2: When Voice AI Finally Understands Africa

Africa speaks. Sahara v2 listens—accurately, reliably, at scale.

Introducing Sahara-v2

For years, voice has been called the most natural interface. Yet for hundreds of millions of people across Africa, digital systems still struggle to understand how they speak—their accents, their names, their numbers, their languages, their silence.

Most speech recognition systems were never designed for Africa. They were trained on Western speech, evaluated on Western benchmarks, and optimized for clean audio and high-resource languages. The result? Models that perform well on global leaderboards—but break down in the real world across African healthcare, finance, government, and customer support.

Today, we’re changing that.

We’re proud to introduce Intron Sahara v2: a production-ready suite of speech recognition models built specifically for African languages, African accents, and African realities.

The Problem: When Global Models Meet African Speech

Across hospitals, call centers, banks, and public services, we consistently hear the same feedback:

“It works on demos, but fails with real users.”
“It struggles with African names and entities.”
“Numbers, currencies, IDs—everything gets mangled.”
“Silence and background noise cause hallucinations.”

These are not edge cases in Africa—they are the default conditions:

Heavy accent variation
Code-switching across languages
Dense named entities (people, places, organizations)
Numeric-heavy speech (amounts, IDs, dosages, balances)
Noisy, low-resource, conversational audio

Sahara v2 was built for this reality—not adapted to it.

Meet the Sahara v2 Suite

Sahara v2 (ASR)

Africa’s first production-grade bilingual and multilingual speech recognition models, supporting:

Accented English & French
20+ African languages
500+ African accents
Optimized for short audio clips, conversational speech, and limited context

It delivers state-of-the-art performance across:

Medical, finance, legal, and call-center speech
African personal, organizational, and geographic names
Numeric precision: currencies, decimals, IDs, measurements
Noise, silence, and overlapping speakers

Sahara TTS

High-quality text-to-speech with:

Accented English & French
5+ African languages
Voices that sound local, familiar, and natural

Designed for IVR, voice bots, education, and public-facing services—without the “imported accent” problem.

Performance That Actually Transfers to the Real World

Multilingual African Speech (Afrivox Benchmark)

Across 20+ African languages, Sahara v2 consistently delivers the lowest Word Error Rates, often outperforming strong multilingual baselines by 2×–7×.

Languages where Sahara v2 is state-of-the-art include:
Ga, Twi, Igbo, Yoruba, Sesotho, Pedi, Tswana, Zulu, Hausa, Shona, Swahili, and more.
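For readers new to the metric, Word Error Rate is conventionally the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that standard definition only. It is not the AfriVox scoring code, the lowercase-and-split normalization is an assumption, and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not benchmark data): one dropped word out of six.
wer = word_error_rate(
    "pay two thousand naira to adaeze",
    "pay two thousand to adaeze",
)
print(f"WER: {wer:.1%}")  # WER: 16.7%
```

A single dropped word in a six-word utterance already costs about 17 points of WER, which is why short, numeric-heavy speech is so unforgiving.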

African Accented English (Industry-Specific Benchmarks)

On accented English across medical, parliamentary, and conversational speech, Sahara v2 achieves:

WER <15% across all major domains
~18% on medical conversations
~12% on African named entities
~8% on numeric-heavy financial speech

Industry-Leading Noise & Silence Robustness

Unlike many general-purpose models, Sahara v2 is explicitly trained to handle:

Silent segments (no hallucinations)
Short utterances
Intervening silence
Background noise
Overlapping speakers
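One simple way to check the "no hallucinations on silence" property in your own evaluation is to run known-silent clips through a recognizer and count how often any text comes back. The sketch below assumes you have already collected those transcripts as strings. The function name and the sample outputs are hypothetical, and this is not Sahara's evaluation procedure.

```python
def silence_hallucination_rate(transcripts: list[str]) -> float:
    """Fraction of known-silent clips for which a model emitted any text."""
    if not transcripts:
        raise ValueError("need at least one transcript")
    # Whitespace-only output counts as correctly empty.
    hallucinated = sum(1 for text in transcripts if text.strip())
    return hallucinated / len(transcripts)


# Hypothetical outputs from four silent clips: one contains invented text.
outputs = ["", "   ", "thank you for watching", ""]
print(silence_hallucination_rate(outputs))  # 0.25
```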

Built With the Community, For the Community

Sahara v2 is powered by millions of audio clips contributed by speakers across Africa—spanning languages, accents, professions, and environments.

We want to go further.

Build With Sahara – Developer Challenge

We’re inviting developers, data scientists, and startups to build the next generation of African voice applications using Sahara.

Health, finance, telco, education, agriculture, and more
Benchmark Sahara against global alternatives
Win prizes, visibility, and partnerships

📢 Africa’s voice ecosystem grows when we build together.

Built for Real Use, at Real Scale

Speech Recognition That Finally Understands Africa

Developers can integrate Sahara-v2 using the new streamlined widget or deploy with full offline support. Proven in real-world deployments with partners including Penda Health, Data.FI, ARM, and State High Courts across Nigeria, Kenya, South Africa, and Eswatini, Sahara-v2 is transforming how organizations serve their customers.

Privacy

Sahara-v2 functions without internet connectivity. By processing all data locally, the system ensures privacy and security for sensitive environments such as healthcare, legal, and finance. This on-device approach protects confidential records, supports regulatory compliance, and enables reliable deployment in remote or low-connectivity locations.

Speech recognition with exceptional accuracy and depth

Sahara-v2 delivers state-of-the-art performance for African speech understanding, supporting African French and 23 new African languages. It outperforms competing models, achieving 25% better overall performance compared with Meta Omni-language ASR and Gemini-3.

On the AfriVox Transcribe Benchmark, Sahara-v2 demonstrates exceptional accuracy where precision is critical. It performs over 64% better on African names, locations, and organizations (AFRINAMES) compared to models such as Gemini-3 and Azure, and over 35% better on numbers, IDs, decimals, and currency. It also proves reliable across real-world use cases, performing over 25% better across key verticals, including health, legal, finance, and call center audio.

Beyond single-language recognition, Sahara-v2 advances multilingual understanding with the world’s first bilingual Swahili-English ASR model. It also demonstrates strong robustness in challenging audio conditions, testing over 20% better on background noise, overlapping speakers, and silence compared to competitors.

Highlights

Sahara-v2 pushes the boundaries of speech understanding with improved robustness, delivering enhanced acoustic modeling to help you handle even the most challenging audio scenarios.

In testing, Sahara-v2 performs over 20% better than competitors such as Gemini-3 and Azure on robustness, specifically background noise, overlapping speakers, and silence. It also proves reliable in specialized fields, performing over 25% better across verticals such as health, legal, finance, and call center audio. These results are backed by real-world deployments with partners such as Penda Health and the Ogun State High Court.

Sahara-v2 helps you transcribe, understand, build, and connect anything

Transcribe anything

Sahara-v2 is built from the ground up to master the complexities of African speech, supporting over 500 accents while thriving in real-world acoustic environments. It delivers production-ready transcription across medical, legal, and call center sectors, outperforming industry standards by over 25% in these critical verticals. By supporting 57 languages, including 23 new African languages and African French, the model leverages offline and parallel processing to provide high-performance transcription at an enterprise scale.

More than just processing audio, Sahara-v2 understands how people actually speak. It introduces the world’s first bilingual Swahili–English ASR, enabling seamless code-switching and natural conversation capture, a breakthrough validated by Penda Health. Sahara-v2 is built for high-stakes enterprise settings, delivering precise transcription of African names, numbers, and citations. Even with heavy background noise or overlapping speech, it maintains superior robustness. Partners such as the Ogun and Yobe State High Courts, ARM, and Data.FI already rely on this capability.

Benchmark Excellence

Sahara-v2 has emerged as the leading model in the African linguistic landscape, consistently delivering the lowest Word Error Rates (WER) across every language evaluated. Its strongest benchmarks come in Pidgin (5%), Kinyarwanda (10%), and Swahili (11%). This is most evident in high-stakes comparisons: in Kinyarwanda, it secures a 10% WER against Gemini-3.0-flash’s 40%, while in Tswana, it holds a 22% WER as Gemini falters at 77%.

Across the other evaluated languages, Sahara-v2 also records the lowest error rates: Twi (11%), Zulu (16%), Hausa (18%), Shona (18%), Yoruba (19%), Luganda (19%), Igbo (20%), and Pedi (23%). Its most significant result is in Pidgin. While Meta-Omni-ASR and Gemini-3.0-flash fail to produce any measurable output for the language, Sahara-v2 transcribes it at a 5% WER. These findings position Sahara-v2 as the strongest evaluated option for low-resource African language transcription.
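The head-to-head figures quoted above can be read two ways: as an error ratio (how many times more errors the baseline makes) or as a relative WER reduction. The short sketch below works both out from the Kinyarwanda and Tswana numbers cited in this section. The helper names are illustrative, and the post does not state which convention its "X% better" figures use.

```python
def error_ratio(baseline_wer: float, model_wer: float) -> float:
    """How many times higher the baseline's WER is than the model's."""
    return baseline_wer / model_wer


def relative_wer_reduction(baseline_wer: float, model_wer: float) -> float:
    """Share of the baseline's errors that the model removes."""
    return (baseline_wer - model_wer) / baseline_wer


# WER percentages quoted above: (Sahara-v2, Gemini-3.0-flash).
cited = {"Kinyarwanda": (10.0, 40.0), "Tswana": (22.0, 77.0)}

for language, (sahara, gemini) in cited.items():
    ratio = error_ratio(gemini, sahara)
    reduction = relative_wer_reduction(gemini, sahara)
    print(f"{language}: {ratio:.1f}x fewer errors, {reduction:.1%} relative reduction")
# Kinyarwanda: 4.0x fewer errors, 75.0% relative reduction
# Tswana: 3.5x fewer errors, 71.4% relative reduction
```

Both ratios fall inside the 2×–7× range quoted for the multilingual benchmark.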

64%+ higher accuracy on African names, locations, and organizations
35%+ higher accuracy on numbers, IDs, decimals, and currency
25%+ better performance across healthcare, legal, finance, and call-center audio
20%+ stronger robustness in noisy, real-world environments

Understand anything

Sahara-v2 was built to process African speech across languages, accents, and acoustic conditions. This includes the world’s first bilingual Swahili-English ASR model, support for 500+ accents through its new Accented English model, and resilience in challenging acoustic environments. By combining state-of-the-art acoustic modeling, multilingual performance across 57 languages, and full offline capability, Sahara-v2 pushes the frontier of African speech understanding.

Legal professionals, such as those at the Ogun State High Court, can rely on Sahara-v2 to capture African names, locations, and organizations with precision — a task where the model performs over 64% better than competitors. Healthcare providers like Penda Health and Data.FI can also record patient consultations in bilingual Swahili-English, as Sahara-v2 performs over 25% better across verticals, including health.

Sahara-v2 also processes audio from busy call center environments, such as deployments with partner ARM, filtering through background noise and overlapping speakers to deliver accurate transcripts. In these conditions, the model performs over 20% better in robustness than other major speech models.

Build anything

Sahara-v2 makes any voice application possible for African contexts. It handles complex acoustic variations to enable reliable voice interfaces, testing over 20% better in robustness against background noise, overlapping speakers, and silence compared to competitors.

Sahara-v2 is our most capable voice AI model built for real-world use so far. It makes applications more accessible with specialized capabilities for Voice Bots, Voice Autofill for KYC, application, and admission forms, and Voice Banking. The model supports accurate capture of structured data like names, IDs, and numbers, performing over 35% better where precision matters. It also handles safety-critical voice interactions, performing over 25% better across verticals such as health, legal, and finance.

You can now build with Sahara-v2 using the new streamlined widget for integrations and participate in the Build with Sahara Developer Challenge. It is also available with offline support to enable deployment in diverse environments.

Accelerating African voice AI development with Sahara-v2

As voice AI accelerates with Sahara-v2, we are enhancing the developer experience for African voice applications. Today, we are introducing a new streamlined widget for integrations and launching the Build with Sahara Developer Challenge to help developers build and deploy solutions faster.

Sahara-v2’s advanced speech understanding and offline support give developers the tools to build robust voice applications. They can now tap into capabilities such as Voice Bots (available in 7 languages and accented English), Voice Autofill for KYC, application, and admission forms, and Voice Banking for command-driven fintech interactions. These features are built to handle complex African contexts, backed by Sahara-v2’s performance across 23 new African languages.

Connect anything

Sahara-v2 marks a significant step forward in speech recognition, particularly in how it handles challenging audio environments. It tests over 20% better than competitors like Gemini-3 and Azure when it comes to background noise, overlapping speakers, and silence. It handles natural language complexity through the world’s first bilingual Swahili-English ASR model and voice bot support across seven languages, and enables you to connect any application or workflow through reliable voice interaction.

Building Sahara-v2 for critical domains

The model is built for critical domains, performing over 25% better across verticals, including health, legal, and finance. To support these sectors, we are introducing Voice Banking for command-driven fintech interactions and Voice Autofill for processing sensitive data in KYC, application, and admission forms.

Sahara-v2 has been validated by partners in high-stakes, real-world environments. This includes legal applications with the Ogun State High Court and Yobe State High Court, as well as healthcare implementations with Penda Health (Kenya) and Data.FI (Eswatini & Nigeria).

The Sahara-v2 Era Begins: A New Milestone in African Innovation

The Sahara-v2 era starts today. Here is what is rolling out:

For Enterprises:

Higher accuracy where it actually matters
Fewer errors, retries, and manual corrections
Better customer experience and operational efficiency

For Developers:

Models that finally work for African users
Clear APIs, strong documentation, real benchmarks

For Investors & Partners:

Proof that region-specific AI wins in real markets
Defensible data, infrastructure, and deployment advantage

For Africa:

Technology that listens, understands, and includes

Join the Build with Sahara Developer Challenge

We are inviting developers across Africa to build the next generation of voice-enabled applications with Sahara-v2. The new streamlined widget for integrations and our offline support are now available to help you build robust solutions.

What you can build:

Real impact across Africa

Humanizing Automated Support in Financial Services

ARM is leveraging Sahara-v2 to optimize customer service operations within its call centers. The system is specifically validated for high-traffic environments, designed to maintain clarity even when dealing with background noise and overlapping speakers. By utilizing specialized accented English models, ARM ensures that diverse customer voices are understood accurately, providing a more reliable automated support experience than standard global alternatives.

Letting Doctors Be Doctors Again

Penda Health relies on the world’s first bilingual Swahili-English model to capture patient interactions, a critical feature for environments where speakers naturally switch between languages (code-switching). These healthcare deployments are supported by Voice Autofill, which helps streamline admission forms and ensures that domain-specific medical terminology is captured with high precision.

The Nigerian Judiciary: Protecting the Integrity of the Record

The Ogun State High Court and Yobe State High Court in Nigeria have adopted Sahara-v2 to support legal transcription. To meet strict requirements for privacy and reliability, these courts utilize the system’s offline capabilities, allowing sensitive proceedings to be processed locally without depending on internet connectivity. The technology is specifically optimized to recognize African names, locations, and organizations, ensuring that the judicial record remains accurate and preserves local context.

Availability

Sahara-v2 is available now for developers, enterprises, and organizations across Africa.
