Updated 9 February 2026 at 16:01 IST

SarvamAI’s Daily Drops: Bulbul V3, Sarvam Vision, Samvaad, Audio, and Dub

SarvamAI unveils Bulbul V3, Sarvam Vision, Samvaad, Sarvam Audio, and Sarvam Dub, aiming to set new AI benchmarks for Indian languages. Co-founder Pratyush Kumar highlights India’s ambition to build across the full AI stack, from models to real-world applications.



Image: Sarvam AI

New Delhi: Over the past two weeks, Pratyush Kumar, co-founder of SarvamAI, has rolled out a series of announcements that together showcase the company’s vision to build across the “full stack” of artificial intelligence, from compute and data to models and applications.

Bulbul V3: Raising the Bar in Text-to-Speech

Sarvam unveiled Bulbul V3, its latest text-to-speech model, which pairs precise diction with human-like naturalness. Independent third-party listening studies found that Bulbul V3 delivered the highest listener preference and the lowest error rates across languages and use cases. Stress tests on numerics, technical content, and named entities confirmed its robustness. To encourage adoption, Sarvam has opened unlimited usage through February, inviting developers and enterprises to experiment widely.

Sarvam Vision: A Leap in Multilingual Digitisation

The company introduced Sarvam Vision, a 3-billion-parameter state-space vision-language model. It is competitive with leading English digitisation systems and supports all 22 scheduled Indian languages, setting a new benchmark for them. On Indian language tasks, Sarvam Vision has outperformed global peers, underscoring the company’s focus on inclusivity.

Samvaad: Conversational Agents at Scale

Sarvam’s Samvaad platform has scaled to over a million minutes of interactions daily, powering use cases that range from population-scale outreach campaigns and hybrid onboarding journeys over phone and WhatsApp to 24/7 sales assistants. The company reports that its estimate of the total addressable market (TAM) grows each month as new applications emerge.

Sarvam Audio: Benchmarking Speech Recognition

The newly launched Sarvam Audio model sets a new standard in speech recognition for Indian languages. It has significantly outperformed Gemini 3 and GPT‑4o Transcribe across a range of benchmarks, highlighting Sarvam’s ability to compete with global leaders while tailoring solutions for India’s linguistic diversity.

Sarvam Dub: Real-Time Dubbing Breakthrough

In a first-of-its-kind deployment, Sarvam powered a live, multilingual broadcast of the Finance Minister’s budget speech on Republic TV. The system kept latency under two minutes while retaining speaker similarity, reaching millions of homes and setting a new standard for real-time dubbing.

Blending Culture with Code

Reflecting its ethos of building across the full stack, Sarvam even composed its own soundtrack for the Bulbul V3 launch, merging technology with cultural expression.

The Road Ahead

Together, these announcements highlight SarvamAI’s aim to shape India’s most ambitious decade of technology building. By pushing boundaries in speech, vision, audio, and dubbing, the company is not only competing against global benchmarks but also setting new standards for Indian languages and population-scale applications.

Published By: Priya Pathak

Published On: 9 February 2026 at 16:01 IST
