• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Thursday, November 13, 2025
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Technology And Software

Meta returns to open source AI with Omnilingual ASR models that can transcribe 1,600+ languages natively

Josh by Josh
November 11, 2025
in Technology And Software
0
Meta returns to open source AI with Omnilingual ASR models that can transcribe 1,600+ languages natively
0
SHARES
1
VIEWS
Share on FacebookShare on Twitter



Meta has just released a new multilingual automatic speech recognition (ASR) system supporting 1,600+ languages — dwarfing OpenAI’s open source Whisper model, which supports just 99.

READ ALSO

Weibo's new open source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget

Our favorite 2025 advent calendars from Lego, Pokémon, Funko Pop, Magna-Tiles and more

Is architecture also allows developers to extend that support to thousands more. Through a feature called zero-shot in-context learning, users can provide a few paired examples of audio and text in a new language at inference time, enabling the model to transcribe additional utterances in that language without any retraining.

In practice, this expands potential coverage to more than 5,400 languages — roughly every spoken language with a known script.

It’s a shift from static model capabilities to a flexible framework that communities can adapt themselves. So while the 1,600 languages reflect official training coverage, the broader figure represents Omnilingual ASR’s capacity to generalize on demand, making it the most extensible speech recognition system released to date.

Best of all: it's been open sourced under a plain Apache 2.0 license — not a restrictive, quasi open-source Llama license like the company's prior releases, which limited use by larger enterprises unless they paid licensing fees — meaning researchers and developers are free to take and implement it right away, for free, without restrictions, even in commercial and enterprise-grade projects!

Released on November 10 on Meta's website, Github, along with a demo space on Hugging Face and technical paper, Meta’s Omnilingual ASR suite includes a family of speech recognition models, a 7-billion parameter multilingual audio representation model, and a massive speech corpus spanning over 350 previously underserved languages.

All resources are freely available under open licenses, and the models support speech-to-text transcription out of the box.

“By open sourcing these models and dataset, we aim to break down language barriers, expand digital access, and empower communities worldwide,” Meta posted on its @AIatMeta account on X

Designed for Speech-to-Text Transcription

At its core, Omnilingual ASR is a speech-to-text system.

The models are trained to convert spoken language into written text, supporting applications like voice assistants, transcription tools, subtitles, oral archive digitization, and accessibility features for low-resource languages.

Unlike earlier ASR models that required extensive labeled training data, Omnilingual ASR includes a zero-shot variant.

This version can transcribe languages it has never seen before—using just a few paired examples of audio and corresponding text.

This lowers the barrier for adding new or endangered languages dramatically, removing the need for large corpora or retraining.

Model Family and Technical Design

The Omnilingual ASR suite includes multiple model families trained on more than 4.3 million hours of audio from 1,600+ languages:

  • wav2vec 2.0 models for self-supervised speech representation learning (300M–7B parameters)

  • CTC-based ASR models for efficient supervised transcription

  • LLM-ASR models combining a speech encoder with a Transformer-based text decoder for state-of-the-art transcription

  • LLM-ZeroShot ASR model, enabling inference-time adaptation to unseen languages

All models follow an encoder–decoder design: raw audio is converted into a language-agnostic representation, then decoded into written text.

Why the Scale Matters

While Whisper and similar models have advanced ASR capabilities for global languages, they fall short on the long tail of human linguistic diversity. Whisper supports 99 languages. Meta’s system:

  • Directly supports 1,600+ languages

  • Can generalize to 5,400+ languages using in-context learning

  • Achieves character error rates (CER) under 10% in 78% of supported languages

Among those supported are more than 500 languages never previously covered by any ASR model, according to Meta’s research paper.

This expansion opens new possibilities for communities whose languages are often excluded from digital tools

Here’s the revised and expanded background section, integrating the broader context of Meta’s 2025 AI strategy, leadership changes, and Llama 4’s reception, complete with in-text citations and links:

Background: Meta’s AI Overhaul and a Rebound from Llama 4

The release of Omnilingual ASR arrives at a pivotal moment in Meta’s AI strategy, following a year marked by organizational turbulence, leadership changes, and uneven product execution.

Omnilingual ASR is the first major open-source model release since the rollout of Llama 4, Meta’s latest large language model, which debuted in April 2025 to mixed and ultimately poor reviews, with scant enterprise adoption compared to Chinese open source model competitors.

The failure led Meta founder and CEO Mark Zuckerberg to appoint Alexandr Wang, co-founder and prior CEO of AI data supplier Scale AI, as Chief AI Officer, and embark on an extensive and costly hiring spree that shocked the AI and business communities with eye-watering pay packages for top AI researchers.

In contrast, Omnilingual ASR represents a strategic and reputational reset. It returns Meta to a domain where the company has historically led — multilingual AI — and offers a truly extensible, community-oriented stack with minimal barriers to entry.

The system’s support for 1,600+ languages and its extensibility to over 5,000 more via zero-shot in-context learning reassert Meta’s engineering credibility in language technology.

Importantly, it does so through a free and permissively licensed release, under Apache 2.0, with transparent dataset sourcing and reproducible training protocols.

This shift aligns with broader themes in Meta’s 2025 strategy. The company has refocused its narrative around a “personal superintelligence” vision, investing heavily in infrastructure (including a September release of custom AI accelerators and Arm-based inference stacks) source while downplaying the metaverse in favor of foundational AI capabilities. The return to public training data in Europe after a regulatory pause also underscores its intention to compete globally, despite privacy scrutiny source.

Omnilingual ASR, then, is more than a model release — it’s a calculated move to reassert control of the narrative: from the fragmented rollout of Llama 4 to a high-utility, research-grounded contribution that aligns with Meta’s long-term AI platform strategy.

Community-Centered Dataset Collection

To achieve this scale, Meta partnered with researchers and community organizations in Africa, Asia, and elsewhere to create the Omnilingual ASR Corpus, a 3,350-hour dataset across 348 low-resource languages. Contributors were compensated local speakers, and recordings were gathered in collaboration with groups like:

  • African Next Voices: A Gates Foundation–supported consortium including Maseno University (Kenya), University of Pretoria, and Data Science Nigeria

  • Mozilla Foundation’s Common Voice, supported through the Open Multilingual Speech Fund

  • Lanfrica / NaijaVoices, which created data for 11 African languages including Igala, Serer, and Urhobo

The data collection focused on natural, unscripted speech. Prompts were designed to be culturally relevant and open-ended, such as “Is it better to have a few close friends or many casual acquaintances? Why?” Transcriptions used established writing systems, with quality assurance built into every step.

Performance and Hardware Considerations

The largest model in the suite, the omniASR_LLM_7B, requires ~17GB of GPU memory for inference, making it suitable for deployment on high-end hardware. Smaller models (300M–1B) can run on lower-power devices and deliver real-time transcription speeds.

Performance benchmarks show strong results even in low-resource scenarios:

  • CER <10% in 95% of high-resource and mid-resource languages

  • CER <10% in 36% of low-resource languages

  • Robustness in noisy conditions and unseen domains, especially with fine-tuning

The zero-shot system, omniASR_LLM_7B_ZS, can transcribe new languages with minimal setup. Users provide a few sample audio–text pairs, and the model generates transcriptions for new utterances in the same language.

Open Access and Developer Tooling

All models and the dataset are licensed under permissive terms:

  • Apache 2.0 for models and code

  • CC-BY 4.0 for the Omnilingual ASR Corpus on HuggingFace

Installation is supported via PyPI and uv:

pip install omnilingual-asr

Meta also provides:

  • A HuggingFace dataset integration

  • Pre-built inference pipelines

  • Language-code conditioning for improved accuracy

Developers can view the full list of supported languages using the API:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs

print(len(supported_langs))
print(supported_langs)

Broader Implications

Omnilingual ASR reframes language coverage in ASR from a fixed list to an extensible framework. It enables:

  • Community-driven inclusion of underrepresented languages

  • Digital access for oral and endangered languages

  • Research on speech tech in linguistically diverse contexts

Crucially, Meta emphasizes ethical considerations throughout—advocating for open-source participation and collaboration with native-speaking communities.

“No model can ever anticipate and include all of the world’s languages in advance,” the Omnilingual ASR paper states, “but Omnilingual ASR makes it possible for communities to extend recognition with their own data.”

Access the Tools

All resources are now available at:

  • Code + Models: github.com/facebookresearch/omnilingual-asr

  • Dataset: huggingface.co/datasets/facebook/omnilingual-asr-corpus

  • Blogpost: ai.meta.com/blog/omnilingual-asr

What This Means for Enterprises

For enterprise developers, especially those operating in multilingual or international markets, Omnilingual ASR significantly lowers the barrier to deploying speech-to-text systems across a broader range of customers and geographies.

Instead of relying on commercial ASR APIs that support only a narrow set of high-resource languages, teams can now integrate an open-source pipeline that covers over 1,600 languages out of the box—with the option to extend it to thousands more via zero-shot learning.

This flexibility is especially valuable for enterprises working in sectors like voice-based customer support, transcription services, accessibility, education, or civic technology, where local language coverage can be a competitive or regulatory necessity. Because the models are released under the permissive Apache 2.0 license, businesses can fine-tune, deploy, or integrate them into proprietary systems without restrictive terms.

It also represents a shift in the ASR landscape—from centralized, cloud-gated offerings to community-extendable infrastructure. By making multilingual speech recognition more accessible, customizable, and cost-effective, Omnilingual ASR opens the door to a new generation of enterprise speech applications built around linguistic inclusion rather than linguistic limitation.



Source_link

Related Posts

Weibo's new open source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget
Technology And Software

Weibo's new open source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget

November 13, 2025
Our favorite 2025 advent calendars from Lego, Pokémon, Funko Pop, Magna-Tiles and more
Technology And Software

Our favorite 2025 advent calendars from Lego, Pokémon, Funko Pop, Magna-Tiles and more

November 13, 2025
DHS Kept Chicago Police Records for Months in Violation of Domestic Espionage Rules
Technology And Software

DHS Kept Chicago Police Records for Months in Violation of Domestic Espionage Rules

November 13, 2025
‘Chad: The Brainrot IDE’ is a new Y Combinator-backed product so wild, people thought it was fake
Technology And Software

‘Chad: The Brainrot IDE’ is a new Y Combinator-backed product so wild, people thought it was fake

November 13, 2025
Is Business Central Same as Dynamics 365 CRM or ERP?
Technology And Software

Is Business Central Same as Dynamics 365 CRM or ERP?

November 12, 2025
Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini
Technology And Software

Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini

November 12, 2025
Next Post

3 steps to building a workforce that thinks with AI

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

POPULAR NEWS

Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
7 Best EOR Platforms for Software Companies in 2025

7 Best EOR Platforms for Software Companies in 2025

June 21, 2025

EDITOR'S PICK

Preparation is the real response in crisis comms

October 13, 2025
Building Unshakeable Brand Loyalty: A PR Strategy Guide for Supplement Companies

Building Unshakeable Brand Loyalty: A PR Strategy Guide for Supplement Companies

September 20, 2025
Future Horizons May Report on Semiconductor Trends

Future Horizons May Report on Semiconductor Trends

June 4, 2025
Scientists Unveil AI That Runs on Light, Not Power-Hungry Chips

Scientists Unveil AI That Runs on Light, Not Power-Hungry Chips

September 18, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • Gamification In Financial Literacy: Trends And Examples
  • After‑School Care That Boosts Academic Success
  • Weibo's new open source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget
  • Talk to Your TV — Bitmovin’s Agentic AI Hub Quietly Redefines How We Watch
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions

Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?