mGrowTech

NVIDIA AI Just Released the Largest Open-Source Speech AI Dataset and State-of-the-Art Models for European Languages

by Josh
August 16, 2025
in AI, Analytics and Automation


NVIDIA has taken a major step in multilingual speech AI by unveiling Granary, the largest open-source speech dataset for European languages, along with two state-of-the-art models: Canary-1b-v2 and Parakeet-tdt-0.6b-v3. The release sets a new standard for accessible, high-quality resources in automatic speech recognition (ASR) and automatic speech translation (AST), especially for underrepresented European languages.

Granary: The Foundation of Multilingual Speech AI

Granary is a massive multilingual corpus developed in collaboration with Carnegie Mellon University and Fondazione Bruno Kessler. It delivers around one million hours of audio: 650,000 hours for speech recognition and 350,000 hours for speech translation. The dataset covers 25 European languages (nearly all official EU languages, plus Russian and Ukrainian), with a particular focus on languages that have little annotated data, such as Croatian, Estonian, and Maltese.

Key features:

  • Largest open-source speech dataset for 25 European languages.
  • Pseudo-labeling pipeline: Unlabeled public audio is processed with the NVIDIA NeMo Speech Data Processor, which adds structure and improves quality, reducing the need for resource-intensive manual annotation.
  • Supports both ASR and AST: Designed for transcription and translation tasks.
  • Open access: Available to the global developer community for flexible, production-scale model training.
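
The pseudo-labeling idea in the feature list above can be illustrated with a minimal, generic sketch. This is not the NeMo Speech Data Processor API or the Granary pipeline itself: the `transcribe` callable, the `PseudoLabel` type, and the 0.9 confidence threshold are all hypothetical, chosen only to show how machine-generated transcripts might be filtered before they are used as training labels.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class PseudoLabel(NamedTuple):
    """A machine-generated transcript kept as a training label (hypothetical type)."""
    audio_path: str
    text: str
    confidence: float


def pseudo_label(
    audio_paths: Iterable[str],
    transcribe: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.9,  # assumed threshold, not taken from the article
) -> List[PseudoLabel]:
    """Label unlabeled audio with an existing model and keep confident results.

    `transcribe` is a stand-in for any ASR system that returns
    (transcript, confidence in [0, 1]) for a given audio file.
    """
    kept = []
    for path in audio_paths:
        text, confidence = transcribe(path)
        text = " ".join(text.split())  # normalize whitespace
        # Drop empty transcripts and anything the model was unsure about.
        if text and confidence >= min_confidence:
            kept.append(PseudoLabel(path, text, confidence))
    return kept
```

In practice the filtering criteria would be richer than a single confidence score, but the shape is the same: an existing model proposes labels, and automated checks decide which ones are trusted.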

Because the data is clean and high quality, models trained on Granary converge significantly faster. The accompanying research reports that developers need about half as much Granary data to reach a target accuracy as they would with competing datasets, which makes it especially valuable for resource-constrained languages and rapid prototyping.

Canary-1b-v2: Multilingual ASR + Translation (En ↔ 24 Languages)

Canary-1b-v2 is a billion-parameter Encoder-Decoder model trained on Granary, delivering high-quality transcription and translation between English and 24 supported European languages.

It’s architected for accuracy and multitask capabilities:

  • Languages supported: 25 European languages, expanding Canary’s coverage from 4 languages.
  • State-of-the-art performance: Accuracy comparable to models three times larger, with up to 10× faster inference.
  • Multitask capability: Robust across both ASR and AST tasks.
  • Features: Automatic punctuation, capitalization, and word- and segment-level timestamps, including timestamps on translated output.
  • Architecture: FastConformer Encoder with Transformer Decoder; unified vocabulary for all languages via SentencePiece tokenizer.
  • Robustness: Maintains strong performance under noisy conditions and resists output hallucinations.
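
To make the "En ↔ 24 languages" design concrete, here is a small sketch of how an application might decide whether a request is transcription (ASR) or translation (AST) before calling the model. It is an illustration only: `plan_task`, `SpeechTask`, and `EXAMPLE_LANGS` are hypothetical names, and the language set is a partial subset limited to languages the article mentions by name, not the full list of 25.

```python
from dataclasses import dataclass

# Illustrative subset only (ISO 639-1 codes for languages named in the article):
# English, Croatian, Estonian, Maltese, Russian, Ukrainian.
EXAMPLE_LANGS = frozenset({"en", "hr", "et", "mt", "ru", "uk"})


@dataclass(frozen=True)
class SpeechTask:
    kind: str          # "asr" (transcription) or "ast" (translation)
    source_lang: str
    target_lang: str


def plan_task(source_lang: str, target_lang: str,
              supported: frozenset = EXAMPLE_LANGS) -> SpeechTask:
    """Classify a request as ASR or AST, rejecting unsupported pairs."""
    for code in (source_lang, target_lang):
        if code not in supported:
            raise ValueError(f"unsupported language code: {code!r}")
    if source_lang == target_lang:
        # Same language in and out: plain transcription.
        return SpeechTask("asr", source_lang, target_lang)
    if "en" in (source_lang, target_lang):
        # Translation is described only to or from English.
        return SpeechTask("ast", source_lang, target_lang)
    raise ValueError("translation is only supported between English and another language")
```

For example, `plan_task("et", "en")` yields an AST task (Estonian speech to English text), while `plan_task("hr", "et")` is rejected because neither side is English.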

Evaluation highlights:

  • ASR Word Error Rate (WER): 7.15% (AMI dataset), 10.82% (LibriSpeech Clean).
  • AST COMET Scores: 79.3 (X→English), 84.56 (English→X).
  • Deployment: Available under CC BY 4.0 license; optimized for Nvidia GPU-accelerated systems, enabling fast training and inference for scalable production use.
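
For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a plain implementation of that standard definition; it is not the evaluation code behind the figures above, and it skips the text normalization (casing, punctuation) that real benchmarks usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

A hypothesis with one wrong word out of a six-word reference scores 1/6, or roughly 16.7% WER. Lower is better, and values above 100% are possible when the hypothesis contains many inserted words.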

Parakeet-tdt-0.6b-v3: Real-Time Multilingual ASR

Parakeet-tdt-0.6b-v3 is a 600-million-parameter multilingual ASR model designed for high-throughput or large-volume transcription in all 25 supported languages. It extends the Parakeet family (previously English-centric) to full European coverage.

  • Automatic language detection: Transcribes input audio without needing extra prompts.
  • Real-time capability: Efficiently transcribes up to 24-minute audio segments in a single inference pass.
  • Fast, scalable, and commercial-ready: Prioritizes low latency, batch processing, and accurate outputs, with word-level timestamps, punctuation, and capitalization.
  • Robustness: Reliable even on complex content (numbers, lyrics) and challenging audio conditions.
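
Recordings longer than the 24-minute window mentioned above would need to be split before transcription. The following is a minimal sketch of one way to compute segment boundaries. The fixed-window strategy and the `split_into_segments` helper are assumptions for illustration, not part of the model or its tooling.

```python
from typing import List, Tuple

MAX_SEGMENT_SECONDS = 24 * 60  # the 24-minute limit cited in the article


def split_into_segments(
    total_seconds: float,
    max_seconds: float = MAX_SEGMENT_SECONDS,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds covering the whole recording,
    with no segment longer than max_seconds."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    if max_seconds <= 0:
        raise ValueError("max_seconds must be positive")

    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments
```

A one-hour recording yields three segments: two full 24-minute windows and a final 12-minute remainder. Cutting at fixed offsets can split a word in half, so a production system would more likely cut at silences or overlap adjacent windows.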

Impact on Speech AI Development

Nvidia’s Granary dataset and model suite accelerate the democratization of speech AI for Europe, enabling scalable development of:

  • Multilingual chatbots
  • Customer service voice agents
  • Near-real-time translation services

Developers, researchers, and businesses can now build inclusive, high-quality applications that support linguistic diversity, with open access to these models and datasets.



Asif Razzaq is the CEO of Marktechpost Media Inc. As an entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views.


