• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Monday, June 8, 2026
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Al, Analytics and Automation

Liquid AI Released LFM2-Audio-1.5B: An End-to-End Audio Foundation Model with Sub-100 ms Response Latency

Josh by Josh
October 1, 2025
in Al, Analytics and Automation
0


Liquid AI has released LFM2-Audio-1.5B, a compact audio–language foundation model that both understands and generates speech and text through a single end-to-end stack. It positions itself for low-latency, real-time assistants on resource-constrained devices, extending the LFM2 family into audio while retaining a small footprint.

https://www.liquid.ai/blog/lfm2-audio-an-end-to-end-audio-foundation-model

But what’s actually new? a unified backbone with disentangled audio I/O

LFM2-Audio extends the 1.2B-parameter LFM2 language backbone to treat audio and text as first-class sequence tokens. Crucially, the model disentangles audio representations: inputs are continuous embeddings projected directly from raw waveform chunks (~80 ms), while outputs are discrete audio codes. This avoids discretization artifacts on the input path while keeping training and generation autoregressive for both modalities on the output path.

On the implementation side, the released checkpoint uses:

🚨 [Recommended Read] ViPE (Video Pose Engine): A Powerful and Versatile 3D Video Annotation Tool for Spatial AI
  • Backbone: LFM2 (hybrid conv + attention), 1.2B params (LM only)
  • Audio encoder: FastConformer (~115M, canary-180m-flash)
  • Audio decoder: RQ-Transformer predicting discrete Mimi codec tokens (8 codebooks)
  • Context: 32,768 tokens; vocab: 65,536 (text) / 2049×8 (audio)
  • Precision: bfloat16; license: LFM Open License v1.0; languages: English
https://www.liquid.ai/blog/lfm2-audio-an-end-to-end-audio-foundation-model

Two generation modes for real-time agents

  • Interleaved generation for live, speech-to-speech chat where the model alternates text and audio tokens to minimize perceived latency.
  • Sequential generation for ASR/TTS (switching modalities turn-by-turn).

Liquid AI provides a Python package (liquid-audio) and a Gradio demo to reproduce these behaviors.

Latency: <100 ms to first audio

Liquid AI team reports end-to-end latency below 100 ms from a 4-second audio query to the first audible response—a proxy for perceived responsiveness in interactive use—stating it is faster than models smaller than 1.5B parameters under their setup.

READ ALSO

ClawHub Security Signals: A Coding Guide to End-to-End Security Signal Analysis and Verdict Classification on the AI Skills Dataset

Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription

Benchmarks: VoiceBench and ASR results

On VoiceBench—a suite of nine audio-assistant evaluations—Liquid reports an overall score of 56.78 for LFM2-Audio-1.5B, with per-task numbers disclosed in the blog’s chart (e.g., AlpacaEval 3.71, CommonEval 3.49, WildVoice 3.17). The Liquid AI team contrasts this result with larger models like Qwen2.5-Omni-3B and Moshi-7B in the same table. (VoiceBench is an external benchmark introduced in late 2024 for LLM-based voice assistants)

The model card on Hugging Face provides an additional VoiceBench table (with closely related—but not identical—per-task values) and includes classic ASR WERs where LFM2-Audio matches or improves on Whisper-large-v3-turbo for some datasets despite being a generalist speech–text model. For example (lower is better): AMI 15.36 vs. 16.13 (Whisper-large-v3-turbo), LibriSpeech-clean 2.03 vs. 2.10.

https://huggingface.co/LiquidAI/LFM2-Audio-1.5B

Alright, but why does it really matter in voice AI trends?

Most “omni” stacks couple ASR → LLM → TTS, which adds latency and brittle interfaces. LFM2-Audio’s single-backbone design with continuous input embeddings and discrete output codes reduces glue logic and allows interleaved decoding for early audio emission. For developers, this translates to simpler pipelines and faster perceived response times, while still supporting ASR, TTS, classification, and conversational agents from one model. Liquid AI provides code, demo entry points, and distribution via Hugging Face.


Check out the GitHub Page, Hugging Face Model Card and Technical details. Feel free to check out our GitHub Page for Tutorials, Codes and Notebooks. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well.


Asif Razzaq is the CEO of Marktechpost Media Inc.. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts of over 2 million monthly views, illustrating its popularity among audiences.

🔥[Recommended Read] NVIDIA AI Open-Sources ViPE (Video Pose Engine): A Powerful and Versatile 3D Video Annotation Tool for Spatial AI



Source_link

Related Posts

ClawHub Security Signals: A Coding Guide to End-to-End Security Signal Analysis and Verdict Classification on the AI Skills Dataset
Al, Analytics and Automation

ClawHub Security Signals: A Coding Guide to End-to-End Security Signal Analysis and Verdict Classification on the AI Skills Dataset

June 8, 2026
Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription
Al, Analytics and Automation

Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription

June 8, 2026
Building Reflective Prompt Optimization with GEPA: Multi-Component Prompts, Structured Feedback, and Held-Out Validation
Al, Analytics and Automation

Building Reflective Prompt Optimization with GEPA: Multi-Component Prompts, Structured Feedback, and Held-Out Validation

June 7, 2026
Best 21 Low-Code and No-Code AI Tools in 2026
Al, Analytics and Automation

Best 21 Low-Code and No-Code AI Tools in 2026

June 7, 2026
Tod Machover receives George Peabody Medal for contributions to music and technology | MIT News
Al, Analytics and Automation

Tod Machover receives George Peabody Medal for contributions to music and technology | MIT News

June 6, 2026
Moonshot AI Releases Kimi Code CLI: A Terminal AI Coding Agent Built in TypeScript for Next-Gen Agents
Al, Analytics and Automation

Moonshot AI Releases Kimi Code CLI: A Terminal AI Coding Agent Built in TypeScript for Next-Gen Agents

June 6, 2026
Next Post
bbcon 2025: Engage Gen Z Donors with SMS

bbcon 2025: Engage Gen Z Donors with SMS

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
Comparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025

Comparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025

November 4, 2025

EDITOR'S PICK

Blackmagic’s free camera app can now stream to YouTube and Twitch

Blackmagic’s free camera app can now stream to YouTube and Twitch

November 8, 2025
Best Tool for Advance Network Testing

Best Tool for Advance Network Testing

July 7, 2025
What Is Event Management? Definition, Tips, Best Practices: A Complete Guide

What Is Event Management? Definition, Tips, Best Practices: A Complete Guide

December 15, 2025
Meet ‘kvcached’: A Machine Learning Library to Enable Virtualized, Elastic KV Cache for LLM Serving on Shared GPUs

Meet ‘kvcached’: A Machine Learning Library to Enable Virtualized, Elastic KV Cache for LLM Serving on Shared GPUs

October 28, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • Sharon Srivastava: Leading With Composure Through Presence
  • We don’t know how the Ebola outbreak started. That’s a problem.
  • ClawHub Security Signals: A Coding Guide to End-to-End Security Signal Analysis and Verdict Classification on the AI Skills Dataset
  • Employee Ownership Is Not A Culture Strategy
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions