• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Friday, January 23, 2026
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Al, Analytics and Automation

Liquid AI’s LFM2-2.6B-Exp Uses Pure Reinforcement Learning RL And Dynamic Hybrid Reasoning To Tighten Small Model Behavior

Josh by Josh
December 28, 2025
in Al, Analytics and Automation
0
0
SHARES
0
VIEWS
Share on FacebookShare on Twitter


Liquid AI has introduced LFM2-2.6B-Exp, an experimental checkpoint of its LFM2-2.6B language model that is trained with pure reinforcement learning on top of the existing LFM2 stack. The goal is simple, improve instruction following, knowledge tasks, and math for a small 3B class model that still targets on device and edge deployment.

Where LFM2-2.6B-Exp Fits in the LFM2 Family?

LFM2 is the second generation of Liquid Foundation Models. It is designed for efficient deployment on phones, laptops, and other edge devices. Liquid AI describes LFM2 as a hybrid model that combines short range LIV convolution blocks with grouped query attention blocks, controlled by multiplicative gates.

READ ALSO

Quality Data Annotation for Cardiovascular AI

A Missed Forecast, Frayed Nerves and a Long Trip Back

The family includes 4 dense sizes, LFM2-350M, LFM2-700M, LFM2-1.2B, and LFM2-2.6B. All share a context length of 32,768 tokens, a vocabulary size of 65,536, and bfloat16 precision. The 2.6B model uses 30 layers, with 22 convolution layers and 8 attention layers. Each size is trained on a 10 trillion token budget.

LFM2-2.6B is already positioned as a high efficiency model. It reaches 82.41 percent on GSM8K and 79.56 percent on IFEval. This places it ahead of several 3B class models such as Llama 3.2 3B Instruct, Gemma 3 4B it, and SmolLM3 3B on these benchmarks.

LFM2-2.6B-Exp keeps this architecture. It reuses the same tokenization, context window, and hardware profile. The checkpoint focuses only on changing behavior through a reinforcement learning stage.

https://huggingface.co/LiquidAI/LFM2-2.6B-Exp

Pure RL on Top of a Pretrained, Aligned Base

This checkpoint is built on LFM2-2.6B using pure reinforcement learning. It is specifically trained on instruction following, knowledge, and math.

The underlying LFM2 training stack combines several stages. It includes very large scale supervised fine tuning on a mix of downstream tasks and general domains, custom Direct Preference Optimization with length normalization, iterative model merging, and reinforcement learning with verifiable rewards.

But exactly ‘pure reinforcement learning’ means? LFM2-2.6B-Exp starts from the existing LFM2-2.6B checkpoint and then goes through a sequential RL training schedule. It begin with instruction following, then extend RL training to knowledge oriented prompts, math, and a small amount of tool use, without an additional SFT warm up or distillation step in that final phase.

The important point is that LFM2-2.6B-Exp does not change the base architecture or pre training. It changes the policy through an RL stage that uses verifiable rewards, on a targeted set of domains, on top of a model that is already supervised and preference aligned.

Benchmark Signal, Especially On IFBench

Liquid AI team highlights IFBench as the main headline metric. IFBench is an instruction following benchmark that checks how reliably a model follows complex, constrained instructions. On this benchmark, LFM2-2.6B-Exp surpasses DeepSeek R1-0528, which is reported as 263 times larger in parameter count.

LFM2 models provide strong performance across a standard set of benchmarks such as MMLU, GPQA, IFEval, GSM8K, and related suites. The 2.6B base model already competes well in the 3B segment. The RL checkpoint then pushes instruction following and math further, while staying in the same 3B parameter budget.

Architecture and Capabilities that Matters

The architecture uses 10 double gated short range LIV convolution blocks and 6 grouped query attention blocks, arranged in a hybrid stack. This design reduces KV cache cost and keeps inference fast on consumer GPUs and NPUs.

The pre training mixture uses roughly 75 percent English, 20 percent multilingual data, and 5 percent code. The supported languages include English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

LFM2 models expose a ChatML like template and native tool use tokens. Tools are described as JSON between dedicated tool list markers. The model then emits Python like calls between tool call markers and reads tool responses between tool response markers. This structure makes the model suitable as the agent core for tool calling stacks without custom prompt engineering.

LFM2-2.6B, and by extension LFM2-2.6B-Exp, is also the only model in the family that enables dynamic hybrid reasoning through special think tokens for complex or multilingual inputs. That capability remains available because the RL checkpoint does not change tokenization or architecture.

Key Takeaways

  1. LFM2-2.6B-Exp is an experimental checkpoint of LFM2-2.6B that adds a pure reinforcement learning stage on top of a pretrained, supervised and preference aligned base, targeted at instruction following, knowledge tasks, and math.
  2. The LFM2-2.6B backbone uses a hybrid architecture that combines double gated short range LIV convolution blocks and grouped query attention blocks, with 30 layers, 22 convolution layers and 8 attention layers, 32,768 token context length, and a 10 trillion token training budget at 2.6B parameters.
  3. LFM2-2.6B already achieves strong benchmark scores in the 3B class, around 82.41 percent on GSM8K and 79.56 percent on IFEval, and the LFM2-2.6B-Exp RL checkpoint further improves instruction following and math performance without changing the architecture or memory profile.
  4. Liquid AI reports that on IFBench, an instruction following benchmark, LFM2-2.6B-Exp surpasses DeepSeek R1-0528 even though the latter has many more parameters, which shows a strong performance per parameter for constrained deployment settings.
  5. LFM2-2.6B-Exp is released on Hugging Face with open weights under the LFM Open License v1.0 and is supported through Transformers, vLLM, llama.cpp GGUF quantizations, and ONNXRuntime, making it suitable for agentic systems, structured data extraction, retrieval augmented generation, and on device assistants where a compact 3B model is required.

Check out the Model here. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well.


Max is an AI analyst at MarkTechPost, based in Silicon Valley, who actively shapes the future of technology. He teaches robotics at Brainvyne, combats spam with ComplyEmail, and leverages AI daily to translate complex tech advancements into clear, understandable insights



Source_link

Related Posts

Quality Data Annotation for Cardiovascular AI
Al, Analytics and Automation

Quality Data Annotation for Cardiovascular AI

January 23, 2026
A Missed Forecast, Frayed Nerves and a Long Trip Back
Al, Analytics and Automation

A Missed Forecast, Frayed Nerves and a Long Trip Back

January 23, 2026
Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass
Al, Analytics and Automation

Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass

January 23, 2026
Slow Down the Machines? Wall Street and Silicon Valley at Odds Over A.I.’s Nearest Future
Al, Analytics and Automation

Slow Down the Machines? Wall Street and Silicon Valley at Odds Over A.I.’s Nearest Future

January 22, 2026
Inworld AI Releases TTS-1.5 For Realtime, Production Grade Voice Agents
Al, Analytics and Automation

Inworld AI Releases TTS-1.5 For Realtime, Production Grade Voice Agents

January 22, 2026
FlashLabs Researchers Release Chroma 1.0: A 4B Real Time Speech Dialogue Model With Personalized Voice Cloning
Al, Analytics and Automation

FlashLabs Researchers Release Chroma 1.0: A 4B Real Time Speech Dialogue Model With Personalized Voice Cloning

January 22, 2026
Next Post
Craft Food Roblox Blueberry Churros Recipe

Craft Food Roblox Blueberry Churros Recipe

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
Google announced the next step in its nuclear energy plans 

Google announced the next step in its nuclear energy plans 

August 20, 2025

EDITOR'S PICK

Robot butlers look more like Roombas than Rosey from the Jetsons

Robot butlers look more like Roombas than Rosey from the Jetsons

January 23, 2026
New Google ‘G’ logo comes to web, widget with Dynamic Color

New Google ‘G’ logo comes to web, widget with Dynamic Color

May 28, 2025
New space startup Lux Aeterna wants to make satellites reusable

New space startup Lux Aeterna wants to make satellites reusable

June 25, 2025
New Branding & Packaging for Cocolab by Wedge — BP&O

New Branding & Packaging for Cocolab by Wedge — BP&O

January 6, 2026

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • FleishmanHillard senior partner on the new rules of crisis spokespersonship
  • The Smile Scroll: How to Market Dental Solutions in a Filtered World
  • Everything in voice AI just changed: how enterprise AI builders can benefit
  • Quality Data Annotation for Cardiovascular AI
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions

Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?