Saturday, March 14, 2026
mGrowTech
Ai2's new Olmo 3.1 extends reinforcement learning training for stronger reasoning benchmarks

By Josh
December 13, 2025
in Technology And Software


The Allen Institute for AI (Ai2) recently released what it calls its most powerful family of models yet, Olmo 3. But the company kept iterating on the models, expanding its reinforcement learning (RL) runs, to create Olmo 3.1.

The new Olmo 3.1 models focus on efficiency, transparency, and control for enterprises. 

Ai2 updated two of the three versions of Olmo 3: Olmo 3.1 Think 32B, the flagship model optimized for advanced research, and Olmo 3.1 Instruct 32B, designed for instruction-following, multi-turn dialogue, and tool use.

Olmo 3 has a third version, Olmo 3-Base, for programming, comprehension, and math. It also works well for continued fine-tuning.

Ai2 said that to upgrade Olmo 3 Think 32B to Olmo 3.1, its researchers extended its best RL run with a longer training schedule. 

“After the original Olmo 3 launch, we resumed our RL training run for Olmo 3 32B Think, training for an additional 21 days on 224 GPUs with extra epochs over our Dolci-Think-RL dataset,” Ai2 said in a blog post. “This yielded Olmo 3.1 32B Think, which brings substantial gains across math, reasoning, and instruction-following benchmarks: improvements of 5+ points on AIME, 4+ points on ZebraLogic, 4+ points on IFEval, and 20+ points on IFBench, alongside stronger performance on coding and complex multi-step tasks.”
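For a rough sense of scale, the two figures in that quote (21 additional days, 224 GPUs) can be turned into GPU-hours. This is only a back-of-the-envelope sketch: it assumes every GPU ran for the full 21 days, which Ai2 does not state, and the function and constant names are illustrative.

```python
# Back-of-the-envelope compute for the extended RL run, using only the
# two figures quoted above: 21 additional days on 224 GPUs.
DAYS = 21
GPUS = 224
HOURS_PER_DAY = 24


def gpu_hours(days: int, gpus: int) -> int:
    """Upper-bound GPU-hours, assuming every GPU runs for the whole period."""
    return days * HOURS_PER_DAY * gpus


total = gpu_hours(DAYS, GPUS)
print(f"{total:,} GPU-hours")  # 112,896 GPU-hours
```

Under that assumption, the extension comes to roughly 113,000 GPU-hours on top of the original Olmo 3 training.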

To get to Olmo 3.1 Instruct, Ai2 said its researchers applied the recipe behind the smaller Instruct size, 7B, to the larger model.

Olmo 3.1 Instruct 32B is “optimized for chat, tool use, & multi-turn dialogue—making it a much more performant sibling of Olmo 3 Instruct 7B and ready for real-world applications,” Ai2 said in a post on X.

For now, the new checkpoints are available on the Ai2 Playground or Hugging Face, with API access coming soon. 

Better performance on benchmarks

The Olmo 3.1 models performed well on benchmark tests, predictably beating the Olmo 3 models. 

Olmo 3.1 Think outperformed Qwen 3 32B models on the AIME 2025 benchmark and performed close to Gemma 27B.

Olmo 3.1 Instruct performed strongly against its open-source peers, even beating models like Gemma 3 on the Math benchmark.

“As for Olmo 3.1 32B Instruct, it’s a larger-scale instruction-tuned model built for chat, tool use, and multi-turn dialogue. Olmo 3.1 32B Instruct is our most capable fully open chat model to date and — in our evaluations — the strongest fully open 32B-scale instruct model,” the company said. 

Ai2 also upgraded its RL-Zero 7B models for math and coding. The company said on X that both models benefited from longer and more stable training runs.

Commitment to transparency and open source 

Ai2 previously told VentureBeat that it designed the Olmo 3 family of models to offer enterprises and research labs more control and understanding of the data and training that went into the model. 

Organizations could add to the model’s data mix and retrain it to also learn from what’s been added.  

This has long been a commitment for Ai2, which also offers a tool called OlmoTrace that traces an LLM's outputs back to matching text in its training data.

“Together, Olmo 3.1 Think 32B and Olmo 3.1 Instruct 32B show that openness and performance can advance together. By extending the same model flow, we continue to improve capabilities while retaining end-to-end transparency over data, code, and training decisions,” Ai2 said. 


