Thursday, March 5, 2026

YuanLab AI Releases Yuan 3.0 Ultra: A Flagship Multimodal MoE Foundation Model, Built for Stronger Intelligence and Unrivaled Efficiency

by Josh
March 5, 2026


How can a trillion-parameter large language model achieve state-of-the-art enterprise performance while cutting its total parameter count by 33.3% and boosting pre-training efficiency by 49%? YuanLab AI has released Yuan3.0 Ultra, an open-source Mixture-of-Experts (MoE) large language model with 1T total parameters and 68.8B activated parameters. The architecture is designed to optimize performance on enterprise-specific tasks while maintaining competitive general-purpose capabilities. Unlike traditional dense models, Yuan3.0 Ultra uses sparsity to scale capacity without a linear increase in computational cost.

Layer-Adaptive Expert Pruning (LAEP)

The primary innovation in Yuan3.0 Ultra’s training is the Layer-Adaptive Expert Pruning (LAEP) algorithm. While expert pruning is typically applied post-training, LAEP identifies and removes underutilized experts directly during the pre-training stage.

Research into expert load distribution revealed two distinct phases during pre-training:

  1. Initial Transition Phase: Characterized by high volatility in expert loads inherited from random initialization.
  2. Stable Phase: Expert loads converge, and the relative ranking of experts based on token assignment remains largely fixed.

Once the stable phase is reached, LAEP applies pruning based on two constraints:

  • Individual Load Constraint (α): Targets experts whose token load is significantly lower than the layer average.
  • Cumulative Load Constraint (β): Identifies the subset of experts contributing the least to total token processing.

By applying LAEP with β=0.1 and varying α, the model was pruned from an initial 1.5T parameters down to 1T parameters. This 33.3% reduction in total parameters preserved the model’s multi-domain performance while significantly lowering memory requirements for deployment. In the 1T configuration, the number of experts per layer was reduced from 64 to at most 48.
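The two constraints can be sketched for a single layer as follows. This is a minimal illustration, not the paper's implementation: the function name `laep_prune_layer`, the exact threshold forms (`alpha * mean_load` and `beta * total_load`), and the choice to prune only experts that satisfy both constraints are assumptions made for this example.

```python
def laep_prune_layer(loads, alpha, beta):
    """Return the indices of experts to keep in one MoE layer.

    loads: token counts per expert, measured in the stable phase.
    alpha: individual load constraint (fraction of the layer's mean load).
    beta:  cumulative load constraint (fraction of the layer's total load).
    The way the two constraints are combined here is an assumption.
    """
    total_load = sum(loads)
    mean_load = total_load / len(loads)

    # Cumulative constraint: walk experts from least to most loaded and
    # collect them while their combined load stays within beta * total.
    low_load_set = set()
    running = 0.0
    for i in sorted(range(len(loads)), key=lambda i: loads[i]):
        if running + loads[i] > beta * total_load:
            break
        running += loads[i]
        low_load_set.add(i)

    # Individual constraint: within that set, prune only experts whose
    # own load is below alpha times the layer average.
    pruned = {i for i in low_load_set if loads[i] < alpha * mean_load}
    return [i for i in range(len(loads)) if i not in pruned]


# Toy layer with 8 experts (made-up loads, not data from the paper).
toy_loads = [100, 5, 90, 3, 110, 2, 95, 95]
print(laep_prune_layer(toy_loads, alpha=0.5, beta=0.1))
```

Under these assumptions, a smaller α prunes fewer experts in a layer, which is one way the per-layer expert count could end up varying.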

https://github.com/Yuan-lab-LLM/Yuan3.0-Ultra/blob/main/Docs/Yuan3.0_Ultra%20Paper.pdf

Hardware Efficiency and Expert Rearrangement

MoE models often suffer from device-level load imbalance when experts are distributed across a computing cluster. To address this, Yuan3.0 Ultra implements an Expert Rearranging algorithm.

This algorithm ranks experts by token load and uses a greedy strategy to distribute them across GPUs so that the cumulative token variance is minimized.
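A greedy placement of this kind can be sketched as below. The details are assumptions for illustration: the function name `rearrange_experts` is hypothetical, every GPU is assumed to host the same number of experts, and balancing is done by always giving the next-heaviest expert to the least-loaded GPU that still has a free slot, which may differ from the paper's exact procedure.

```python
def rearrange_experts(loads, num_gpus):
    """Greedily assign experts to GPUs to balance token load.

    loads: token counts per expert in one layer.
    Returns (placement, gpu_load), where placement[g] lists the expert
    indices hosted on GPU g and gpu_load[g] is their combined load.
    """
    if len(loads) % num_gpus != 0:
        raise ValueError("expert count must be divisible by num_gpus")
    capacity = len(loads) // num_gpus  # assumed: equal experts per GPU

    placement = [[] for _ in range(num_gpus)]
    gpu_load = [0] * num_gpus

    # Rank experts from heaviest to lightest token load.
    ranked = sorted(range(len(loads)), key=lambda i: loads[i], reverse=True)
    for expert in ranked:
        # Among GPUs with a free slot, pick the one with the least load so far.
        open_gpus = [g for g in range(num_gpus) if len(placement[g]) < capacity]
        target = min(open_gpus, key=lambda g: gpu_load[g])
        placement[target].append(expert)
        gpu_load[target] += loads[expert]
    return placement, gpu_load


# Toy example with 8 experts on 2 GPUs (made-up loads).
placement, gpu_load = rearrange_experts([90, 10, 80, 20, 70, 30, 60, 40], 2)
print(placement, gpu_load)
```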

| Method | TFLOPS per GPU |
|---|---|
| Base Model (1515B) | 62.14 |
| DeepSeek-V3 Aux Loss | 80.82 |
| Yuan3.0 Ultra (LAEP) | 92.60 |

Total pre-training efficiency improved by 49%. This improvement is attributed to two factors:

  • Model Pruning: Contributed 32.4% to the efficiency gain.
  • Expert Rearrangement: Contributed 15.9% to the efficiency gain.
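As a quick check, the 49% figure matches the throughput numbers reported for the base model and for Yuan3.0 Ultra, and the 33.3% figure matches the 1.5T to 1T parameter reduction. The short calculation below uses only numbers quoted in this article; the variable names are illustrative.

```python
# Reported per-GPU throughput for the base model and Yuan3.0 Ultra (LAEP).
base_tflops = 62.14
laep_tflops = 92.60

# Relative improvement in per-GPU throughput.
efficiency_gain = laep_tflops / base_tflops - 1
print(f"pre-training efficiency gain: {efficiency_gain:.1%}")

# Relative reduction in total parameters (1.5T pruned to 1T).
param_reduction = (1.5 - 1.0) / 1.5
print(f"parameter reduction: {param_reduction:.1%}")

# Sum of the two listed contributions, for comparison with the total.
print(f"sum of listed contributions: {32.4 + 15.9:.1f}%")
```

The two listed contributions sum to slightly less than the total gain; the article does not state how they combine.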

Mitigating Overthinking with Revised RIRM

In the reinforcement learning (RL) stage, the model employs a refined Reflection Inhibition Reward Mechanism (RIRM) to prevent excessively long reasoning chains for simple tasks.

The reward for reflection, $R_{ver}$, is calculated using a threshold-based penalty system:

  • $r_{min}=0$: The ideal number of reflection steps for direct responses.
  • $r_{max}=3$: The maximum tolerable reflection threshold.

For correct samples, the reward decreases as reflection steps approach $r_{max}$, while incorrect samples that ‘overthink’ (exceeding $r_{max}$) receive maximum penalties. This mechanism resulted in a 16.33% gain in training accuracy and a 14.38% reduction in output token length.
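One way to picture a threshold-based penalty of this kind is the sketch below. The article does not give the formula for $R_{ver}$, so the linear shape, the reward range of -1 to 1, and the function name `reflection_reward` are all assumptions made for illustration; only the thresholds 0 and 3 come from the text.

```python
def reflection_reward(is_correct, reflections, r_min=0, r_max=3):
    """Hypothetical reflection reward with a linear penalty (assumed shape).

    is_correct:  whether the final answer was correct.
    reflections: number of reflection steps in the response.
    """
    # Fraction of the tolerated reflection budget used, clipped to [0, 1].
    over = min(max(reflections - r_min, 0), r_max - r_min)
    used = over / (r_max - r_min)

    if is_correct:
        # Full reward at r_min, shrinking to zero at or beyond r_max.
        return 1.0 - used
    # Incorrect answers: penalty grows with reflection and reaches the
    # maximum (-1.0 in this sketch) at or beyond r_max.
    return -used


for steps in range(5):
    print(steps, reflection_reward(True, steps), reflection_reward(False, steps))
```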

https://github.com/Yuan-lab-LLM/Yuan3.0-Ultra/blob/main/Docs/Yuan3.0_Ultra%20Paper.pdf

Enterprise Benchmark Performance

Yuan3.0 Ultra was evaluated against several industry models, including GPT-5.2 and Gemini 3.1 Pro, across specialized enterprise benchmarks.

| Benchmark | Task Category | Yuan3.0 Ultra Score | Leading Competitor Score |
|---|---|---|---|
| Docmatix | Multimodal RAG | 67.4% | 48.4% (GPT-5.2) |
| ChatRAG | Text Retrieval (Avg) | 68.2% | 53.6% (Kimi K2.5) |
| MMTab | Table Reasoning | 62.3% | 66.2% (Kimi K2.5) |
| SummEval | Text Summarization | 62.8% | 49.9% (Claude Opus 4.6) |
| Spider 1.0 | Text-to-SQL | 83.9% | 82.7% (Kimi K2.5) |
| BFCL V3 | Tool Invocation | 67.8% | 78.8% (Gemini 3.1 Pro) |

The results indicate that Yuan3.0 Ultra leads the listed competitors in multimodal retrieval (Docmatix), text retrieval (ChatRAG), summarization (SummEval), and Text-to-SQL (Spider 1.0), while trailing the leading competitor on table reasoning (MMTab) and tool invocation (BFCL V3).

