• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Saturday, April 25, 2026
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Al, Analytics and Automation

SwiReasoning: Entropy-Driven Alternation of Latent and Explicit Chain-of-Thought for Reasoning LLMs

Josh by Josh
October 13, 2025
in Al, Analytics and Automation
0
SwiReasoning: Entropy-Driven Alternation of Latent and Explicit Chain-of-Thought for Reasoning LLMs


SwiReasoning is a decoding-time framework that lets a reasoning LLM decide when to think in latent space and when to write explicit chain-of-thought, using block-wise confidence estimated from entropy trends in next-token distributions. The method is training-free, model-agnostic, and targets Pareto-superior accuracy/efficiency trade-offs on mathematics and STEM benchmarks. Reported results show +1.5%–2.8% average accuracy improvements with unlimited tokens and +56%–79% average token-efficiency gains under constrained budgets; on AIME’24/’25, it reaches maximum reasoning accuracy earlier than standard CoT.

What SwiReasoning changes at inference time?

The controller monitors the decoder’s next-token entropy to form a block-wise confidence signal. When confidence is low (entropy trending upward), it enters latent reasoning—the model continues to reason without emitting tokens. When confidence recovers (entropy trending down), it switches back to explicit reasoning, emitting CoT tokens to consolidate and commit to a single path. A switch count control limits the maximum number of thinking-block transitions to suppress overthinking before finalizing the answer. This dynamic alternation is the core mechanism behind the reported accuracy-per-token gains.

https://arxiv.org/pdf/2510.05069

Results: accuracy and efficiency on standard suites

It reports improvements across mathematics and STEM reasoning tasks:

  • Pass@1 (unlimited budget): accuracy lifts up to +2.8% (math) and +2.0% (STEM) in Figure 1 and Table 1, with a +2.17% average over baselines (CoT with sampling, CoT greedy, and Soft Thinking).
  • Token efficiency (limited budgets): average improvements up to +79% (Figure 2). A comprehensive comparison shows SwiReasoning attains the highest token efficiency in 13/15 evaluations, with an +84% average improvement over CoT across those settings (Figure 4).
  • Pass@k dynamics: with Qwen3-8B on AIME 2024/2025, maximum reasoning accuracies are achieved +50% earlier than CoT on average (Figure 5), indicating faster convergence to the ceiling with fewer sampled trajectories.

Why switching helps?

Explicit CoT is discrete and readable but locks in a single path prematurely, which can discard useful alternatives. Latent reasoning is continuous and information-dense per step, but purely latent strategies may diffuse probability mass and impede convergence. SwiReasoning adds a confidence-guided alternation: latent phases broaden exploration when the model is uncertain; explicit phases exploit rising confidence to solidify a solution and commit tokens only when beneficial. The switch count control regularizes the process by capping oscillations and limiting prolonged “silent” wandering—addressing both accuracy loss from diffusion and token waste from overthinking cited as challenges for training-free latent methods.

Positioning vs. baselines

The project compares against CoT with sampling, CoT greedy, and Soft Thinking, reporting a +2.17% average accuracy lift at unlimited budgets (Table 1) and consistent efficiency-per-token advantages under budget constraints. The visualized Pareto frontier shifts outward—either higher accuracy at the same budget or similar accuracy with fewer tokens—across different model families and scales. On AIME’24/’25, the Pass@k curves show that SwiReasoning reaches the performance ceiling with fewer samples than CoT, reflecting improved convergence behavior rather than only better raw ceilings.

https://arxiv.org/pdf/2510.05069
https://arxiv.org/pdf/2510.05069

Key Takeaways

  • Training-free controller: SwiReasoning alternates between latent reasoning and explicit chain-of-thought using block-wise confidence from next-token entropy trends.
  • Efficiency gains: Reports +56–79% average token-efficiency improvements under constrained budgets versus CoT, with larger gains as budgets tighten.
  • Accuracy lifts: Achieves +1.5–2.8% average Pass@1 improvements on mathematics/STEM benchmarks at unlimited budgets.
  • Faster convergence: On AIME 2024/2025, reaches maximum reasoning accuracy earlier than CoT (improved Pass@k dynamics).

SwiReasoning is a useful step toward pragmatic “reasoning policy” control at decode time: it’s training-free, slots behind the tokenizer, and exposes measurable gains on math/STEM suites by toggling between latent and explicit CoT using an entropy-trend confidence signal with a capped switch count. The open-source BSD implementation and clear flags (--max_switch_count, --alpha) make replication straightforward and lower the barrier to stacking with orthogonal efficiency layers (e.g., quantization, speculative decoding, KV-cache tricks). The method’s value proposition is “accuracy per token” rather than raw SOTA accuracy, which is operationally important for budgeted inference and batching.


Check out the Paper and Project Page. Feel free to check out our GitHub Page for Tutorials, Codes and Notebooks. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well.


Asif Razzaq is the CEO of Marktechpost Media Inc.. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts of over 2 million monthly views, illustrating its popularity among audiences.

🙌 Follow MARKTECHPOST: Add us as a preferred source on Google.



Source_link

READ ALSO

Google DeepMind Introduces Vision Banana: An Instruction-Tuned Image Generator That Beats SAM 3 on Segmentation and Depth Anything V3 on Metric Depth Estimation

MIT scientists build the world’s largest collection of Olympiad-level math problems, and open it to everyone | MIT News

Related Posts

Google DeepMind Introduces Vision Banana: An Instruction-Tuned Image Generator That Beats SAM 3 on Segmentation and Depth Anything V3 on Metric Depth Estimation
Al, Analytics and Automation

Google DeepMind Introduces Vision Banana: An Instruction-Tuned Image Generator That Beats SAM 3 on Segmentation and Depth Anything V3 on Metric Depth Estimation

April 25, 2026
MIT scientists build the world’s largest collection of Olympiad-level math problems, and open it to everyone | MIT News
Al, Analytics and Automation

MIT scientists build the world’s largest collection of Olympiad-level math problems, and open it to everyone | MIT News

April 24, 2026
Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates
Al, Analytics and Automation

Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Training Architecture Achieving 88% Goodput Under High Hardware Failure Rates

April 24, 2026
Mend Releases AI Security Governance Framework: Covering Asset Inventory, Risk Tiering, AI Supply Chain Security, and Maturity Model
Al, Analytics and Automation

Mend Releases AI Security Governance Framework: Covering Asset Inventory, Risk Tiering, AI Supply Chain Security, and Maturity Model

April 24, 2026
“Your Next Coworker May Not Be Human” as Google Bets Everything on AI Agents to Power the Office
Al, Analytics and Automation

“Your Next Coworker May Not Be Human” as Google Bets Everything on AI Agents to Power the Office

April 23, 2026
Google Cloud AI Research Introduces ReasoningBank: A Memory Framework that Distills Reasoning Strategies from Agent Successes and Failures
Al, Analytics and Automation

Google Cloud AI Research Introduces ReasoningBank: A Memory Framework that Distills Reasoning Strategies from Agent Successes and Failures

April 23, 2026
Next Post

What Happens If I Stop Posting on Pinterest?

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
Comparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025

Comparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025

November 4, 2025

EDITOR'S PICK

5 Best Process Mining Software for 2026 I Evaluated

5 Best Process Mining Software for 2026 I Evaluated

January 10, 2026
Enhance Customer Retention and Revenue in Your QSR App with AppsPrize May 2025 (Updated)

Enhance Customer Retention and Revenue in Your QSR App with AppsPrize May 2025 (Updated)

May 31, 2025
The team behind continuous batching says your idle GPUs should be running inference, not sitting dark

The team behind continuous batching says your idle GPUs should be running inference, not sitting dark

March 12, 2026
Grow a Garden Corpse Flower Wiki

Grow a Garden Corpse Flower Wiki

October 4, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • THE ACCOUNTING & FINANCE SOFTWARE AI VISIBILITY INDEX 2026
  • CVSS scored these two Palo Alto CVEs as manageable. Chained, they gave attackers root access to 13,000 devices.
  • Which Is the Best Business Process Simulation Tool for a Mid-Size Manufacturing Company?
  • 7 highlights and announcements from Google Cloud Next ‘26
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions