• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Friday, March 13, 2026
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Al, Analytics and Automation

xAI launches Grok-4-Fast: Unified Reasoning and Non-Reasoning Model with 2M-Token Context and Trained End-to-End with Tool-Use Reinforcement Learning (RL)

Josh by Josh
September 20, 2025
in Al, Analytics and Automation
0
xAI launches Grok-4-Fast: Unified Reasoning and Non-Reasoning Model with 2M-Token Context and Trained End-to-End with Tool-Use Reinforcement Learning (RL)


xAI introduced Grok-4-Fast, a cost-optimized successor to Grok-4 that merges “reasoning” and “non-reasoning” behaviors into a single set of weights controllable via system prompts. The model targets high-throughput search, coding, and Q&A with a 2M-token context window and native tool-use RL that decides when to browse the web, execute code, or call tools.

Architecture note

Previous Grok releases split long-chain “reasoning” and short “non-reasoning” responses across separate models. Grok-4-Fast’s unified weight space reduces end-to-end latency and tokens by steering behavior via system prompts, which is relevant for real-time applications (search, assistive agents, and interactive coding) where switching models penalizes both latency and cost.

Search and agentic use

Grok-4-Fast was trained end-to-end with tool-use reinforcement learning and shows gains on search-centric agent benchmarks: BrowseComp 44.9%, SimpleQA 95.0%, Reka Research 66.0%, plus higher scores on Chinese variants (e.g., BrowseComp-zh 51.2%). xAI also cites private battle-testing on LMArena where grok-4-fast-search (codename “menlo”) ranks #1 in the Search Arena with 1163 Elo, and the text variant (codename “tahoe”) sits at #8 in the Text Arena, roughly on par with grok-4-0709.

Performance and efficiency deltas

On internal and public benchmarks, Grok-4-Fast posts frontier-class scores while cutting token usage. xAI reports pass@1 results of 92.0% (AIME 2025, no tools), 93.3% (HMMT 2025, no tools), 85.7% (GPQA Diamond), and 80.0% (LiveCodeBench Jan–May), approaching or matching Grok-4 but using ~40% fewer “thinking” tokens on average. The company frames this as “intelligence density,” claiming a ~98% reduction in price to reach the same benchmark performance as Grok-4 when the lower token count and new per-token pricing are combined.

Deployment and price

The model is generally available to all users in Grok’s Fast and Auto modes across web and mobile; Auto will select Grok-4-Fast for difficult queries to improve latency without losing quality, and—for the first time—free users access xAI’s latest model tier. For developers, xAI exposes two SKUs—grok-4-fast-reasoning and grok-4-fast-non-reasoning—both with 2M context. Pricing (xAI API) is $0.20 / 1M input tokens (<128k), $0.40 / 1M input tokens (≥128k), $0.50 / 1M output tokens (<128k), $1.00 / 1M output tokens (≥128k), and $0.05 / 1M cached input tokens.

https://x.ai/news/grok-4-fast

5 Technical Takeaways:

  • Unified model + 2M context. Grok-4-Fast uses a single weight space for “reasoning” and “non-reasoning,” prompt-steered, with a 2,000,000-token window across both SKUs.
  • Pricing for scale. API pricing starts at $0.20/M input, $0.50/M output, with cached input at $0.05/M and higher rates only beyond 128K context.
  • Efficiency claims. xAI reports ~40% fewer “thinking” tokens at comparable accuracy vs Grok-4, yielding a ~98% lower price to match Grok-4 performance on frontier benchmarks.
  • Benchmark profile. Reported pass@1: AIME-2025 92.0%, HMMT-2025 93.3%, GPQA-Diamond 85.7%, LiveCodeBench (Jan–May) 80.0%.
  • Agentic/search use. Post-training with tool-use RL; positioned for browsing/search workflows with documented search-agent metrics and live-search billing in docs.

Summary

Grok-4-Fast packages Grok-4-level capability into a single, prompt-steerable model with a 2M-token window, tool-use RL, and pricing tuned for high-throughput search and agent workloads. Early public signals (LMArena #1 in Search, competitive Text placement) align with xAI’s claim of similar accuracy using ~40% fewer “thinking” tokens, translating to lower latency and unit cost in production.


Check out the Technical details. Feel free to check out our GitHub Page for Tutorials, Codes and Notebooks. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter.


Asif Razzaq is the CEO of Marktechpost Media Inc.. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts of over 2 million monthly views, illustrating its popularity among audiences.

🔥[Recommended Read] NVIDIA AI Open-Sources ViPE (Video Pose Engine): A Powerful and Versatile 3D Video Annotation Tool for Spatial AI



Source_link

READ ALSO

Meta Unveils Four New Chips to Power Its AI and Recommendation Systems

New MIT class uses anthropology to improve chatbots | MIT News

Related Posts

Meta Unveils Four New Chips to Power Its AI and Recommendation Systems
Al, Analytics and Automation

Meta Unveils Four New Chips to Power Its AI and Recommendation Systems

March 12, 2026
New MIT class uses anthropology to improve chatbots | MIT News
Al, Analytics and Automation

New MIT class uses anthropology to improve chatbots | MIT News

March 12, 2026
How to Design a Streaming Decision Agent with Partial Reasoning, Online Replanning, and Reactive Mid-Execution Adaptation in Dynamic Environments
Al, Analytics and Automation

How to Design a Streaming Decision Agent with Partial Reasoning, Online Replanning, and Reactive Mid-Execution Adaptation in Dynamic Environments

March 12, 2026
3 Questions: On the future of AI and the mathematical and physical sciences | MIT News
Al, Analytics and Automation

3 Questions: On the future of AI and the mathematical and physical sciences | MIT News

March 12, 2026
NVIDIA Releases Nemotron 3 Super: A 120B Parameter Open-Source Hybrid Mamba-Attention MoE Model Delivering 5x Higher Throughput for Agentic AI
Al, Analytics and Automation

NVIDIA Releases Nemotron 3 Super: A 120B Parameter Open-Source Hybrid Mamba-Attention MoE Model Delivering 5x Higher Throughput for Agentic AI

March 11, 2026
A better method for planning complex visual tasks | MIT News
Al, Analytics and Automation

A better method for planning complex visual tasks | MIT News

March 11, 2026
Next Post
The practical 2025 guide for social marketers

The practical 2025 guide for social marketers

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
Google announced the next step in its nuclear energy plans 

Google announced the next step in its nuclear energy plans 

August 20, 2025

EDITOR'S PICK

P2P Lending Software Development: Process, Costs & Features

P2P Lending Software Development: Process, Costs & Features

September 24, 2025
Eric McNeil: The Connector Turning Ambition Into Scalable Empires

Eric McNeil: The Connector Turning Ambition Into Scalable Empires

February 2, 2026
New control system teaches soft robots the art of staying safe | MIT News

New control system teaches soft robots the art of staying safe | MIT News

December 3, 2025
The New Corporate Communications Playbook

The New Corporate Communications Playbook

December 27, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • New AI features in Google Maps
  • The 11 Best Social Media Analytics Tools in 2026 — For Creators and Marketers
  • Meta Announces Location Fees (Plus 4 Updates)
  • Build In-House vs Hire Development Agency Guide 2026
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions