
DeepAgent: A Deep Reasoning AI Agent that Performs Autonomous Thinking, Tool Discovery, and Action Execution within a Single Reasoning Process

By Josh
November 2, 2025


Most agent frameworks still run a predefined Reason, Act, Observe loop, so the agent can only use the tools that are injected into the prompt. This works for small tasks, but it breaks down when the toolset is large, when the task is long, or when the agent must change strategy in the middle of reasoning. A team from Renmin University of China and Xiaohongshu proposes DeepAgent, an end-to-end deep reasoning agent that keeps thinking, tool discovery, and action execution inside one coherent reasoning process.

https://arxiv.org/pdf/2510.21618

Unified Reasoning With On Demand Tool Discovery

DeepAgent lets the model emit four action types directly in text: internal thought, tool search, tool call, and memory fold. When the agent decides to search, it queries a dense index of tool descriptions drawn from large registries, for example more than 16,000 RapidAPI tools and 3,912 ToolHop tools, and receives only the top-ranked tools back in context. Tool access is therefore dynamic: the model does not depend on a front-loaded tool list, and it stays aligned with real environments where tools change.
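The search step can be illustrated with a minimal sketch of top-k retrieval by cosine similarity over tool descriptions. Everything here is an assumption made for illustration: the four-tool registry, the `search_tools` function, and above all the `embed` function, which is a toy bag-of-words vector standing in for the dense neural encoder a real system would use.

```python
import math
import re
from collections import Counter

# Hypothetical mini registry; real registries hold thousands of descriptions.
TOOLS = {
    "get_weather": "Get the current weather forecast for a city",
    "search_movies": "Search movies by title, genre, or release year",
    "play_track": "Play a music track or song by name",
    "convert_currency": "Convert an amount between two currencies",
}

def embed(text):
    """Toy embedding: word counts (a stand-in for a dense neural encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# The index is built once, before any query arrives.
INDEX = {name: embed(desc) for name, desc in TOOLS.items()}

def search_tools(query, top_k=2):
    """Return the top_k (name, description) pairs most similar to the query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda name: cosine(q, INDEX[name]), reverse=True)
    return [(name, TOOLS[name]) for name in ranked[:top_k]]
```

Calling `search_tools("weather forecast for Paris", top_k=1)` returns only the `get_weather` entry, so only that description would be placed in the agent's context instead of the whole registry.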

Autonomous Memory Folding for Long Horizon Tasks

Long sequences of tool calls, web results, and code responses eventually overflow the context window. DeepAgent handles this with an autonomous memory folding step. When the model emits the fold token, an auxiliary LLM compresses the full history into three memories: Episodic Memory, which records task events; Working Memory, which records the current sub-goal and recent issues; and Tool Memory, which records tool names, arguments, and outcomes. These memories are fed back as structured text, so the agent continues from a compact but information-rich state.
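A rough sketch of what a folded state could look like is given below. The `FoldedMemory` class, the step dictionary layout, and the `summarize` callable (a placeholder for the auxiliary LLM) are all hypothetical, chosen only to make the three-memory structure concrete; the paper's actual schema may differ.

```python
import json
from dataclasses import dataclass, field

@dataclass
class FoldedMemory:
    episodic: list = field(default_factory=list)  # key task events
    working: dict = field(default_factory=dict)   # current sub-goal, recent issues
    tool: list = field(default_factory=list)      # tool name, arguments, outcome

def fold(history, summarize):
    """Compress a step history into three memories.

    `summarize` stands in for the auxiliary LLM (an assumption of this sketch).
    """
    memory = FoldedMemory()
    for step in history:
        if step["type"] == "tool_call":
            memory.tool.append(
                {"name": step["name"], "args": step["args"], "ok": step["ok"]}
            )
            if not step["ok"]:
                issues = memory.working.setdefault("recent_issues", [])
                issues.append(f"{step['name']} failed")
        elif step["type"] == "thought":
            memory.episodic.append(summarize(step["text"]))
        elif step["type"] == "subgoal":
            memory.working["sub_goal"] = step["text"]
    return memory

def render(memory):
    """Serialize the memories as structured text to place back into context."""
    return "\n".join([
        "[Episodic Memory] " + "; ".join(memory.episodic),
        "[Working Memory] " + json.dumps(memory.working, sort_keys=True),
        "[Tool Memory] " + json.dumps(memory.tool, sort_keys=True),
    ])

# Example: a short history folded with a trivial truncating "summarizer".
history = [
    {"type": "subgoal", "text": "Find a sci-fi movie from 2014"},
    {"type": "thought", "text": "I should search the movie database first."},
    {"type": "tool_call", "name": "search_movies",
     "args": {"genre": "sci-fi"}, "ok": False},
]
memory = fold(history, summarize=lambda text: text[:60])
folded_text = render(memory)
```

The point of the design is that `folded_text` replaces the raw history in context, so its size depends on what is worth remembering rather than on how many steps were taken.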

ToolPO, Reinforcement Learning for Tool Use

Supervised traces alone do not teach robust tool use, because the correct tool calls are only a few tokens inside a long generation. The research team introduces Tool Policy Optimization (ToolPO) to address this. ToolPO runs rollouts on LLM-simulated APIs, which keeps training stable and cheap. It then attributes reward to the exact tool call tokens, a step the authors call tool call advantage attribution, and trains with a clipped PPO-style objective. This is how the agent learns not only to call tools, but also to decide when to search and when to fold memory.
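The two ingredients named above, token-level advantage attribution and a clipped PPO-style objective, can be sketched in a few lines. This is not the authors' implementation: the function names are hypothetical, and the way the task-level and tool-call advantages are combined (every token receives the task advantage, tool call tokens receive an extra term) is an assumption of this sketch.

```python
import math

def attribute_advantages(is_tool_call_token, task_adv, tool_adv):
    """Per-token advantages.

    Assumption: all tokens share the task-level advantage, and only tokens
    that belong to a tool call additionally receive the tool-call advantage.
    """
    return [task_adv + (tool_adv if flag else 0.0) for flag in is_tool_call_token]

def clipped_token_loss(logp_new, logp_old, advantage, clip_eps=0.2):
    """Standard clipped PPO surrogate for one token, returned as a loss."""
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return -min(ratio * advantage, clipped * advantage)

def sequence_loss(logp_new, logp_old, is_tool_call_token,
                  task_adv, tool_adv, clip_eps=0.2):
    """Mean clipped loss over a generated sequence."""
    advantages = attribute_advantages(is_tool_call_token, task_adv, tool_adv)
    losses = [
        clipped_token_loss(n, o, a, clip_eps)
        for n, o, a in zip(logp_new, logp_old, advantages)
    ]
    return sum(losses) / len(losses)
```

With a mask such as `[False, True]`, the second token carries a larger advantage than the first, which is how the few tokens that form a tool call get a stronger learning signal than the surrounding reasoning text.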


Benchmarks, Labeled Tools vs Open Set Tools

The research team evaluates on 5 general tool use benchmarks (ToolBench, API-Bank, TMDB, Spotify, ToolHop) and on 4 downstream tasks (ALFWorld, WebShop, GAIA, HLE). In the labeled tool setting, where every method is given the exact tools it needs, DeepAgent 32B RL with a QwQ 32B backbone reports 69.0 on ToolBench, 75.3 on API-Bank, 89.0 on TMDB, 75.4 on Spotify, and 51.3 on ToolHop, which is the strongest 32B-level result across all 5 datasets. Workflow baselines such as ReAct and CodeAct can match DeepAgent on individual datasets; for example, ReAct with strong models scores high on TMDB and Spotify. None of them stays high on all 5, so the fair summary is that DeepAgent is more uniform, not that the others are always low.

In the open-set retrieval setting, which is the more realistic one, the agent must first find tools and then call them. Here DeepAgent 32B RL reaches 64.0 on ToolBench and 40.6 on ToolHop, while the strongest workflow baselines reach 55.0 on ToolBench and 36.2 on ToolHop, so the end-to-end agent still holds the lead. The research team also shows that autonomous tool retrieval lifts workflow agents as well, but DeepAgent gains more, which suggests that the architecture and the training are well matched to large toolsets.
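To put the reported numbers side by side, the short script below computes DeepAgent's open-set lead over the strongest workflow baseline and its drop relative to the labeled tool setting. It uses only the scores quoted in this article; the dictionary and function names are illustrative.

```python
# Scores quoted in this article for DeepAgent 32B RL and workflow baselines.
labeled = {"ToolBench": 69.0, "ToolHop": 51.3}
open_set = {"ToolBench": 64.0, "ToolHop": 40.6}
best_workflow_open_set = {"ToolBench": 55.0, "ToolHop": 36.2}

def summarize_margins():
    """Return the open-set lead and the labeled-to-open-set drop per benchmark."""
    rows = {}
    for name in open_set:
        rows[name] = {
            "lead_over_workflow": round(
                open_set[name] - best_workflow_open_set[name], 1
            ),
            "drop_from_labeled": round(labeled[name] - open_set[name], 1),
        }
    return rows

margins = summarize_margins()
for name, row in margins.items():
    print(name, row)
```

The lead is 9.0 points on ToolBench and 4.4 on ToolHop, while moving from labeled tools to open-set retrieval costs DeepAgent 5.0 points on ToolBench and 10.7 on ToolHop.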


Downstream Environments

On ALFWorld, WebShop, GAIA, and HLE, all run with a 32B reasoning model, DeepAgent reports 91.8 percent success on ALFWorld, 34.4 percent success and a score of 56.3 on WebShop, 53.3 on GAIA, and a higher score than workflow agents on HLE. These tasks are longer and noisier, so the combination of memory folding and ToolPO is the likely source of the gap.

Key Takeaways

  1. DeepAgent keeps the whole agent loop inside one reasoning stream. The model can think, search for tools, call them, and continue, so it is not limited to a fixed ReAct-style workflow.
  2. It uses dense retrieval over large tool registries (more than 16,000 RapidAPI tools and 3,912 ToolHop tools), so tools do not have to be pre-listed in the prompt. They are discovered on demand.
  3. The autonomous memory folding module compresses long interaction histories into episodic, working, and tool memories, which prevents context overflow and keeps long-horizon reasoning stable.
  4. Tool Policy Optimization (ToolPO) trains tool use end to end with simulated APIs and token-level advantage attribution, so the agent learns to issue correct tool calls, not only to reach the final answer.
  5. On 5 tool benchmarks and 4 downstream tasks, DeepAgent at 32B scale is more consistent than workflow baselines in both the labeled tool and open-set settings, especially on ToolBench and ToolHop, where tool discovery matters most.

DeepAgent is a practical step toward agent architectures that do not depend on fixed tool prompts, because it unifies autonomous thinking, dense tool retrieval over more than 16,000 RapidAPI tools and 3,912 ToolHop tools, structured tool calling, and memory folding in one loop. The use of LLM-simulated APIs in ToolPO is an engineering choice, but it addresses the latency and instability that come with training against real APIs. The evaluation shows consistent 32B-level gains in both the labeled tool and open-set settings, not isolated peaks. The work makes large tool spaces more practical for LLM agents. Overall, DeepAgent suggests that end-to-end tool agents with memory and RL are emerging as a default pattern.




Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.
