mGrowTech

DeepAgent: A Deep Reasoning AI Agent that Performs Autonomous Thinking, Tool Discovery, and Action Execution within a Single Reasoning Process

by Josh
November 2, 2025
in AI, Analytics and Automation

Most agent frameworks still run a predefined Reason, Act, Observe loop, so the agent can use only the tools injected into its prompt. This works for small tasks, but it breaks down when the toolset is large, when the task is long, or when the agent must change strategy in the middle of reasoning. A team from Renmin University of China and Xiaohongshu proposes DeepAgent, an end-to-end deep reasoning agent that keeps thinking, tool discovery, and action execution inside one coherent reasoning process.

https://arxiv.org/pdf/2510.21618

Unified Reasoning With On Demand Tool Discovery

DeepAgent lets the model emit four action types directly in text: internal thought, tool search, tool call, and memory fold. When the agent decides to search, it queries a dense index of tool descriptions drawn from large registries, for example more than 16,000 RapidAPI tools and 3,912 ToolHop tools, and it receives only the top-ranked tools back in context. Tool access therefore becomes dynamic: the model does not depend on a front-loaded tool list, and it stays aligned with real environments where tools change.
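
To make the retrieval step concrete, here is a minimal sketch of top-k tool search over a description index. It is illustrative only: the tool names and descriptions are hypothetical, and the bag-of-words `embed` function is a stand-in for whatever dense encoder a real system would use. It is not DeepAgent's retriever.

```python
import math
from collections import Counter

# Hypothetical mini registry; real registries hold thousands of descriptions.
TOOLS = {
    "get_weather": "current weather forecast for a city",
    "search_movies": "search movies by title in a film database",
    "play_track": "play a music track or song by name",
    "convert_currency": "convert an amount between two currencies",
}


def embed(text):
    """Stand-in for a dense encoder (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Build the index once, ahead of time.
INDEX = {name: embed(desc) for name, desc in TOOLS.items()}


def tool_search(query, k=2):
    """Return the names of the k tools whose descriptions best match the query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda name: cosine(q, INDEX[name]), reverse=True)
    return ranked[:k]


print(tool_search("weather forecast for Paris", k=1))  # ['get_weather']
```

Only the returned names, with their descriptions, would be placed back into the context, which is what keeps the prompt small even when the registry is large.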

Autonomous Memory Folding for Long Horizon Tasks

Long sequences of tool calls, web results, and code responses eventually overflow the context window. DeepAgent addresses this with an autonomous memory folding step. When the model emits the fold token, an auxiliary LLM compresses the full history into three memories: Episodic Memory, which records task events; Working Memory, which records the current sub-goal and recent issues; and Tool Memory, which records tool names, arguments, and outcomes. These memories are fed back as structured text, so the agent continues from a compact but information-rich state.
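
The three-part memory can be pictured with a small data structure. The sketch below is an assumption-heavy illustration: the field layout, the step format, and the rule-based `fold` function are all hypothetical, and simple rules stand in for the auxiliary LLM that does the compression in DeepAgent.

```python
from dataclasses import dataclass, field


@dataclass
class FoldedMemory:
    episodic: list = field(default_factory=list)  # key task events
    working: dict = field(default_factory=dict)   # current sub-goal, recent issues
    tool: list = field(default_factory=list)      # (tool name, arguments, outcome)


def fold(history, current_goal):
    """Compress raw steps into three memories.

    Assumption: a real system would ask an auxiliary LLM to summarize;
    here plain rules stand in for that call.
    """
    mem = FoldedMemory()
    for step in history:
        if step["type"] == "tool_call":
            mem.tool.append((step["name"], step["args"], step["outcome"]))
            if step["outcome"] != "ok":
                mem.working.setdefault("recent_issues", []).append(
                    f"{step['name']} failed"
                )
        elif step["type"] == "event":
            mem.episodic.append(step["text"])
    mem.working["sub_goal"] = current_goal
    return mem


def render(mem):
    """Serialize the folded memories as structured text for the next prompt."""
    return "\n".join([
        "[Episodic Memory] " + "; ".join(mem.episodic),
        "[Working Memory] " + repr(mem.working),
        "[Tool Memory] " + "; ".join(f"{n}({a}) -> {o}" for n, a, o in mem.tool),
    ])


history = [
    {"type": "event", "text": "user asked for a Paris trip plan"},
    {"type": "tool_call", "name": "get_weather", "args": "city=Paris", "outcome": "ok"},
    {"type": "tool_call", "name": "book_hotel", "args": "city=Paris", "outcome": "error"},
]
print(render(fold(history, "find a hotel")))
```

The rendered text replaces the raw history, so the next reasoning step starts from a few lines instead of the full transcript.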

ToolPO, Reinforcement Learning for Tool Use

Supervised traces alone do not teach robust tool use, because the correct tool calls are only a few tokens inside a long generation. The research team introduces Tool Policy Optimization (ToolPO) to address this. ToolPO runs rollouts against LLM-simulated APIs, which keeps training stable and cheap. It then attributes reward to the exact tool call tokens, a step called tool call advantage attribution, and it trains with a clipped PPO-style objective. This is how the agent learns not only to call tools, but also to decide when to search and when to fold memory.
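
The idea of crediting tool call tokens separately can be sketched with a generic clipped surrogate. This is not the paper's exact objective. It assumes, for illustration, that every token receives a trajectory-level advantage `a_global` and that tokens inside a tool call receive an extra advantage `a_tool`; the function names and the token format are hypothetical.

```python
import math


def clipped_term(logp_new, logp_old, advantage, eps=0.2):
    """Standard PPO-style clipped surrogate for a single token."""
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def toolpo_style_objective(tokens, a_global, a_tool, eps=0.2):
    """Average clipped surrogate over a rollout.

    tokens: list of dicts with keys logp_new, logp_old, is_tool_call.
    Assumption: tool call tokens get a_global + a_tool, all others a_global.
    """
    total = 0.0
    for t in tokens:
        adv = a_global + (a_tool if t["is_tool_call"] else 0.0)
        total += clipped_term(t["logp_new"], t["logp_old"], adv, eps)
    return total / len(tokens)


rollout = [
    {"logp_new": 0.0, "logp_old": 0.0, "is_tool_call": False},  # thought token
    {"logp_new": 0.0, "logp_old": 0.0, "is_tool_call": True},   # tool call token
]
print(toolpo_style_objective(rollout, a_global=1.0, a_tool=0.5))  # 1.25
```

With the probability ratio at 1, the thought token contributes 1.0 and the tool call token 1.5, so the extra credit lands only on the few tokens that form the call.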


Benchmarks, Labeled Tools vs Open Set Tools

The research team evaluates on 5 general tool use benchmarks (ToolBench, API Bank, TMDB, Spotify, and ToolHop) and on 4 downstream tasks (ALFWorld, WebShop, GAIA, and HLE). In the labeled tool setting, where every method is given the exact tools it needs, DeepAgent 32B RL with a QwQ 32B backbone reports 69.0 on ToolBench, 75.3 on API Bank, 89.0 on TMDB, 75.4 on Spotify, and 51.3 on ToolHop, which is the strongest 32B-level result across all 5 datasets. Workflow baselines such as ReAct and CodeAct can match it on single datasets; for example, ReAct with strong models scores high on TMDB and Spotify. None of them stays high on all 5, so the fair summary is that DeepAgent is more uniform, not that the others are always low.

In the open set retrieval setting, which is the more realistic one, the agent must first find tools and then call them. Here DeepAgent 32B RL reaches 64.0 on ToolBench and 40.6 on ToolHop, while the strongest workflow baselines reach 55.0 on ToolBench and 36.2 on ToolHop, so the end-to-end agent still holds the lead. The research team also shows that autonomous tool retrieval by itself lifts workflow agents, but DeepAgent gains more, which suggests that the architecture and the training are well matched to large toolsets.
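
The margins implied by these figures are easy to check. The short script below uses only the scores quoted in this section; the variable names are illustrative, and the "drop" is computed for DeepAgent 32B RL alone, as the difference between its labeled tool score and its open set score.

```python
# Scores quoted in this article for DeepAgent 32B RL and the best workflow baseline.
LABELED = {"ToolBench": 69.0, "ToolHop": 51.3}
OPEN_SET = {"ToolBench": 64.0, "ToolHop": 40.6}
BEST_WORKFLOW_OPEN_SET = {"ToolBench": 55.0, "ToolHop": 36.2}


def summarize():
    """Lead over the best workflow baseline, and drop from labeled to open set."""
    rows = {}
    for name in OPEN_SET:
        rows[name] = {
            "lead_over_workflow": round(OPEN_SET[name] - BEST_WORKFLOW_OPEN_SET[name], 1),
            "drop_from_labeled": round(LABELED[name] - OPEN_SET[name], 1),
        }
    return rows


for name, row in summarize().items():
    print(name, row)
# ToolBench {'lead_over_workflow': 9.0, 'drop_from_labeled': 5.0}
# ToolHop {'lead_over_workflow': 4.4, 'drop_from_labeled': 10.7}
```

The lead is 9.0 points on ToolBench and 4.4 on ToolHop, while having to discover tools costs DeepAgent 5.0 and 10.7 points respectively relative to the labeled setting.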


Downstream Environments

On ALFWorld, WebShop, GAIA, and HLE, all with a 32B reasoning model, DeepAgent reports 91.8 percent success on ALFWorld, 34.4 percent success and a 56.3 score on WebShop, 53.3 on GAIA, and a higher score than workflow agents on HLE. These tasks are longer and noisier, so the combination of memory folding and ToolPO is the likely source of the gap.

Key Takeaways

  1. DeepAgent keeps the whole agent loop inside one reasoning stream: the model can think, search for tools, call them, and continue, so it is not limited to a fixed ReAct-style workflow.
  2. It uses dense retrieval over large tool registries (more than 16,000 RapidAPI tools and about 3,900 ToolHop tools), so tools do not have to be pre-listed in the prompt; they are discovered on demand.
  3. The autonomous memory folding module compresses long interaction histories into episodic, working, and tool memories, which prevents context overflow and keeps long-horizon reasoning stable.
  4. Tool Policy Optimization (ToolPO) trains tool use end to end with simulated APIs and token-level advantage attribution, so the agent learns to issue correct tool calls, not only to reach the final answer.
  5. On 5 tool benchmarks and 4 downstream tasks, DeepAgent at 32B scale is more consistent than workflow baselines in both labeled tool and open set settings, especially on ToolBench and ToolHop, where tool discovery matters most.
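
The single reasoning stream from the first takeaway can be pictured as one text output that a runtime splits into actions. The sketch below is a rough illustration: the tag names `<tool_search>`, `<tool_call>`, and `<fold>` are hypothetical placeholders, not DeepAgent's actual special tokens, and untagged text is simply treated as internal thought.

```python
import re

# Hypothetical tags (assumption); the real system uses its own special tokens.
ACTION_PATTERN = re.compile(r"<(tool_search|tool_call|fold)>(.*?)</\1>", re.DOTALL)


def parse_actions(text):
    """Split model output into (action_type, payload) pairs, in order."""
    actions, pos = [], 0
    for m in ACTION_PATTERN.finditer(text):
        thought = text[pos:m.start()].strip()
        if thought:
            actions.append(("thought", thought))
        actions.append((m.group(1), m.group(2).strip()))
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        actions.append(("thought", tail))
    return actions


stream = (
    "I need weather data. <tool_search>weather api</tool_search> "
    "Found it. <tool_call>get_weather(city='Paris')</tool_call><fold></fold>"
)
for action in parse_actions(stream):
    print(action)
```

A runtime built on this would route each pair to a retriever, a tool executor, or a memory folding step, then append the result to the same stream so the model keeps reasoning without leaving the loop.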

DeepAgent is a practical step toward agent architectures that do not depend on fixed tool prompts, because it unifies autonomous thinking, dense tool retrieval over more than 16,000 RapidAPI tools and more than 3,900 ToolHop tools, structured tool calling, and memory folding in one loop. The use of LLM-simulated APIs in ToolPO is an engineering choice, but it addresses the latency and instability problems that hurt prior tool agents. The evaluation shows consistent 32B-level gains in both labeled tool and open set settings, not isolated peaks. This release makes large toolspaces more usable for LLM agents. Overall, DeepAgent suggests that end-to-end tool agents with memory and RL are emerging as the default pattern.




Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
