Saturday, January 24, 2026
mGrowTech

Salesforce AI Research Introduces WALT (Web Agents that Learn Tools): Enabling LLM agents to Automatically Discover Reusable Tools from Any Website

by Josh
October 24, 2025
in AI, Analytics and Automation


A team of Salesforce AI researchers has introduced WALT (Web Agents that Learn Tools), a framework that reverse-engineers latent website functionality into reusable, invocable tools. It reframes browser automation around callable tools rather than long chains of clicks: instead of clicking through pages, agents call operations such as search, filter, sort, post_comment, and create_listing. This reduces dependence on step-by-step LLM reasoning and makes execution more deterministic.

https://arxiv.org/pdf/2510.01524

What WALT builds

Web agents often fail when layouts shift or when tasks require long action sequences. WALT targets this failure mode by mining site functionality offline, then exposing it as tools that encapsulate navigation, selection, extraction, and optional agentic steps. Tools carry contracts in the form of schemas and examples. At runtime, an agent composes a short program of a few tool calls to complete a task. The design goal is higher success with fewer steps and less reliance on free-form reasoning.
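
To make the idea of a tool contract concrete, here is a minimal Python sketch. It is not WALT's actual implementation: the Tool class, its schema format, and the in-memory LISTINGS data are all hypothetical stand-ins chosen so the example runs offline. It shows a schema check on inputs and a two-call "short program" of the kind described above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    """A callable tool with a minimal input contract (hypothetical structure)."""
    name: str
    schema: Dict[str, type]   # parameter name -> expected Python type
    run: Callable[..., Any]   # stand-in for the deterministic site script

    def __call__(self, **kwargs: Any) -> Any:
        # Reject calls that violate the contract before doing any work.
        missing = set(self.schema) - set(kwargs)
        extra = set(kwargs) - set(self.schema)
        if missing or extra:
            raise ValueError(
                f"{self.name}: missing={sorted(missing)} extra={sorted(extra)}"
            )
        for key, expected in self.schema.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{self.name}: {key} must be {expected.__name__}")
        return self.run(**kwargs)


# In-memory stand-in for a classifieds site (assumption, for illustration only).
LISTINGS = [
    {"title": "road bike", "price": 250},
    {"title": "mountain bike", "price": 400},
    {"title": "bike helmet", "price": 30},
]

search = Tool(
    "search",
    {"query": str},
    lambda query: [item for item in LISTINGS if query in item["title"]],
)
sort = Tool(
    "sort",
    {"items": list, "key": str},
    lambda items, key: sorted(items, key=lambda item: item[key]),
)


def cheapest(query: str) -> dict:
    """A short program: two tool calls instead of a long click sequence."""
    return sort(items=search(query=query), key="price")[0]


print(cheapest("bike"))  # {'title': 'bike helmet', 'price': 30}
```

Checking inputs against the schema before running anything means a malformed call fails immediately with a clear error, rather than partway through a page interaction.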

Pipeline in two phases

The pipeline has two phases: discovery, followed by construction and validation. In discovery, WALT explores a website and proposes tool candidates that map to common goals such as discovery, content management, and communication. In construction and validation, WALT converts interaction traces into deterministic scripts, stabilizes selectors, attempts URL promotion (replacing UI steps with a direct URL operation) when possible, induces an input schema, and registers a tool only after end-to-end checks pass. This shifts as much work as possible into stable URL and form operations and leaves agentic grounding for the cases that truly require it.
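
The validation gate at the end of the second phase can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: ToolRegistry, register_if_valid, and the two candidate functions are hypothetical names, and an end-to-end check is modeled simply as a set of inputs plus a predicate over the output.

```python
from typing import Callable, Dict, List, Tuple

ToolFn = Callable[..., object]
# An end-to-end check: keyword inputs plus a predicate over the tool's output.
Check = Tuple[dict, Callable[[object], bool]]


class ToolRegistry:
    """Registers a candidate tool only if every end-to-end check passes."""

    def __init__(self) -> None:
        self.tools: Dict[str, ToolFn] = {}

    def register_if_valid(
        self, name: str, candidate: ToolFn, checks: List[Check]
    ) -> bool:
        for inputs, is_ok in checks:
            try:
                if not is_ok(candidate(**inputs)):
                    return False
            except Exception:
                return False  # a crashing script is never registered
        self.tools[name] = candidate
        return True


def search_ok(query: str) -> list:
    """Hypothetical candidate that works against a toy catalog."""
    catalog = ["red lamp", "blue lamp", "desk"]
    return [title for title in catalog if query in title]


def search_broken(query: str) -> list:
    """Hypothetical candidate whose script fails at runtime."""
    raise RuntimeError("selector not found")


registry = ToolRegistry()
checks = [({"query": "lamp"}, lambda out: len(out) == 2)]

print(registry.register_if_valid("search", search_ok, checks))         # True
print(registry.register_if_valid("search_v2", search_broken, checks))  # False
print(sorted(registry.tools))                                          # ['search']
```

Gating registration on checks keeps unreliable candidates out of the agent's tool set, so failures surface offline rather than during a task.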

Results on VisualWebArena and WebArena

On VisualWebArena, WALT reports an average success rate of 52.9 percent, with per-split results of 64.1 percent on Classifieds, 53.4 percent on Shopping, and 39.0 percent on Reddit. The paper's results table lists baselines such as SGV at 50.2 percent and ExaCT at 33.7 percent. Human performance is 88.7 percent on average.

On WebArena, WALT reaches a 50.1 percent average across GitLab, Map, Shopping, CMS, Reddit, and Multi. The table shows WALT ahead of prior methods, with a nine-point margin over the best skill-induction baseline. Human performance is 78.2 percent.

Efficiency and ablations

Tools reduce action count by a factor of about 1.4 on average relative to a matched agent without tools. On the Classifieds split, ablations show consistent gains when tools are used across different agent backbones. WALT with GPT-5 mini records 7 percent higher success and 27 percent fewer steps, while a human demonstration strategy yields 66.0 percent success. The fully autonomous WALT reaches 64.1 percent with 5 percent fewer steps than the human demonstration case. Multimodal DOM parsing adds a 2.6 percent absolute improvement. External verification adds 3.3 percent while increasing the number of checks. Across components, WALT records 21.3 percent fewer steps than baseline policies.
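
The step savings above are quoted in two different units: a reduction factor (about 1.4) and a percentage of steps removed (21.3 percent). The text attributes them to different comparisons, so they need not agree. The short Python sketch below only shows how to convert between the two units; the function names are illustrative.

```python
def fraction_saved(factor: float) -> float:
    """Convert an 'N times fewer steps' factor into the fraction of steps removed."""
    return 1.0 - 1.0 / factor


def reduction_factor(fraction: float) -> float:
    """Convert a fraction of steps removed into an equivalent reduction factor."""
    return 1.0 / (1.0 - fraction)


print(f"{fraction_saved(1.4):.1%}")       # 28.6%
print(f"{reduction_factor(0.213):.2f}x")  # 1.27x
```

A 1.4x reduction corresponds to roughly 28.6 percent fewer steps, while 21.3 percent fewer steps corresponds to roughly 1.27x.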

Design choices that enforce determinism

WALT prefers URL-level operations when the site exposes query parameters or routes for search and filtering. When pages require dynamic grounding, the tool script inserts bounded agentic steps such as content extraction or waiting for page load. Selector stabilization and schema validation reduce drift when sites change. The method keeps the fraction of agentic operations low in discovered tool sets and biases toward deterministic actions like navigation, input, and click.
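
As an illustration of a URL-level operation, the Python sketch below builds a search URL directly from query parameters instead of typing into a search box and clicking filters. The host shop.example.com and the parameter names q, category, and sort are assumptions made for this example; a real site's parameters would have to be discovered from the site itself.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://shop.example.com"  # placeholder host, not a real target


def search_url(
    query: str,
    category: Optional[str] = None,
    sort_by: Optional[str] = None,
) -> str:
    """Build one deterministic URL that replaces a type-click-filter sequence."""
    params = {"q": query}  # parameter names are hypothetical
    if category is not None:
        params["category"] = category
    if sort_by is not None:
        params["sort"] = sort_by
    return f"{BASE_URL}/search?{urlencode(params)}"


url = search_url("blue kayak", category="boats", sort_by="price_asc")
print(url)
# https://shop.example.com/search?q=blue+kayak&category=boats&sort=price_asc

# The round trip recovers the inputs, which is what makes the call repeatable.
print(parse_qs(urlsplit(url).query))
# {'q': ['blue kayak'], 'category': ['boats'], 'sort': ['price_asc']}
```

Because the result depends only on the inputs, the same call always produces the same navigation target, regardless of how the page layout changes.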

Key Takeaways

  1. Approach: WALT discovers and validates website-native functions, then exposes them as callable tools with input schemas, selector stabilization, and URL promotion, reducing brittle step sequences to deterministic operations.
  2. Results on VisualWebArena: Average success rate 52.9%, with 64.1% on Classifieds, 53.4% on Shopping, and 39.0% on Reddit, outperforming several baselines reported in the paper.
  3. Results on WebArena: Average success rate 50.1% across GitLab, Map, Shopping, CMS, Reddit, and Multi, showing consistent gains over skill-induction and search-based baselines.
  4. Efficiency and Ablations: Toolization cuts steps by about 1.4x, with 21.3% fewer actions on average. Multimodal DOM parsing adds +2.6% absolute success, and external verification adds +3.3%.

WALT is a useful pivot from step-sequence agents to functionality-grounded tools. The framework reverse-engineers latent website functionality into reusable, invocable tools across discovery, content management, and communication. By promoting UI traces to deterministic tools with schema validation and URL operations, WALT lifts web agent success to 52.9 percent on VisualWebArena and 50.1 percent on WebArena, while cutting actions by about 21.3 percent. The release ships a CLI (walt discover, walt agent) and MCP serving for integration.
