TTT-Discover optimizes GPU kernels 2x faster than human experts — by training during inference

By Josh
February 8, 2026
in Technology And Software



Researchers from Stanford, Nvidia, and Together AI have developed a new technique that can discover novel solutions to very complex problems. For example, they used it to optimize a critical GPU kernel to run 2x faster than the previous state of the art written by human experts.


Their technique, called “Test-Time Training to Discover” (TTT-Discover), challenges the current paradigm of letting models “think longer” on reasoning problems. Instead, it allows the model to continue training during inference, updating its weights for the problem at hand.

The limits of 'frozen' reasoning

Current enterprise AI strategies often rely on "frozen" models. Whether you use a closed or open reasoning model, the model's parameters are static. When you prompt these models, they search for answers within the fixed manifold of their training data. This works well for problems that resemble what the model has seen before.

However, true discovery problems, like inventing a novel algorithm or proving a new mathematical theorem, are, by definition, out-of-distribution. If the solution requires a leap of logic that doesn't exist in the training set, a frozen model will likely fail, no matter how much compute you throw at it during inference.

In comments to VentureBeat, Mert Yuksekgonul, a co-author of the paper and a doctoral student at Stanford, illustrated this distinction using a famous mathematical breakthrough:

"I believe that thinking models wouldn't be able to prove, for example, P != NP, without test-time training, just like Andrew Wiles wouldn't be able to prove Fermat's Last Theorem without the 7 years he spent pursuing this single problem in isolation and continuously learning from his own failures."

TTT-Discover treats the test problem not as a query to be answered, but as an environment to be mastered. As the model attempts to solve the problem, it generates different types of data: failures, partial successes, and errors. Instead of discarding this data, TTT-Discover uses it to update the model's weights in real-time, effectively allowing the model to laser focus on that specific challenge as opposed to developing a very general problem-solving framework.
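
The loop described above can be sketched roughly as follows. This is a minimal illustration based on the article's description, not the released code; the generate, verify, and train_step calls are hypothetical placeholders for the rollout, verifier, and optimizer steps.

```python
# Minimal sketch of a test-time training loop, per the description above.
# All method names are illustrative placeholders, not the paper's released code.

def ttt_discover(model, problem, steps=50, rollouts_per_step=64):
    best_score, best_artifact = float("-inf"), None
    for _ in range(steps):
        # 1. Attempt the problem many times with the current weights.
        attempts = [model.generate(problem) for _ in range(rollouts_per_step)]

        # 2. Score every attempt with a verifier that returns a scalar
        #    (e.g., negative runtime in microseconds, or negative error rate).
        scored = [(attempt, problem.verify(attempt)) for attempt in attempts]

        # 3. Keep the best artifact found so far; the artifact, not the
        #    final network, is the real output of the procedure.
        for attempt, score in scored:
            if score > best_score:
                best_score, best_artifact = score, attempt

        # 4. Update the model's weights on this problem's own attempts,
        #    so the next round of rollouts is specialized to it.
        model.train_step(scored)

    return best_artifact, best_score
```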

A different approach to reinforcement learning

TTT-Discover represents a fundamental shift in how reasoning models are trained. In standard reinforcement learning (RL) training, the goal is a generalist policy that performs well on average across many tasks. In TTT-Discover, the goal is to find the best solution to a very specific problem, and the policy is “a means towards this end,” according to the authors. Once the model discovers the artifact (i.e., the optimized code, the proof, or the molecule), the neural network that produced it can be discarded.

To achieve this, the researchers engineered two specific components that differentiate TTT-Discover from standard reinforcement learning:

  1. Entropic objective: Standard RL optimizes for the average expected reward. If a model tries a risky path and fails, standard RL punishes it. TTT-Discover flips this: it uses an "entropic objective" that exponentially weights high-reward outcomes. This forces the model to ignore "safe," average answers and aggressively hunt for "eureka" outliers, solutions that have a low probability of being found but offer a massive reward (see the sketch after this list).

  2. PUCT search: The system uses PUCT, a tree-search algorithm popularized by AlphaZero, to explore different solution paths and build a dataset of attempts. The model then trains on this dataset in real time, learning to recognize which partial steps lead to high-reward outcomes.
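
To make the first component concrete, the snippet below contrasts how a standard mean-reward objective and an exponentially weighted ("entropic") objective distribute training weight across a batch of rollouts. It is a minimal sketch with made-up rewards and an assumed temperature parameter, not the paper's exact formulation.

```python
import numpy as np

def mean_reward_weights(rewards):
    # Standard-RL flavor: each rollout's weight is proportional to its reward,
    # so a single outlier is diluted by the rest of the batch.
    r = np.asarray(rewards, dtype=float)
    return r / r.sum()

def entropic_weights(rewards, temperature=0.1):
    # Entropic flavor: exponentiate rewards before normalizing, so the
    # highest-reward ("eureka") rollouts dominate the training signal.
    r = np.asarray(rewards, dtype=float)
    z = (r - r.max()) / temperature   # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

rewards = [0.10, 0.12, 0.11, 0.95]    # one rare, high-reward outlier
print(mean_reward_weights(rewards))   # outlier gets ~74% of the weight
print(entropic_weights(rewards))      # outlier gets ~99.9% of the weight
```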

Crucially, this method works best on problems with a continuous reward signal. The system needs a way to measure incremental progress such as "runtime in microseconds" or "error rate" rather than a binary "pass/fail" signal. This allows the model to follow the gradual improvement toward the optimal solution.
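
For intuition, a continuous verifier for the kernel-optimization case can simply time a candidate and return a score the search can climb, whereas a binary verifier stays at zero until the target is hit. The toy verifier below (timing a matrix-multiply candidate against a baseline) is an illustrative assumption, not part of the released system.

```python
import time
import numpy as np

def continuous_reward(candidate_fn, a, b, baseline_us):
    # Continuous signal: any speedup, however small, moves the score,
    # so the search can follow incremental progress toward the optimum.
    start = time.perf_counter()
    out = candidate_fn(a, b)
    runtime_us = (time.perf_counter() - start) * 1e6
    if not np.allclose(out, a @ b):    # incorrect results earn no reward
        return 0.0
    return baseline_us / runtime_us    # >1.0 means faster than the baseline

def binary_reward(candidate_fn, a, b, target_us):
    # Binary signal: flat at 0 until the target is reached, which gives
    # the search nothing to climb in between.
    return 1.0 if continuous_reward(candidate_fn, a, b, target_us) >= 1.0 else 0.0

# Example: score a trivial candidate against a 100-microsecond baseline.
a, b = np.random.rand(256, 256), np.random.rand(256, 256)
print(continuous_reward(np.matmul, a, b, baseline_us=100.0))
```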

The economics of 'heavy inference'

For enterprises accustomed to paying fractions of a cent per API call, the cost profile of TTT-Discover requires a mindset shift. In their experiments, the researchers reported that a single discovery run involves approximately 50 training steps and thousands of rollouts, costing roughly $500 per problem.

TTT-Discover is therefore best suited to “static, high-value assets,” as opposed to trivial, recurring problems that can be solved with existing models and approaches.

Consider a cloud-native enterprise running a data pipeline that processes petabytes of information nightly. If that pipeline relies on a specific SQL query or GPU kernel, optimizing that code by just 1% could save hundreds of thousands of dollars in annual compute costs. In this context, spending $500 to find a kernel that is 50% faster is a trivial expense with an immediate ROI.
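
A rough back-of-the-envelope calculation makes the point; the dollar figures and the kernel's share of total spend below are assumptions for illustration, not numbers from the paper.

```python
# Hypothetical numbers, for illustration only.
annual_gpu_spend = 20_000_000   # $20M/year of compute for the nightly pipeline
kernel_share = 0.05             # the target kernel accounts for 5% of that spend
cost_reduction = 0.50           # assume the discovered kernel is 2x faster, halving its cost
discovery_cost = 500            # one TTT-Discover run

annual_savings = annual_gpu_spend * kernel_share * cost_reduction
print(f"${annual_savings:,.0f} saved per year")          # $500,000 saved per year
print(f"{annual_savings / discovery_cost:,.0f}x return") # 1,000x return on the $500 run
```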

"This makes the most sense for low-frequency, high-impact decisions where a single improvement is worth far more than the compute cost," Yuksekgonul said. "Supply chain routing, drug design, and material discovery qualify. In these settings, spending hundreds of dollars on a single discovery step can easily pay for itself."

Implementation considerations

One of the most significant findings for enterprise adoption is that TTT-Discover does not require a proprietary frontier model. The researchers achieved state-of-the-art results using gpt-oss-120b, OpenAI’s open-weights model. They have also released the code for TTT-Discover so that other researchers and developers can use it with their own models.

Because the technique works with open models, companies can run this "discovery loop" entirely within their own secure VPCs or on-premise H100 clusters without sending their proprietary data to third-party servers.

“If a company already runs reinforcement learning, there is no additional infrastructure required,” Yuksekgonul said. “TTT-Discover uses the same training stack (GPUs, rollout workers, optimizers, checkpointing).” 

If they don’t already run RL, they would need to build that infrastructure. But enterprises can also use existing solutions to reduce the complexity of the process. The researchers orchestrated these training runs using Tinker, an API from Thinking Machines that manages the complexity of distributed training and inference.

“Tooling such as Tinker (and open variants, e.g., OpenTinker) lowers the setup cost, and both labor and compute costs are likely to drop over time,” he said.

Real-world use cases

The researchers deployed TTT-Discover across four distinct technical domains: systems engineering, algorithm design, biology, and mathematics. In almost every instance, the method set a new state-of-the-art.

In one experiment, the model optimized GPU kernels for matrix multiplication (including the "TriMul" kernel used in AlphaFold), achieving execution speeds up to 2x faster than prior state-of-the-art and outperforming the best human-written kernels on the leaderboard.

In competitive programming scenarios (AtCoder), it solved complex heuristic problems (e.g., optimizing geometric constraints for fishing nets) better than top human experts and prior AI baselines.

For the enterprise, the transition from these academic benchmarks to business value hinges on one specific constraint: the existence of a verifiable, scalar signal. Unlike a chatbot that generates text, TTT-Discover needs a hard metric (e.g., runtime, error rate, or profit margin) to optimize against.

Yuksekgonul said that this requirement draws a clear line between where this technology should and shouldn't be used. "At the moment, the key requirement is a reliable scalar signal of progress — cost, error, molecular properties — that the system can optimize against," he said.

This directs enterprise adoption toward "hard" engineering and operations challenges such as logistics, supply chain, and resource management, where problems like fleet routing or crew scheduling often rely on static heuristics. TTT-Discover can treat these as optimization environments, spending hours to find a route structure that shaves 5% off daily fuel costs.

The requirement for clear verifiers rules out qualitative tasks like "write a better marketing strategy," where verification is subjective and prone to noise.

"Hard to verify problems are still an open question,” Yuksekgonul said.

With current technology, the best path forward is to try to design verifiers, but “making those verifiers robust and hard to game is challenging, and we don’t have a good solution yet," he added.

From inference to invention

The broader implication is that enterprise AI stacks may need to evolve to support this kind of per-problem learning.

“Systems built around a frozen model will need to support per-problem (or per-domain) adaptation, and enterprises will need better problem specifications and internal feedback signals to make test-time learning effective,” Yuksekgonul said. “If training runs inside a private VPC, the training loop can also be integrated with more of the company’s internal environment, not just a central lab pipeline.”

For the enterprise, the value lies in identifying “million-dollar problems”: optimization challenges where a verifiable metric exists but human progress has stalled. These are the candidates for TTT-Discover. By accepting higher latency and cost for specific queries, enterprises can turn their inference compute into an automated R&D lab, discovering solutions that were previously out of reach for both humans and frozen AI models.


