mGrowTech

TTT-Discover optimizes GPU kernels 2x faster than human experts — by training during inference

By Josh
February 8, 2026
in Technology And Software



Researchers from Stanford, Nvidia, and Together AI have developed a new technique that can discover new solutions to very complex problems. For example, they managed to optimize a critical GPU kernel to run 2x faster than the previous state-of-the-art written by human experts.


Their technique, called “Test-Time Training to Discover” (TTT-Discover), challenges the current paradigm of letting models “think longer” for reasoning problems. TTT-Discover allows the model to continue training during the inference process and update its weights for the problem at hand.

The limits of 'frozen' reasoning

Current enterprise AI strategies often rely on "frozen" models. Whether you use a closed or open reasoning model, the model's parameters are static. When you prompt these models, they search for answers within the fixed manifold of their training data. This works well for problems that resemble what the model has seen before.

However, true discovery problems, like inventing a novel algorithm or proving a new mathematical theorem, are, by definition, out-of-distribution. If the solution requires a leap of logic that doesn't exist in the training set, a frozen model will likely fail, no matter how much compute you throw at it during inference.

In comments to VentureBeat, Mert Yuksekgonul, a co-author of the paper and a doctoral student at Stanford, illustrated this distinction using a famous mathematical breakthrough:

"I believe that thinking models wouldn't be able to prove, for example, P != NP, without test-time training, just like Andrew Wiles wouldn't be able to prove Fermat's Last Theorem without the 7 years he spent pursuing this single problem in isolation and continuously learning from his own failures."

TTT-Discover treats the test problem not as a query to be answered, but as an environment to be mastered. As the model attempts to solve the problem, it generates different kinds of data: failures, partial successes, and errors. Instead of discarding this data, TTT-Discover uses it to update the model's weights in real time, letting the model specialize in that one challenge rather than rely on a general problem-solving policy.
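The loop described above can be sketched in a few lines of self-contained Python. Everything here is invented for illustration: `ToyModel`, its crude reward-weighted update, and the toy objective are stand-ins, while the actual system fine-tunes an LLM's weights with gradient steps.

```python
import random

random.seed(0)  # make the toy run reproducible

class ToyModel:
    """Stand-in for an LLM policy: proposes numbers near a learned mean."""
    def __init__(self):
        self.mean, self.spread = 0.0, 10.0

    def generate(self):
        return random.gauss(self.mean, self.spread)

    def train_on(self, scored):
        # Shift the policy toward its own high-reward attempts -- a crude,
        # reward-weighted update standing in for real gradient steps.
        best = max(scored, key=lambda cr: cr[1])[0]
        self.mean = 0.5 * self.mean + 0.5 * best
        self.spread *= 0.9  # concentrate as the policy specializes

def ttt_discover(evaluate, n_steps=50, n_rollouts=32):
    model = ToyModel()
    best_solution, best_reward = None, float("-inf")
    for _ in range(n_steps):
        # 1. Roll out candidate solutions for this single problem.
        scored = [(c, evaluate(c))
                  for c in (model.generate() for _ in range(n_rollouts))]
        # 2. Track the best artifact found so far -- the real output.
        for cand, reward in scored:
            if reward > best_reward:
                best_solution, best_reward = cand, reward
        # 3. Update weights on the model's own attempts: the step that
        #    distinguishes test-time training from a frozen model.
        model.train_on(scored)
    # The fine-tuned weights can be discarded; only the artifact matters.
    return best_solution, best_reward

# Toy "discovery" target: find x maximizing -(x - 7)^2.
solution, reward = ttt_discover(lambda x: -(x - 7.0) ** 2)
```

The point of the sketch is the shape of the loop: rollouts produce data, the data updates the policy, and only the best artifact survives.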

A different approach to reinforcement learning

TTT-Discover represents a fundamental shift in how reasoning models are trained. In standard reinforcement learning (RL), the goal is a generalist policy that performs well on average across many tasks. In TTT-Discover, the goal is to find the best solution to one specific problem, and the policy is “a means towards this end,” according to the authors. Once the model discovers the artifact (the optimized code, the proof, or the molecule), the neural network that produced it can be discarded.

To achieve this, the researchers engineered two specific components that differentiate TTT-Discover from standard reinforcement learning:

  1. Entropic objective: Standard RL optimizes for the average expected reward. If a model tries a risky path and fails, standard RL punishes it. TTT-Discover flips this: it uses an "entropic objective" that exponentially weights high-reward outcomes, forcing the model to ignore "safe," average answers and aggressively hunt for "eureka" outliers, solutions that have a low probability of being found but offer a massive reward.

  2. PUCT search: The system introduces PUCT, a tree-search algorithm inspired by AlphaZero. It explores different solution paths, building a dataset of attempts. The model then trains on this dataset in real-time, learning to recognize which partial steps lead to high-reward outcomes.
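Both components can be sketched concretely. The helper names, the temperature `beta`, and the exploration constant `c_puct` below are illustrative choices of mine, not values from the paper:

```python
import math

def entropic_weights(rewards, beta=5.0):
    """Exponentially up-weight high-reward rollouts (a softmax over rewards).
    Averaging would treat all attempts similarly; the entropic objective
    makes rare 'eureka' outcomes dominate the training signal."""
    m = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp(beta * (r - m)) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]

def puct_score(q_value, prior, parent_visits, child_visits, c_puct=1.5):
    """AlphaZero-style PUCT: exploit branches with high observed value,
    but explore rarely-visited branches the policy thinks are promising."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + exploration

# Three rollouts: two "safe" average results and one outlier.
w = entropic_weights([0.1, 0.2, 0.9])
```

With `beta=5.0`, the single outlier reward of 0.9 captures over 90% of the training weight, which is the "hunt for outliers" behavior described above.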

Crucially, this method works best on problems with a continuous reward signal. The system needs a way to measure incremental progress, such as "runtime in microseconds" or "error rate," rather than a binary "pass/fail" signal. This lets the model follow gradual improvements toward the optimal solution.

The economics of 'heavy inference'

For enterprises accustomed to paying fractions of a cent per API call, the cost profile of TTT-Discover requires a mindset shift. In their experiments, the researchers reported that a single discovery run involves approximately 50 training steps and thousands of rollouts, costing roughly $500 per problem.

TTT-Discover is best reserved for “static, high-value assets,” as opposed to trivial, recurring problems that can be solved with existing models and approaches.

Consider a cloud-native enterprise running a data pipeline that processes petabytes of information nightly. If that pipeline relies on a specific SQL query or GPU kernel, optimizing that code by just 1% could save hundreds of thousands of dollars in annual compute costs. In this context, spending $500 to find a kernel that is 50% faster is a trivial expense with an immediate ROI.
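The back-of-the-envelope math is straightforward. Assuming, purely for illustration, a $10M annual pipeline spend (a figure not from the article), even a 1% improvement repays the $500 discovery cost within days:

```python
# Back-of-the-envelope ROI for a single discovery run.
# All figures are illustrative assumptions, not data from the article.
annual_compute_cost = 10_000_000   # hypothetical $10M/year pipeline spend
speedup_fraction = 0.01            # a mere 1% kernel improvement
discovery_cost = 500               # one TTT-Discover run, per the paper

annual_savings = annual_compute_cost * speedup_fraction   # $100,000/year
payback_days = discovery_cost / (annual_savings / 365)    # under 2 days
```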

"This makes the most sense for low-frequency, high-impact decisions where a single improvement is worth far more than the compute cost," Yuksekgonul said. "Supply chain routing, drug design, and material discovery qualify. In these settings, spending hundreds of dollars on a single discovery step can easily pay for itself."

Implementation considerations

One of the most significant findings for enterprise adoption is that TTT-Discover does not require a proprietary frontier model. The researchers achieved state-of-the-art results using gpt-oss-120b, OpenAI’s open-weights model, and have released the code for TTT-Discover so that researchers and developers can use it with their own models.

Because the technique works with open models, companies can run this "discovery loop" entirely within their own secure VPCs or on-premise H100 clusters without sending their proprietary data to third-party servers.

“If a company already runs reinforcement learning, there is no additional infrastructure required,” Yuksekgonul said. “TTT-Discover uses the same training stack (GPUs, rollout workers, optimizers, checkpointing).” 

Enterprises that don’t already run RL would need to build that infrastructure, but they can use existing solutions to reduce the complexity. The researchers orchestrated their training runs with the Tinker API by Thinking Machines, which manages the complexity of distributed training and inference.

“Tooling such as Tinker (and open variants, e.g., OpenTinker) lowers the setup cost, and both labor and compute costs are likely to drop over time,” he said.

Real-world use cases

The researchers deployed TTT-Discover across four distinct technical domains: systems engineering, algorithm design, biology, and mathematics. In almost every instance, the method set a new state-of-the-art.

In one experiment, the model optimized GPU kernels for matrix multiplication (including the "TriMul" kernel used in AlphaFold), achieving execution speeds up to 2x faster than prior state-of-the-art and outperforming the best human-written kernels on the leaderboard.

In competitive programming scenarios (AtCoder), it solved complex heuristic problems (e.g., optimizing geometric constraints for fishing nets) better than top human experts and prior AI baselines.

For the enterprise, the transition from these academic benchmarks to business value hinges on one specific constraint: the existence of a verifiable, scalar signal. Unlike a chatbot that generates text, TTT-Discover needs a hard metric (e.g., runtime, error rate, or profit margin) to optimize against.
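A minimal example of such a verifier, assuming the candidate is a Python callable and runtime is the metric. This is an illustrative harness of my own, not the paper's evaluation code:

```python
import time

def runtime_reward(candidate_fn, workload, trials=5):
    """A verifiable, continuous reward: negative median runtime in
    microseconds. Unlike a binary pass/fail check, every microsecond
    shaved off moves the score, giving the optimizer a measurable
    gradient of progress to follow."""
    timings = []
    for _ in range(trials):
        start = time.perf_counter()
        candidate_fn(workload)
        timings.append((time.perf_counter() - start) * 1e6)
    timings.sort()
    return -timings[len(timings) // 2]  # higher (less negative) is better

# A faster implementation should score strictly higher.
slow = runtime_reward(lambda xs: sum(x * x for x in xs), list(range(50_000)))
fast = runtime_reward(lambda xs: len(xs), list(range(50_000)))
```

The median over several trials damps timing noise, which matters because a noisy verifier gives the optimizer a misleading signal.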

Yuksekgonul said that this requirement draws a clear line between where this technology should and shouldn't be used. "At the moment, the key requirement is a reliable scalar signal of progress — cost, error, molecular properties — that the system can optimize against," he said.

This directs enterprise adoption toward "hard" engineering and operations challenges such as logistics, supply chain, and resource management, where problems like fleet routing or crew scheduling often rely on static heuristics. TTT-Discover can treat these as optimization environments, spending hours to find a route structure that shaves 5% off daily fuel costs.

The requirement for clear verifiers rules out qualitative tasks like "write a better marketing strategy," where verification is subjective and prone to noise.

“Hard-to-verify problems are still an open question,” Yuksekgonul said.

With current technology, the best path forward is to try to design verifiers, but “making those verifiers robust and hard to game is challenging, and we don’t have a good solution yet," he added.

From inference to invention

The broader implication is that enterprise AI stacks may need to evolve to support this kind of per-problem learning.

“Systems built around a frozen model will need to support per-problem (or per-domain) adaptation, and enterprises will need better problem specifications and internal feedback signals to make test-time learning effective,” Yuksekgonul said. “If training runs inside a private VPC, the training loop can also be integrated with more of the company’s internal environment, not just a central lab pipeline.”

For the enterprise, the value lies in identifying "million-dollar problems,” optimization challenges where a verifiable metric exists, but human progress has stalled. These are the candidates for TTT-Discover. By accepting higher latency and cost for specific queries, enterprises can turn their inference compute into an automated R&D lab, discovering solutions that were previously out of reach for both humans and frozen AI models.


