Thursday, August 7, 2025
mGrowTech

The initial reactions to OpenAI’s landmark open source gpt-oss models are highly varied and mixed

By Josh
August 6, 2025
in Technology And Software

OpenAI’s long-awaited return to the “open” in its name came yesterday with the release of two new large language models (LLMs): gpt-oss-120B and gpt-oss-20B.

But although the models post benchmark scores on par with OpenAI’s powerful proprietary offerings, the initial response from the broader AI developer and user community has been all over the map. If this release were a movie being graded on Rotten Tomatoes, we’d be looking at a near 50% split, based on my observations.


First some background: OpenAI has released these two new text-only language models (no image generation or analysis) both under the permissive open source Apache 2.0 license — the first time since 2019 (before ChatGPT) that the company has done so with a cutting-edge language model.

The entire ChatGPT era of the last 2.7 years has so far been powered by proprietary or closed-source models, ones that OpenAI controlled and that users had to pay to access (or use a free tier subject to limits), with limited customizability and no way to run them offline or on private computing hardware.



But that all changed yesterday with the release of the pair of gpt-oss models: a larger, more powerful one that runs on a single Nvidia H100 GPU at, say, a small or medium-sized enterprise’s data center or server farm, and a smaller one that runs on a single consumer laptop or desktop PC like the kind in your home office.

Of course, because the models are so new, it has taken several hours for the AI power user community to independently run them and test them on their own benchmarks (measurements) and tasks.

And now we’re getting a wave of feedback ranging from optimistic enthusiasm about the potential of these powerful, free, and efficient new models to an undercurrent of dissatisfaction and dismay with what some users see as significant problems and limitations. Those limitations stand out especially when the models are compared to the wave of powerful open source, multimodal LLMs from Chinese startups, which are also Apache 2.0-licensed (and can likewise be taken, customized, and run locally for free on U.S. hardware by U.S. companies, or by companies anywhere else in the world).

High benchmarks, but still behind Chinese open source leaders

Intelligence benchmarks place the gpt-oss models ahead of most American open-source offerings. According to independent third-party AI benchmarking firm Artificial Analysis, gpt-oss-120B is “the most intelligent American open weights model,” though it still falls short of Chinese heavyweights like DeepSeek R1 and Qwen3 235B.

“On reflection, that’s all they did. Mogged on benchmarks,” wrote self-proclaimed DeepSeek “stan” @teortaxesTex. “No good derivative models will be trained… No new usecases created… Barren claim to bragging rights.”

That skepticism is echoed by pseudonymous open source AI researcher Teknium (@Teknium1), co-founder of rival open source AI model provider Nous Research, who called the release “a legitimate nothing burger,” on X, and predicted a Chinese model will soon eclipse it. “Overall very disappointed and I legitimately came open minded to this,” they wrote.

Bench-maxxing on math and coding at the expense of writing?

Other criticism focused on the gpt-oss models’ apparent narrow usefulness.

AI influencer “Lisan al Gaib” (@scaling01) noted that the models excel at math and coding but “completely lack taste and common sense.” He added, “So it’s just a math model?”

In creative writing tests, some users found the model injecting equations into poetic outputs. “This is what happens when you benchmarkmax,” Teknium remarked, sharing a screenshot where the model added an integral formula mid-poem.

And @kalomaze, a researcher at decentralized AI model training company Prime Intellect, wrote that “gpt-oss-120b knows less about the world than what a good 32b does. probably wanted to avoid copyright issues so they likely pretrained on majority synth. pretty devastating stuff”

Former Googler and independent AI developer Kyle Corbitt agreed that the gpt-oss pair of models seemed to have been trained primarily on synthetic data — that is, data generated by an AI model specifically for the purposes of training a new one — making it “extremely spiky.”

It’s “great at the tasks it’s trained on, really bad at everything else,” Corbitt wrote, i.e., great on coding and math problems, and bad at more linguistic tasks like creative writing or report generation.

In other words, the charge is that OpenAI deliberately trained the model on more synthetic data than real-world facts and figures to avoid using copyrighted data scraped from websites and other repositories it doesn’t own or have a license to use. It and many other leading gen AI companies have been accused of that practice in the past and face ongoing lawsuits as a result.

Others speculated OpenAI may have trained the model on primarily synthetic data to avoid safety and security issues, resulting in worse quality than if it had been trained on more real world (and presumably copyrighted) data.

Concerning third-party benchmark results

Moreover, evaluations of the models on third-party benchmarks have turned up metrics that some users find concerning.

SpeechMap, which measures how readily LLMs comply with user prompts to generate disallowed, biased, or politically sensitive outputs, showed compliance scores for gpt-oss-120B hovering under 40%, near the bottom of peer open models. That indicates a resistance to following user requests and a tendency to default to guardrails, potentially at the expense of providing accurate information.

In Aider’s Polyglot evaluation, which tests coding across multiple programming languages, gpt-oss-120B scored just 41.8%, far below competitors like Kimi-K2 (59.1%) and DeepSeek-R1 (56.9%).

Some users also said their tests indicated the model is oddly resistant to generating criticism of China or Russia, a contrast to its treatment of the US and EU, raising questions about bias and training data filtering.

Other experts have applauded the release and what it signals for U.S. open source AI

To be fair, not all the commentary is negative. Software engineer and close AI watcher Simon Willison called the release “really impressive” on X, elaborating in a blog post on the models’ efficiency and ability to achieve parity with OpenAI’s proprietary o3-mini and o4-mini models.

He praised their strong performance on reasoning and STEM-heavy benchmarks, and hailed the new “Harmony” prompt template format — which offers developers more structured terms for guiding model responses — and support for third-party tool use as meaningful contributions.

In a lengthy X post, Clem Delangue, CEO and co-founder of AI code sharing and open source community Hugging Face, encouraged users not to rush to judgment, pointing out that inference for these models is complex, and early issues could be due to infrastructure instability and insufficient optimization among hosting providers.

“The power of open-source is that there’s no cheating,” Delangue wrote. “We’ll uncover all the strengths and limitations… progressively.”

Even more cautious was Ethan Mollick, a professor at the Wharton School of Business at the University of Pennsylvania, who wrote on X that “The US now likely has the leading open weights models (or close to it),” but questioned whether this is a one-off by OpenAI. “The lead will evaporate quickly as others catch up,” he noted, adding that it’s unclear what incentives OpenAI has to keep the models updated.

Nathan Lambert, a leading AI researcher and commentator at the rival open source lab Allen Institute for AI (Ai2), praised the symbolic significance of the release on his blog Interconnects, calling it “a phenomenal step for the open ecosystem, especially for the West and its allies, that the most known brand in the AI space has returned to openly releasing models.”

But he cautioned on X that gpt-oss is “unlikely to meaningfully slow down [Chinese e-commerce giant Alibaba’s AI team] Qwen,” citing its usability, performance, and variety.

He argued the release marks an important shift in the U.S. toward open models, but that OpenAI still has a “long path back” to catch up in practice.

A split verdict

The verdict, for now, is split.

OpenAI’s gpt-oss models are a landmark in terms of licensing and accessibility.

But while the benchmarks look solid, the real-world “vibes,” as many users describe them, are proving less compelling.

Whether developers can build strong applications and derivatives on top of gpt-oss will determine whether the release is remembered as a breakthrough or a blip.
