Tuesday, March 10, 2026
mGrowTech

Baidu unveils proprietary ERNIE 5 beating GPT-5 performance on charts, document understanding and more

by Josh
November 14, 2025
in Technology And Software



Mere hours after OpenAI updated its flagship foundation model GPT-5 to GPT-5.1, promising reduced token usage overall and a more pleasant personality with more preset options, Chinese search giant Baidu unveiled its next-generation foundation model, ERNIE 5.0, alongside a suite of AI product upgrades and strategic international expansions.

The goal: to position Baidu as a global contender in the increasingly competitive enterprise AI market.

Announced at the company's Baidu World 2025 event, ERNIE 5.0 is a proprietary, natively omni-modal model designed to jointly process and generate content across text, images, audio, and video.

Unlike Baidu’s recently released ERNIE-4.5-VL-28B-A3B-Thinking, which is open source under the permissive, enterprise-friendly Apache 2.0 license, ERNIE 5.0 is proprietary. It is available only via Baidu’s ERNIE Bot website (I needed to select it manually from the model picker dropdown) and, for enterprise customers, through the application programming interface (API) of the Qianfan cloud platform.

Alongside the model launch, Baidu introduced major updates to its digital human platform, no-code tools, and general-purpose AI agents — all targeted at expanding its AI footprint beyond China.

The company also introduced ERNIE 5.0 Preview 1022, a variant optimized for text-intensive tasks, alongside the general preview model that balances across modalities.

Baidu emphasized that ERNIE 5.0 represents a shift in how intelligence is deployed at scale, with CEO Robin Li stating: “When you internalize AI, it becomes a native capability and transforms intelligence from a cost into a source of productivity.”

Where ERNIE 5.0 outshines GPT-5 and Gemini 2.5 Pro

ERNIE 5.0’s benchmark results suggest that Baidu has achieved parity—or near-parity—with the top Western foundation models across a wide spectrum of tasks.

In public benchmark slides shared during the Baidu World 2025 event, ERNIE 5.0 Preview outperformed or matched OpenAI’s GPT-5-High and Google’s Gemini 2.5 Pro in multimodal reasoning, document understanding, and image-based QA, while also demonstrating strong language modeling and code execution abilities.

The company emphasized its ability to handle joint inputs and outputs across modalities, rather than relying on post-hoc modality fusion, which it framed as a technical differentiator.

On visual tasks, ERNIE 5.0 achieved leading scores on OCRBench, DocVQA, and ChartQA, three benchmarks that test document recognition, comprehension, and structured data reasoning.

Baidu claims the model beat both GPT-5-High and Gemini 2.5 Pro on these document and chart-based benchmarks, areas it describes as core to enterprise applications like automated document processing and financial analysis.

In image generation, ERNIE 5.0 tied or exceeded Google’s Veo3 across categories including semantic alignment and image quality, according to Baidu’s internal GenEval-based evaluation. Baidu claimed that the model’s multimodal integration allows it to generate and interpret visual content with greater contextual awareness than models relying on modality-specific encoders.

For audio and speech tasks, ERNIE 5.0 demonstrated competitive results on MM-AU and TUT2017 audio understanding benchmarks, as well as question answering from spoken language inputs. Its audio performance, while not as heavily emphasized as vision or text, suggests a broad capability footprint intended to support full-spectrum multimodal applications.

In language tasks, the model showed strong results on instruction following, factual question answering, and mathematical reasoning—core areas that define the enterprise utility of large language models.

The Preview 1022 variant of ERNIE 5.0, tailored for textual performance, showed even stronger language-specific results in early developer access. While Baidu does not claim broad superiority in general language reasoning, its internal evaluations suggest that ERNIE 5.0 Preview 1022 closes the gap with top-tier English-language models and outperforms them on Chinese-language tasks.

While Baidu did not release full benchmark details or raw scores publicly, its performance positioning suggests a deliberate attempt to frame ERNIE 5.0 not as a niche multimodal system but as a flagship model competitive with the largest closed models in general-purpose reasoning.

Where Baidu claims a clear lead is in structured document understanding, visual chart reasoning, and integration of multiple modalities into a single, native modeling architecture. Independent verification of these results remains pending, but the breadth of claimed capabilities positions ERNIE 5.0 as a serious alternative in the multimodal foundation model landscape.

Enterprise Pricing Strategy

ERNIE 5.0 is positioned at the premium end of Baidu’s model pricing structure. The company has released specific pricing for API usage on its Qianfan platform, aligning the cost with other top-tier offerings from Chinese competitors like Alibaba.

| Model | Input Cost (per 1K tokens) | Output Cost (per 1K tokens) | Source |
|---|---|---|---|
| ERNIE 5.0 | $0.00085 (¥0.006) | $0.0034 (¥0.024) | Qianfan |
| ERNIE 4.5 Turbo (ex.) | $0.00011 (¥0.0008) | $0.00045 (¥0.0032) | Qianfan |
| Qwen3 (Coder ex.) | $0.00085 (¥0.006) | $0.0034 (¥0.024) | Qianfan |

The contrast in cost between ERNIE 5.0 and earlier models such as ERNIE 4.5 Turbo underscores Baidu’s strategy to differentiate between high-volume, low-cost models and high-capability models designed for complex tasks and multimodal reasoning.
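
To make that contrast concrete, here is a minimal sketch that converts the per-1K-token list prices quoted above into per-1M-token prices and computes how many times more ERNIE 5.0 costs than ERNIE 4.5 Turbo. The dictionary layout and function names are illustrative choices, not part of any Baidu tooling; only the dollar figures come from the article.

```python
# USD list prices per 1K tokens, copied from the Qianfan figures quoted above.
PRICES_PER_1K = {
    "ERNIE 5.0": {"input": 0.00085, "output": 0.0034},
    "ERNIE 4.5 Turbo": {"input": 0.00011, "output": 0.00045},
}


def per_million(price_per_1k: float) -> float:
    """Convert a per-1K-token price into a per-1M-token price."""
    return price_per_1k * 1000


def price_ratio(premium: str, budget: str, direction: str) -> float:
    """Return how many times more the premium model costs for one direction."""
    return PRICES_PER_1K[premium][direction] / PRICES_PER_1K[budget][direction]


if __name__ == "__main__":
    for model, price in PRICES_PER_1K.items():
        print(
            f"{model}: ${per_million(price['input']):.2f} in / "
            f"${per_million(price['output']):.2f} out per 1M tokens"
        )
    for direction in ("input", "output"):
        ratio = price_ratio("ERNIE 5.0", "ERNIE 4.5 Turbo", direction)
        print(f"{direction} premium: {ratio:.2f}x")
```

On these list prices, ERNIE 5.0 costs roughly 7.7 times as much as ERNIE 4.5 Turbo for input tokens and roughly 7.6 times as much for output tokens.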

Compared to U.S. alternatives, ERNIE 5.0 remains mid-range in pricing:

| Model | Input (/1M tokens) | Output (/1M tokens) | Source |
|---|---|---|---|
| GPT-5.1 | $1.25 | $10.00 | OpenAI |
| ERNIE 5.0 | $0.85 | $3.40 | Qianfan |
| ERNIE 4.5 Turbo (ex.) | $0.11 | $0.45 | Qianfan |
| Claude Opus 4.1 | $15.00 | $75.00 | Anthropic |
| Gemini 2.5 Pro | $1.25 (≤200k) / $2.50 (>200k) | $10.00 (≤200k) / $15.00 (>200k) | Google Vertex AI Pricing |
| Grok 4 (grok-4-0709) | $3.00 | $15.00 | xAI API |
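
As a rough illustration of what these list prices mean per request, the sketch below estimates the cost of a single call for each model. The workload of 10,000 input tokens and 2,000 output tokens is hypothetical, and the code assumes that the 200k threshold in the Gemini 2.5 Pro row refers to the size of the prompt in a single request; the article does not spell that out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD list price per 1M tokens."""
    input_per_m: float
    output_per_m: float


# Flat-rate figures taken from the comparison above.
FLAT_PRICES = {
    "GPT-5.1": Price(1.25, 10.00),
    "ERNIE 5.0": Price(0.85, 3.40),
    "ERNIE 4.5 Turbo": Price(0.11, 0.45),
    "Claude Opus 4.1": Price(15.00, 75.00),
    "Grok 4": Price(3.00, 15.00),
}


def gemini_25_pro_price(input_tokens: int) -> Price:
    """Pick the Gemini 2.5 Pro tier.

    Assumption: the 200k threshold applies to the prompt size of one request.
    """
    if input_tokens <= 200_000:
        return Price(1.25, 10.00)
    return Price(2.50, 15.00)


def request_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given per-1M-token prices."""
    return (
        input_tokens * price.input_per_m + output_tokens * price.output_per_m
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    input_tokens, output_tokens = 10_000, 2_000
    costs = {
        name: request_cost(price, input_tokens, output_tokens)
        for name, price in FLAT_PRICES.items()
    }
    costs["Gemini 2.5 Pro"] = request_cost(
        gemini_25_pro_price(input_tokens), input_tokens, output_tokens
    )
    for name, cost in sorted(costs.items(), key=lambda item: item[1]):
        print(f"{name:<16} ${cost:.4f}")
```

For this hypothetical request, ERNIE 5.0 comes to about $0.0153 and GPT-5.1 to about $0.0325. The gap comes mostly from output pricing, where ERNIE 5.0 is listed at $3.40 per 1M tokens against $10.00.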

Global Expansion: Products and Platforms

In tandem with the model release, Baidu is expanding internationally:

  • GenFlow 3.0, now with 20M+ users, is the company’s largest general-purpose AI agent and features enhanced memory and multimodal task handling.

  • Famou, a self-evolving agent capable of dynamically solving complex problems, is now commercially available via invite.

  • MeDo, the international version of Baidu’s no-code builder Miaoda, is live globally via medo.dev.

  • Oreate, a productivity workspace with document, slide, image, video, and podcast support, has reached over 1.2M users worldwide.

Baidu’s digital human platform, already rolled out in Brazil, is also part of the global push. According to company data, 83% of livestreamers during this year’s “Double 11” shopping event in China used Baidu’s digital human tech, contributing to a 91% increase in gross merchandise volume (GMV).

Meanwhile, Baidu’s autonomous ride-hailing service Apollo Go has surpassed 17 million rides, operating driverless fleets in 22 cities and claiming the title of the world’s largest robotaxi network.

Open-Source Vision-Language Model Garners Industry Attention

Two days before the flagship ERNIE 5.0 event, Baidu also released an open-source multimodal model under the Apache 2.0 license: ERNIE-4.5-VL-28B-A3B-Thinking.

As reported by my colleague Michael Nuñez at VentureBeat, the model activates just 3 billion parameters while maintaining a total of 28 billion, using a Mixture-of-Experts (MoE) architecture for efficient inference.
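
For readers unfamiliar with the idea, the following is a generic, toy sketch of top-k Mixture-of-Experts routing in plain Python. It is not Baidu's architecture: the number of experts, the value of k, the gate scores, and the scalar "experts" are all hypothetical, chosen only to show why a model can hold many parameters while running only a few of them for a given input.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k):
    """Weighted sum of the selected experts' outputs.

    Experts that are not selected are never called, which is where the
    compute savings come from.
    """
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


def active_fraction(active_params, total_params):
    """Share of parameters that take part in one forward pass."""
    return active_params / total_params


if __name__ == "__main__":
    # Hypothetical toy experts: each one just scales its input.
    experts = [lambda x, scale=s: scale * x for s in (1.0, 2.0, 3.0, 4.0)]
    gate_scores = [0.1, 2.0, 0.3, 1.5]  # hypothetical router outputs
    print(route_top_k(gate_scores, k=2))
    print(moe_forward(1.0, experts, gate_scores, k=2))
    # Parameter counts as reported in the article.
    print(f"{active_fraction(3e9, 28e9):.1%} of parameters active")
```

With the reported figures, about 10.7% of the model's parameters (3 billion of 28 billion) are active at a time.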

Key technical innovations include:

  • “Thinking with Images”, which enables dynamic zoom-based visual analysis

  • Support for chart interpretation, document understanding, visual grounding, and temporal awareness in video

  • Runtime on a single 80GB GPU, making it accessible to mid-sized organizations

  • Full compatibility with Transformers, vLLM, and Baidu’s FastDeploy toolkits
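
The single-GPU claim in the list above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight memory only, and it assumes 16-bit (bf16) weights, which the article does not state. It also assumes that all 28 billion parameters stay resident in GPU memory even though only 3 billion are active at a time, which is typical for Mixture-of-Experts inference without offloading.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for model weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def headroom_gb(num_params: float, dtype: str, gpu_gb: float) -> float:
    """GPU memory left for activations and caches after loading weights.

    A negative result means the weights alone do not fit.
    """
    return gpu_gb - weight_memory_gb(num_params, dtype)


if __name__ == "__main__":
    total_params = 28e9  # total parameter count reported in the article
    for dtype in ("fp32", "bf16", "int8"):
        print(
            f"{dtype}: {weight_memory_gb(total_params, dtype):.0f} GB weights, "
            f"{headroom_gb(total_params, dtype, 80):.0f} GB headroom on an 80GB GPU"
        )
```

Under the bf16 assumption the weights take about 56 GB, leaving roughly 24 GB on an 80GB card for activations and caches. At full 32-bit precision the weights alone would need about 112 GB and would not fit.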

This release adds pressure on closed-source competitors. With Apache 2.0 licensing, ERNIE-4.5-VL-28B-A3B-Thinking becomes a viable foundation model for commercial applications without licensing restrictions — something few high-performing models in this class offer.

Community Feedback and Baidu’s Response

Following the launch of ERNIE 5.0, developer and AI evaluator Lisan al Gaib (@scaling01) posted a mixed review on X. While initially impressed by the model’s benchmark performance, they reported a persistent issue where ERNIE 5.0 would repeatedly invoke tools — even when explicitly instructed not to — during SVG generation tasks.

“ERNIE 5.0 benchmarks looked insane until I tested it… unfortunately it’s RL braindamaged or they have a serious issue with their chat platform / system prompt,” Lisan wrote.

Within hours, Baidu’s developer-focused support account, @ErnieforDevs, responded:

“Thanks for the feedback! It’s a known bug — certain syntax can consistently trigger it. We’re working on a fix. You can try rephrasing or changing the prompt to avoid it for now.”

The quick turnaround reflects Baidu’s increasing emphasis on developer communication, especially as it courts international users through both proprietary and open-source offerings.

Outlook for Baidu and its ERNIE foundational LLM family

Baidu’s ERNIE 5.0 marks a strategic escalation in the global foundation model race. With performance claims that put it on par with the most advanced systems from OpenAI and Google, and a mix of premium pricing and open-access alternatives, Baidu is signaling its ambition to become not just a domestic AI leader, but a credible global infrastructure provider.

At a time when enterprise AI users are increasingly demanding multimodal performance, flexible licensing, and deployment efficiency, Baidu’s two-track approach—premium hosted APIs and open-source releases—may broaden its appeal across both corporate and developer communities.

Whether the company’s performance claims hold up under third-party testing remains to be seen. But in a landscape shaped by rising costs, model complexity, and compute bottlenecks, ERNIE 5.0 and its supporting ecosystem give Baidu a competitive position in the next wave of AI deployment.


