ScaleOps' new AI Infra Product slashes GPU costs for self-hosted enterprise LLMs by 50% for early adopters

By Josh · November 20, 2025 · Technology And Software

ScaleOps has expanded its cloud resource management platform with a new product aimed at enterprises operating self-hosted large language models (LLMs) and GPU-based AI applications.

The AI Infra Product, announced today, extends the company’s existing automation capabilities to address a growing need for efficient GPU utilization, predictable performance, and reduced operational burden in large-scale AI deployments.

ScaleOps said the system is already running in enterprise production environments and delivering major efficiency gains for early adopters, reducing GPU costs by between 50% and 70%. The company does not publicly list enterprise pricing for this solution; instead, it invites interested customers to request a custom quote based on the size and needs of their operation.

In explaining how the system behaves under heavy load, Yodar Shafrir, CEO and Co-Founder of ScaleOps, said in an email to VentureBeat that the platform uses “proactive and reactive mechanisms to handle sudden spikes without performance impact,” noting that its workload rightsizing policies “automatically manage capacity to keep resources available.”

He added that minimizing GPU cold-start delays was a priority, emphasizing that the system “ensures instant response when traffic surges,” particularly for AI workloads where model load times are substantial.

Expanding Resource Automation to AI Infrastructure

Enterprises deploying self-hosted AI models face performance variability, long load times, and persistent underutilization of GPU resources. ScaleOps positioned the new AI Infra Product as a direct response to these issues.

The platform allocates and scales GPU resources in real time and adapts to changes in traffic demand without requiring alterations to existing model deployment pipelines or application code.

According to ScaleOps, the system manages production environments for organizations including Wiz, DocuSign, Rubrik, Coupa, Alkami, Vantor, Grubhub, Island, Chewy, and several Fortune 500 companies.

The AI Infra Product introduces workload-aware scaling policies that proactively and reactively adjust capacity to maintain performance during demand spikes. The company stated that these policies reduce the cold-start delays associated with loading large AI models, which improves responsiveness when traffic increases.
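ScaleOps has not published the internals of these policies, but the general shape of a combined reactive and proactive capacity policy can be sketched in a few lines. In the sketch below, the function, thresholds, and simple moving-average forecast are illustrative assumptions rather than ScaleOps’ actual algorithm; it only shows why holding proactive headroom avoids the cold-start penalty that a purely reactive policy would incur.

```python
# Illustrative sketch only: a toy combined proactive + reactive replica policy
# for a GPU-backed model server. Thresholds, the moving-average forecast, and
# all names here are assumptions, not ScaleOps' actual algorithm.
import math
from dataclasses import dataclass

@dataclass
class ScalingDecision:
    desired_replicas: int
    reason: str

def decide_replicas(current_replicas: int,
                    gpu_util: float,          # observed mean GPU utilization, 0.0-1.0
                    recent_rps: list[float],  # recent requests-per-second samples
                    rps_per_replica: float,   # measured capacity of one replica
                    target_util: float = 0.7,
                    headroom: float = 1.2) -> ScalingDecision:
    # Reactive path: size capacity from observed utilization vs. the target.
    reactive = max(1, math.ceil(current_replicas * gpu_util / target_util))

    # Proactive path: forecast near-term traffic (here a simple moving average)
    # and keep enough warm replicas to absorb a surge without a cold start,
    # since loading large model weights can take minutes.
    forecast_rps = sum(recent_rps) / len(recent_rps) if recent_rps else 0.0
    proactive = max(1, math.ceil(forecast_rps * headroom / rps_per_replica))

    desired = max(reactive, proactive)
    return ScalingDecision(desired, "reactive" if reactive >= proactive else "proactive")

# Example: 4 replicas running hot at 90% utilization while traffic trends upward.
print(decide_replicas(4, 0.90, [120.0, 150.0, 180.0], rps_per_replica=40.0))
```

The design point is taking the maximum of the two paths: a purely reactive policy only adds replicas after utilization has already climbed, which for large models means riding out a long weight-loading cold start, while the proactive floor keeps warm capacity ahead of forecast demand.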

Technical Integration and Platform Compatibility

The product is designed for compatibility with common enterprise infrastructure patterns. It works across all Kubernetes distributions, major cloud platforms, on-premises data centers, and air-gapped environments. ScaleOps emphasized that deployment does not require code changes, infrastructure rewrites, or modifications to existing manifests.

Shafrir said the platform “integrates seamlessly into existing model deployment pipelines without requiring any code or infrastructure changes,” and he added that teams can begin optimizing immediately with their existing GitOps, CI/CD, monitoring, and deployment tooling.

Shafrir also addressed how the automation interacts with existing systems. He said the platform operates without disrupting workflows or creating conflicts with custom scheduling or scaling logic, explaining that the system “doesn’t change manifests or deployment logic” and instead enhances schedulers, autoscalers, and custom policies by incorporating real-time operational context while respecting existing configuration boundaries.

Performance, Visibility, and User Control

The platform provides full visibility into GPU utilization, model behavior, performance metrics, and scaling decisions at multiple levels, including pods, workloads, nodes, and clusters. While the system applies default workload scaling policies, ScaleOps noted that engineering teams retain the ability to tune these policies as needed.
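ScaleOps surfaces these metrics in its own console. For teams that want to cross-check GPU utilization independently, a common pattern on Kubernetes is to scrape NVIDIA’s dcgm-exporter with Prometheus and query it over the HTTP API, as in the sketch below; the Prometheus address and label layout are assumptions about a particular cluster, not part of the ScaleOps product.

```python
# Illustrative sketch: read per-pod GPU utilization from Prometheus using the
# DCGM_FI_DEV_GPU_UTIL metric exported by NVIDIA's dcgm-exporter. The
# Prometheus URL and the "pod" label are assumptions about the local setup.
import requests

PROMETHEUS_URL = "http://prometheus.monitoring.svc:9090"  # assumed in-cluster address

def gpu_utilization_by_pod() -> dict[str, float]:
    """Return average GPU utilization (0-100) per pod over the last 5 minutes."""
    query = "avg by (pod) (avg_over_time(DCGM_FI_DEV_GPU_UTIL[5m]))"
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query",
                        params={"query": query}, timeout=10)
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    return {r["metric"].get("pod", "<unknown>"): float(r["value"][1]) for r in results}

if __name__ == "__main__":
    for pod, util in sorted(gpu_utilization_by_pod().items()):
        print(f"{pod}: {util:.1f}% GPU utilization")
```

Comparing a per-pod view like this before and after enabling optimization is a simple way to verify consolidation claims against your own cluster.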

In practice, the company aims to reduce or eliminate the manual tuning that DevOps and AIOps teams typically perform to manage AI workloads. Installation is intended to require minimal effort, described by ScaleOps as a two-minute process using a single Helm flag, after which optimization can be enabled through a single action.

Cost Savings and Enterprise Case Studies

ScaleOps reported that early deployments of the AI Infra Product have achieved GPU cost reductions of 50–70% in customer environments. The company cited two examples, and a simplified savings calculation is sketched after the list:

  • A major creative software company operating thousands of GPUs averaged 20% utilization before adopting ScaleOps. The product increased utilization, consolidated underused capacity, and enabled GPU nodes to scale down. These changes reduced overall GPU spending by more than half. The company also reported a 35% reduction in latency for key workloads.

  • A global gaming company used the platform to optimize a dynamic LLM workload running on hundreds of GPUs. According to ScaleOps, the product increased utilization by a factor of seven while maintaining service-level performance. The customer projected $1.4 million in annual savings from this workload alone.
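ScaleOps has not disclosed the raw numbers behind these case studies, but the arithmetic linking utilization to spend is straightforward. The sketch below uses invented figures (GPU count, hourly rate, utilization levels) purely to illustrate how consolidating work onto fewer, better-utilized GPUs produces savings in the reported 50–70% range.

```python
# Illustrative back-of-the-envelope savings calculation with assumed numbers;
# these are not ScaleOps' or its customers' actual figures.
def annual_gpu_cost(gpus: int, hourly_rate: float) -> float:
    return gpus * hourly_rate * 24 * 365

# Assumed baseline: 2,000 GPUs at $2.50/hour averaging 20% utilization.
baseline_gpus = 2000
hourly_rate = 2.50
baseline_util = 0.20
useful_work = baseline_gpus * baseline_util   # "effective" GPUs of real work

# If consolidation raises average utilization to 50%, fewer GPUs carry the same load.
target_util = 0.50
needed_gpus = round(useful_work / target_util)  # 800 GPUs

before = annual_gpu_cost(baseline_gpus, hourly_rate)
after = annual_gpu_cost(needed_gpus, hourly_rate)
print(f"Before: ${before:,.0f}/yr  After: ${after:,.0f}/yr  Savings: {1 - after / before:.0%}")
# -> a 60% reduction, consistent with the 50-70% range ScaleOps reports.
```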

ScaleOps stated that the expected GPU savings typically outweigh the cost of adopting and operating the platform, and that customers with limited infrastructure budgets have reported fast returns on investment.

Industry Context and Company Perspective

The rapid adoption of self-hosted AI models has created new operational challenges for enterprises, particularly around GPU efficiency and the complexity of managing large-scale workloads. Shafrir described the broader landscape as one in which “cloud-native AI infrastructure is reaching a breaking point.”

“Cloud-native architectures unlocked great flexibility and control, but they also introduced a new level of complexity,” he said in the announcement. “Managing GPU resources at scale has become chaotic—waste, performance issues, and skyrocketing costs are now the norm. The ScaleOps platform was built to fix this. It delivers the complete solution for managing and optimizing GPU resources in cloud-native environments, enabling enterprises to run LLMs and AI applications efficiently, cost-effectively, and while improving performance.”

Shafrir added that the product brings together the full set of cloud resource management functions needed to manage diverse workloads at scale. The company positioned the platform as a holistic system for continuous, automated optimization.

A Unified Approach for the Future

With the addition of the AI Infra Product, ScaleOps aims to establish a unified approach to GPU and AI workload management that integrates with existing enterprise infrastructure.

The platform’s early performance metrics and reported cost savings suggest a focus on measurable efficiency improvements within the expanding ecosystem of self-hosted AI deployments.


