mGrowTech

Apigee Operator for Kubernetes and GKE Inference Gateway integration for Auth and AI/LLM policies

by Josh
September 26, 2025
in Google Marketing


No AI/Agents without APIs!

Many users interact with generative AI daily without realizing the crucial role of underlying APIs in making these powerful capabilities accessible. APIs unlock the power of generative AI by making models available to both automated agents and human users. Complex business processes leveraged internally and externally are built by connecting multiple APIs in agentic workflows.

GKE Inference Gateway

The Google Kubernetes Engine (GKE) Inference Gateway is an extension to the GKE Gateway that provides optimized routing and load balancing for serving generative Artificial Intelligence (AI) workloads. It simplifies the deployment, management, and observability of AI inference workloads. The GKE Inference Gateway offers:

  • Optimized load balancing for inference: GKE Inference Gateway distributes requests to optimize AI model serving using metrics from model servers.
  • Dynamic LoRA fine-tuned model serving: GKE Inference Gateway supports serving dynamic LoRA (Low-Rank Adaptation) fine-tuned models on a common accelerator, reducing the number of GPUs and TPUs required to serve models through multiplexing.
  • Optimized autoscaling for inference: The GKE Horizontal Pod Autoscaler (HPA) uses model server metrics to autoscale.
  • Model-aware routing: The Gateway routes inference requests based on model names defined in OpenAI API specifications within your GKE cluster.
  • Model-specific serving Criticality: The GKE Inference Gateway lets you specify the serving Criticality of AI models to prioritize latency-sensitive requests over latency-tolerant batch inference jobs.
  • Integrated AI safety: GKE Inference Gateway integrates with Google Cloud Model Armor to apply AI safety checks to model prompts and responses.
  • Inference observability: GKE Inference Gateway provides observability metrics for inference requests, such as request rate, latency, errors, and saturation.
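
Several of the capabilities above (model-aware routing, metric-based load balancing, and serving Criticality) can be pictured with a small toy model. The sketch below is not the gateway's actual algorithm: the `Endpoint` class, the metric names, and the 0.9 saturation threshold are all assumptions made for illustration. It only shows the general idea of reading the model name from an OpenAI-style request body, choosing the least-loaded replica that serves that model, and shedding latency-tolerant traffic first when that replica is saturated.

```python
import json
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical snapshot of one model server replica (illustrative fields)."""
    name: str
    models: set           # base model and LoRA adapter names served by this replica
    queue_depth: int      # requests currently waiting on this replica
    kv_cache_util: float  # fraction of KV cache in use, 0.0 to 1.0


SATURATION_THRESHOLD = 0.9  # assumed cutoff, chosen only for this example


def pick_endpoint(body: str, endpoints: list, critical: bool):
    """Return the chosen Endpoint, or None if the request should be shed."""
    # OpenAI-style request bodies carry the target model in the "model" field.
    model = json.loads(body)["model"]
    candidates = [e for e in endpoints if model in e.models]
    if not candidates:
        return None
    # Prefer the shortest queue, then the least-used KV cache.
    best = min(candidates, key=lambda e: (e.queue_depth, e.kv_cache_util))
    # When even the best replica is saturated, only critical requests proceed.
    if best.kv_cache_util > SATURATION_THRESHOLD and not critical:
        return None
    return best


endpoints = [
    Endpoint("pod-a", {"llama-3", "support-lora"}, queue_depth=4, kv_cache_util=0.95),
    Endpoint("pod-b", {"llama-3"}, queue_depth=1, kv_cache_util=0.40),
]
request = json.dumps({"model": "support-lora", "messages": []})
print(pick_endpoint(request, endpoints, critical=True).name)
```

In this toy setup, a request for the LoRA adapter can only land on the replica that has it loaded, while a request for the base model goes to whichever replica reports the lighter load.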

Leveraging the GCPTrafficExtension

The challenge

Most enterprise customers using the GKE Inference Gateway want to secure and optimize their agentic and AI workloads. They also want to publish and monetize their agentic APIs while using the high-quality API governance features that Apigee offers as part of their agentic API commercialization strategy.

The solution

GKE Inference Gateway solves this challenge through the introduction of the GCPTrafficExtension resource, enabling the GKE Gateway to make a “sideways” call to a policy decision point (PDP) through the service extension (or ext-proc) mechanism.

The Apigee Operator for Kubernetes leverages this service extension mechanism to enforce Apigee policies on API traffic flowing through the GKE Inference Gateway. This seamless integration provides GKE Inference Gateway users with the benefits of Apigee’s API governance.
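
The "sideways" call pattern can be sketched in a few lines. The example below is a simplified in-process model, not the ext-proc protocol or Apigee's policy engine: the `PolicyDecisionPoint` class, the `x-api-key` header name, and the status codes are assumptions made for illustration. It shows the gateway consulting a decision point before it forwards a request to the model backend.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    status: int
    reason: str = ""


class PolicyDecisionPoint:
    """Stand-in for a policy engine; this one only validates an API key."""

    def __init__(self, valid_keys):
        self.valid_keys = set(valid_keys)

    def check(self, headers: dict) -> Decision:
        key = headers.get("x-api-key")  # header name is an assumption
        if key is None:
            return Decision(False, 401, "missing API key")
        if key not in self.valid_keys:
            return Decision(False, 403, "unknown API key")
        return Decision(True, 200)


def handle(headers: dict, body: str, pdp: PolicyDecisionPoint, backend):
    """Gateway logic: ask the PDP first, forward only if it allows the request."""
    decision = pdp.check(headers)  # the "sideways" call
    if not decision.allowed:
        return decision.status, decision.reason
    return 200, backend(body)


def fake_backend(body: str) -> str:
    """Placeholder for the inference backend behind the gateway."""
    return f"completion for: {body}"


pdp = PolicyDecisionPoint(valid_keys=["demo-key"])
print(handle({"x-api-key": "demo-key"}, "hello", pdp, fake_backend))
print(handle({}, "hello", pdp, fake_backend))
```

The point of the design is that the policy decision stays outside the gateway's own routing code, so policies can change without touching the inference path.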

The GKE Inference Gateway and the Apigee Operator for Kubernetes work together through the following steps:

  • Provision Apigee: The GKE Inference Gateway administrator provisions an Apigee instance on Google Cloud.
  • Install the Apigee Operator for Kubernetes: The administrator installs the Apigee Operator for Kubernetes within their GKE cluster and connects it to the newly provisioned Apigee instance.
  • Create an ApigeeBackendService: An ApigeeBackendService resource is created. This resource acts as a proxy for the Apigee dataplane.
  • Apply the Traffic Extension: The ApigeeBackendService is then referenced as the backendRef within a GCPTrafficExtension.
  • Enforce Policies: The GCPTrafficExtension is applied to the GKE Inference Gateway, allowing Apigee to enforce policies on the API traffic flowing through the gateway.
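
The last three steps amount to wiring two Kubernetes resources together. The sketch below builds both manifests as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts. The resource kinds and the `backendRef` relationship come from the steps above; the API versions, the `apigeeEnv` field, the chain and extension names, and the overall spec layout are assumptions that should be checked against the CRDs installed in your cluster before use.

```python
import json


def backend_service(name: str, apigee_env: str) -> dict:
    """ApigeeBackendService manifest; apiVersion and spec fields are assumed."""
    return {
        "apiVersion": "apim.googleapis.com/v1",
        "kind": "ApigeeBackendService",
        "metadata": {"name": name},
        "spec": {"apigeeEnv": apigee_env},
    }


def traffic_extension(name: str, gateway_name: str, backend: dict) -> dict:
    """GCPTrafficExtension that references the backend service as its backendRef."""
    return {
        "apiVersion": "networking.gke.io/v1",  # assumed version
        "kind": "GCPTrafficExtension",
        "metadata": {"name": name},
        "spec": {
            # Attach the extension to the Gateway that fronts the inference pool.
            "targetRefs": [{
                "group": "gateway.networking.k8s.io",
                "kind": "Gateway",
                "name": gateway_name,
            }],
            "extensionChains": [{
                "name": "apigee-chain",  # hypothetical name
                "extensions": [{
                    "name": "apigee-policies",  # hypothetical name
                    "backendRef": {
                        "group": backend["apiVersion"].split("/")[0],
                        "kind": backend["kind"],
                        "name": backend["metadata"]["name"],
                    },
                }],
            }],
        },
    }


# All names below are placeholders.
backend = backend_service("apigee-backend", apigee_env="my-apigee-env")
extension = traffic_extension("apigee-extension", "inference-gateway", backend)
print(json.dumps([backend, extension], indent=2))
```

Deriving the `backendRef` from the backend manifest, instead of typing the name twice, keeps the two resources from drifting apart.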

Apigee Operator for Kubernetes: API management for LLMs

Apigee provides a comprehensive API management layer for traditional transactional APIs and Large Language Models (LLMs) across Google Cloud, other public clouds, and on-premises infrastructure. The platform offers a powerful policy engine, full API lifecycle management, and advanced AI/ML-powered analytics. Apigee is recognized as a Leader for API management in the Gartner Magic Quadrant, serving large enterprises with complex API needs.

Through this new integration with GKE Inference Gateway, GKE users can use Apigee's full suite of features to manage, govern, and monetize their AI workloads through APIs. This includes the ability for API producers to package APIs into API Products that developers can discover through self-service developer portals. Users also gain access to Apigee's value-added services, such as API security and detailed API analytics.

With the integration, GKE users can access Apigee policies governing:

  • API keys
  • Quotas
  • Rate limiting
  • Google access tokens
  • Key value stores
  • OpenAPI spec validation
  • Traffic spikes
  • Custom JavaScript
  • Response caching
  • External service callouts

The Apigee Operator for Kubernetes used in this integration also supports admin template rules, letting organization administrators enforce policy rules across their organization. For example, an organization admin can require that certain policies be applied to all APIs, or specify a list of policies that can’t be used with the organization’s APIs.
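
As a rough illustration of how such rules might behave, the sketch below checks the policies attached to one API against a required list and a denied list. It is a toy model, not the operator's template rule format: the function, the rule dictionary, and the policy names are all hypothetical.

```python
def check_policies(applied, required=(), denied=()):
    """Return a list of rule violations for one API's applied policies."""
    applied_set = set(applied)
    violations = []
    # Every policy the organization requires must be present.
    for name in sorted(set(required) - applied_set):
        violations.append(f"missing required policy: {name}")
    # No policy on the organization's deny list may be used.
    for name in sorted(applied_set & set(denied)):
        violations.append(f"policy not allowed: {name}")
    return violations


# Hypothetical organization-wide rules and policy names.
rules = {"required": ["api-key"], "denied": ["custom-javascript"]}
print(check_policies(["quota", "custom-javascript"], **rules))
print(check_policies(["api-key", "quota"], **rules))
```

An empty result means the API complies with the organization's rules; anything else can be surfaced to the API producer before the policies take effect.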

Future plans include support for Apigee AI policies governing:

  • Model Armor security
  • Semantic caching
  • Token counting and enforcement
  • Prompt-based model routing

No AI without APIs – Reprise

By leveraging Apigee’s best-in-class API management and security capabilities through the GKE Inference Gateway, enterprises can now unify their AI serving and API governance layers. With Apigee’s full-featured API management platform at your disposal, you can focus on your core mission: running your inference engine on GKE to take advantage of the best-in-class AI infrastructure available in public clouds.


