• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Wednesday, October 8, 2025
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Google Marketing

Apigee Operator for Kubernetes and GKE Inference Gateway integration for Auth and AI/LLM policies

Josh by Josh
September 26, 2025
in Google Marketing
0
Apigee Operator for Kubernetes and GKE Inference Gateway integration for Auth and AI/LLM policies
0
SHARES
0
VIEWS
Share on FacebookShare on Twitter


No AI/Agents without APIs!

Many users interact with generative AI daily without realizing the crucial role of underlying APIs in making these powerful capabilities accessible. APIs unlock the power of generative AI by making models available to both automated agents and human users. Complex business processes leveraged internally and externally are built by connecting multiple APIs in agentic workflows.

GKE Inference Gateway

The Google Kubernetes Engine (GKE) Inference Gateway is an extension to the GKE Gateway that provides optimized routing and load balancing for serving generative Artificial Intelligence (AI) workloads. It simplifies the deployment, management, and observability of AI inference workloads. The GKE Inference Gateway offers:

  • Optimized load balancing for inference: GKE Inference Gateway distributes requests to optimize AI model serving using metrics from model servers.
  • Dynamic LoRA fine-tuned model serving: GKE Inference Gateway supports serving dynamic LoRA (Low-Rank Adaptation) fine-tuned models on a common accelerator, reducing the number of GPUs and TPUs required to serve models through multiplexing.
  • Optimized autoscaling for inference: The GKE Horizontal Pod Autoscaler (HPA) uses model server metrics to autoscale.
  • Model-aware routing: The Gateway routes inference requests based on model names defined in OpenAI API specifications within your GKE cluster.
  • Model-specific serving Criticality: The GKE Inference Gateway lets you specify the serving Criticality of AI models to prioritize latency-sensitive requests over latency-tolerant batch inference jobs.
  • Integrated AI safety: GKE Inference Gateway integrates with Google Cloud Model Armor to apply AI safety checks to model prompts and responses.
  • Inference observability: GKE Inference Gateway provides observability metrics for inference requests, such as request rate, latency, errors, and saturation.

Leveraging the GCPTrafficExtension

The challenge

Most enterprise customers using the GKE Inference Gateway would like to secure and optimize their agentic/AI workloads. They want to publish and monetize their Agentic APIs, while accessing the high quality API governance features offered by Apigee as part of their Agentic API commercialization strategy.

The solution

GKE Inference Gateway solves this challenge through the introduction of the GCPTrafficExtension resource, enabling the GKE Gateway to make a “sideways” call to a policy decision point (PDP) through the service extension (or ext-proc) mechanism.

The Apigee Operator for Kubernetes leverages this service extension mechanism to enforce Apigee policies on API traffic flowing through the GKE Inference Gateway. This seamless integration provides GKE Inference Gateway users with the benefits of Apigee’s API governance.

The GKE Inference Gateway and Apigee Apigee Operator for Kubernetes work together through the following steps:

  • Provision Apigee: The GKE Inference Gateway administrator provisions an Apigee instance on Google Cloud.
  • Install the Apigee Operator for Kubernetes: The administrator installs the Apigee Operator for Kubernetes within their GKE cluster and connects it to the newly provisioned Apigee instance.
  • Create an ApigeeBackendService: An ApigeeBackendService resource is created. This resource acts as a proxy for the Apigee dataplane.
  • Apply the Traffic Extension: The ApigeeBackendService is then referenced as the backendRef within a GCPTrafficExtension.
  • Enforce Policies: The GCPTrafficExtension is applied to the GKE Inference Gateway, allowing Apigee to enforce policies on the API traffic flowing through the gateway.

Apigee Operator for Kubernetes: API management for LLMs

Apigee provides a comprehensive API management layer for traditional transactional APIs and Large Language Models (LLMs) across Google Cloud, other public clouds, and on-premise infrastructure. This platform offers a powerful policy engine, full API lifecycle management, and advanced AI/ML-powered analytics. Apigee is recognized as a Leader for API management in the Gartner Magic Quadrant, serving large enterprises with complex API needs.

Through this new integration with GKE Inference Gateway, GKE users can leverage Apigee’s full suite of features to manage, govern, and monetize their AI workload through APIs. This includes the ability for API producers to package APIs into API Products available to developers through self-service developer portals. Users also gain access to Apigee’s value-added services, such as API security and detailed API analytics.

With the integration, GKE users can access Apigee policies governing:

  • API keys
  • Quotas
  • Rate limiting
  • Google access tokens
  • Key value stores
  • OpenAPI spec validation
  • Traffic spikes
  • Custom javascript
  • Response caching
  • External service callouts

The Apigee Operator for Kubernetes used in this integration also supports admin template rules, letting organization administrators enforce policy rules across their organization. For example, an organization admin can require that certain policies be applied to all APIs, or specify a list of policies that can’t be used with the organization’s APIs.

Future plans include support for Apigee AI policies governing:

  • Model Armor security
  • Semantic caching
  • Token counting and enforcement
  • Prompt-based model routing

No AI without APIs – Reprise

By leveraging Apigee’s best-in-class API management and security capabilities through the GKE Inference Gateway, enterprises can now unify their AI serving and API governance layers. With Apigee’s full-featured API management platform at your disposal, you can focus on your core mission: running your inference engine on GKE to take advantage of the best-in-class AI infrastructure available in public clouds.



Source_link

READ ALSO

Announcing the Genkit Extension for Gemini CLI

Gemini CLI extensions let you customize your command line

Related Posts

Announcing the Genkit Extension for Gemini CLI
Google Marketing

Announcing the Genkit Extension for Gemini CLI

October 8, 2025
Gemini CLI extensions let you customize your command line
Google Marketing

Gemini CLI extensions let you customize your command line

October 8, 2025
Big Tech is ‘donating’ to Trump’s ‘nonprofits’ 
Google Marketing

Big Tech is ‘donating’ to Trump’s ‘nonprofits’ 

October 8, 2025
Google AI Plus comes to 36 more countries around the world
Google Marketing

Google AI Plus comes to 36 more countries around the world

October 8, 2025
Google plans to launch new smart displays
Google Marketing

Google plans to launch new smart displays

October 8, 2025
AI Mode in Google Search expands to more than 40 new areas
Google Marketing

AI Mode in Google Search expands to more than 40 new areas

October 7, 2025
Next Post
A Detailed Breakdown of Don Julio’s Creator-first 194구 Campaign

A Detailed Breakdown of Don Julio’s Creator-first 194구 Campaign

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

POPULAR NEWS

Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
7 Best EOR Platforms for Software Companies in 2025

7 Best EOR Platforms for Software Companies in 2025

June 21, 2025

EDITOR'S PICK

Grow a Garden Shroomie Pet Wiki

Grow a Garden Shroomie Pet Wiki

September 8, 2025
Website Maintenance Services

Website Maintenance Services

June 29, 2025
Pixel 10 introduces new chip, Tensor G5

Pixel 10 introduces new chip, Tensor G5

August 24, 2025
Re-Designing Your SEO Career – Moz

Re-Designing Your SEO Career – Moz

May 27, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • MIT Schwarzman College of Computing and MBZUAI launch international collaboration to shape the future of AI | MIT News
  • How Enterprise AI Applications Are Transforming Businesses?
  • Mastercard launches Small Business Navigator in Canada to Enable Small Business Resilience
  • Announcing the Genkit Extension for Gemini CLI
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions

Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?