mGrowTech

From Fine-Tuning to Production: A Scalable Embedding Pipeline with Dataflow

by Josh
September 5, 2025

The world of AI is moving at an exciting pace, and embeddings are at the core of many modern applications like semantic search and Retrieval Augmented Generation (RAG). Today, we’re excited to discuss how you can leverage Google’s new highly efficient, 308M parameter open embedding model, EmbeddingGemma. While its small size makes it perfect for on-device applications, this same efficiency unlocks powerful new possibilities for the cloud, especially when it comes to customization through fine-tuning. We’ll show you how to use EmbeddingGemma with Google Cloud’s Dataflow and vector databases like AlloyDB to build a scalable, real-time knowledge ingestion pipeline.

The power of embeddings and Dataflow

Embeddings are numerical vector representations of data that capture the underlying relationships between words and concepts. They are the cornerstone of applications that need to understand information on a deeper, conceptual level, from searching for documents that are semantically similar to a query to providing relevant context for Large Language Models (LLMs) in RAG systems.
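To make "semantically similar" concrete, here is a minimal sketch of how similarity between embeddings is commonly measured with cosine similarity. The three-dimensional vectors are made up for illustration; real embeddings come from a model and have hundreds of dimensions. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors, NOT output from any real model.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def most_similar(query_word, embeddings):
    """Return the other word whose vector is closest to query_word's."""
    query = embeddings[query_word]
    candidates = (w for w in embeddings if w != query_word)
    return max(candidates, key=lambda w: cosine_similarity(query, embeddings[w]))


print(most_similar("cat", toy_embeddings))
```

A vector database performs essentially this comparison, at scale and with indexing, when it answers a semantic search query.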

To power these applications, you need a robust knowledge ingestion pipeline that can process unstructured data, convert it into embeddings, and load it into a specialized vector database. This is where Dataflow can help by encapsulating these steps into a single managed pipeline.

Using a small, highly efficient open model like EmbeddingGemma at the core of your pipeline makes the entire process self-contained, which can simplify management by eliminating the need for external network calls to other services for the embedding step. Because it’s an open model, it can be hosted entirely within Dataflow. This provides the confidence to securely process large-scale, private datasets.

Beyond these operational benefits, EmbeddingGemma is also fine-tunable, allowing you to customize it for your specific data embedding needs; you can find a fine-tuning example here. Quality is just as important as scalability, and EmbeddingGemma excels here as well. It is the highest-ranking text-only multilingual embedding model under 500M parameters on the Massive Text Embedding Benchmark (MTEB) Multilingual leaderboard.

Dataflow is a fully managed, autoscaling platform for unified batch and streaming data processing. By building a model like EmbeddingGemma directly into a Dataflow pipeline, you gain several advantages:

  • Efficiency from data locality: Processing happens on the Dataflow workers, eliminating the need for remote procedure calls (RPC) to a separate inference service and avoiding problems from quotas and autoscaling multiple systems together. Your whole workflow can be bundled into a single set of workers, reducing your resource footprint.
  • Unified system: A single system handles autoscaling, observability, and monitoring, reducing your operational overhead.
  • Scalability and simplicity: Dataflow automatically scales your pipeline up or down based on demand, and Apache Beam’s transforms reduce boilerplate code.

Building the ingestion pipeline with Dataflow ML

A typical knowledge ingestion pipeline consists of four phases: reading from a data source, preprocessing the data, generating embeddings, and writing to a vector database.
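The four phases can be sketched without any framework, just to show how data moves between them. This is an illustration under stated assumptions, not the Dataflow implementation: the hard-coded source, the hash-based `fake_embed` stand-in, and the dict used as a "database" are all hypothetical placeholders.

```python
import hashlib
import re


def read_source():
    """Phase 1: read raw records (a hard-coded list stands in for a real source)."""
    return ["  Dataflow   scales pipelines. ", "", "Embeddings capture meaning."]


def preprocess(text):
    """Phase 2: collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def fake_embed(text, dim=4):
    """Phase 3 stand-in: a deterministic hash-based vector, NOT a real model."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def write_to_store(store, text, vector):
    """Phase 4: write to a dict that stands in for a vector database."""
    store[text] = vector


def run_pipeline():
    store = {}
    for raw in read_source():
        text = preprocess(raw)
        if not text:
            continue  # drop records that are empty after cleaning
        write_to_store(store, text, fake_embed(text))
    return store


print(sorted(run_pipeline()))
```

In a real pipeline each phase becomes a transform, so the runner can parallelize and scale the phases rather than looping over records in a single process.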

Dataflow's MLTransform

With Dataflow’s ‘MLTransform’, a powerful ‘PTransform’ for data preparation, this entire workflow can be implemented in just a few lines of code.

Generating Gemma Embeddings with MLTransform

Let’s walk through how to use the new Gemma model to generate text embeddings. This example, adapted from the EmbeddingGemma notebook, shows how to configure MLTransform to use a Hugging Face model and then write the results to AlloyDB where the embeddings can be used for semantic search. Databases like AlloyDB allow us to combine this semantic search with an additional structured search to provide high quality and relevant results.
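As a rough illustration of combining structured and semantic search, the sketch below builds a parameterized SQL query that filters on a regular column and ranks by vector distance. It assumes the pgvector extension (which AlloyDB supports) and its `<=>` cosine-distance operator. The `documents` table, its column names, and the `%s` placeholder style (as used by drivers such as psycopg) are assumptions for this example, not something taken from the notebook.

```python
def to_vector_literal(vector):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in vector) + "]"


def build_hybrid_query(query_vector, category, limit=5):
    """Return (sql, params) combining a structured filter with vector ranking."""
    sql = (
        "SELECT id, content "
        "FROM documents "                     # hypothetical table name
        "WHERE category = %s "                # structured filter
        "ORDER BY embedding <=> %s::vector "  # pgvector cosine distance
        "LIMIT %s"
    )
    return sql, (category, to_vector_literal(query_vector), limit)


sql, params = build_hybrid_query([0.1, 0.2], "faq", limit=3)
print(sql)
print(params)
```

The query vector would come from embedding the user's search text with the same model used at ingestion time, so that both sides live in the same vector space.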

First, we define the name of the model we’ll use for embeddings along with a transform specifying the columns we want to embed and the type of model we’re using.

import tempfile
import apache_beam as beam
from apache_beam.ml.transforms.base import MLTransform
from apache_beam.ml.transforms.embeddings.huggingface import SentenceTransformerEmbeddings

# The new Gemma model for generating embeddings.
# To use your own fine-tuned model, change this path.
text_embedding_model_name = 'google/embeddinggemma-300m'

# Define the embedding transform with our Gemma model
embedding_transform = SentenceTransformerEmbeddings(
    model_name=text_embedding_model_name, columns=['x']
)


Once we’ve generated embeddings, we’ll pipe the output directly into our sink, which will usually be a vector database. To write these embeddings, we will define a config-driven VectorDatabaseWriteTransform.

In this case, we will use AlloyDB as our sink by passing in an AlloyDBVectorWriterConfig object. Dataflow supports writing to many vector databases, including AlloyDB, CloudSQL, and BigQuery, using just configuration objects.

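from apache_beam.ml.rag.ingestion.alloydb import AlloyDBVectorWriterConfig
from apache_beam.ml.rag.ingestion.base import VectorDatabaseWriteTransform

# Define the config used to write to AlloyDB.
# connection_config and table_name are assumed to be defined earlier
# (see the notebook for how they are constructed).
alloydb_writer_config = AlloyDBVectorWriterConfig(
    connection_config=connection_config,
    table_name=table_name
)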

# Build and run the pipeline
with beam.Pipeline() as pipeline:
  _ = (
      pipeline
      | "CreateData" >> beam.Create(content) # In production could be replaced by a transform to read from any source
      # MLTransform generates the embeddings
      | "Generate Embeddings" >> MLTransform(
          write_artifact_location=tempfile.mkdtemp()
      ).with_transform(embedding_transform)
      # The output is written to our vector database
      | 'Write to AlloyDB' >> VectorDatabaseWriteTransform(alloydb_writer_config)
  )


This simple yet powerful pattern allows you to process massive datasets in parallel, generate embeddings with the 308M-parameter EmbeddingGemma model, and populate your vector database, all within a single scalable, cost-efficient, managed pipeline.
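The pipeline above passes a variable named `content` to `beam.Create` without showing what it holds. A plausible shape, assuming each element is a dict whose key matches `columns=['x']` in the embedding transform, might look like the following. The sample sentences and the `validate_rows` helper are hypothetical and only illustrate the expected structure.

```python
# Hypothetical sample rows; the text is made up for illustration.
# Each element is a dict, and the key 'x' matches columns=['x'] in the
# embedding transform defined earlier.
content = [
    {"x": "Dataflow is a managed service for batch and streaming pipelines."},
    {"x": "EmbeddingGemma is a 308M parameter open embedding model."},
    {"x": "AlloyDB can store vectors alongside structured columns."},
]


def validate_rows(rows, column="x"):
    """Check that every row is a dict with a non-empty string in `column`."""
    for i, row in enumerate(rows):
        value = row.get(column) if isinstance(row, dict) else None
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"row {i} has no usable text in column {column!r}")
    return rows


validate_rows(content)
```

Checking the input shape before launching a job is a cheap way to catch a mismatched column name locally instead of on the workers.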

Get Started Today

By combining the latest Gemma models with the scalability of Dataflow and the vector search power of vector databases like AlloyDB, you can build sophisticated, next-generation AI applications with ease.

To learn more, explore the Dataflow ML documentation, especially documentation on preparing data and generating embeddings. You can also try a simple pipeline using EmbeddingGemma by following this notebook.

For large-scale, server-side applications, explore our state-of-the-art Gemini Embedding model via the Gemini API for maximum performance and capacity.

To learn more about EmbeddingGemma, read our launch announcement on the Google Developer blog.


