Sunday, April 26, 2026
mGrowTech
Own your AI: Learn how to fine-tune Gemma 3 270M and run it on-device

by Josh
October 9, 2025
in Google Marketing

Gemma is a collection of lightweight, state-of-the-art open models built from the same technology that powers our Gemini models. The models come in a range of sizes, and anyone can adapt and run them on their own infrastructure. This combination of performance and accessibility has led to over 250 million downloads and 85,000 published community variations for a wide range of tasks and domains.

You don’t need expensive hardware to create highly specialized, custom models. Gemma 3 270M’s compact size allows you to quickly fine-tune it for new use cases and then deploy it on-device, giving you flexibility over model development and full control of a powerful tool.

To show how simple this is, this post walks through an example of training your own model to translate text to emoji and testing it in a web app. You can even teach it the specific emojis you use in real life, resulting in a personal emoji generator. Try it out in the live demo.

We’ll walk you through the end-to-end process of creating a task-specific model in under an hour. You will learn how to:

  1. Fine-tune the model: Train Gemma 3 270M on a custom dataset to create a personal “emoji translator”
  2. Quantize and convert the model: Optimize the model for on-device inference, reducing its memory footprint to under 300MB
  3. Deploy in a web app: Run the model client-side in a simple web app using MediaPipe or Transformers.js

Step 1: Customize model behavior using fine-tuning

Out of the box, LLMs are generalists. If you ask Gemma to translate text to emoji, you might get more than you asked for, like conversational filler.

Prompt:
Translate the following text into a creative combination of 3-5 emojis: “what a fun party”

Model output (example):
Sure! Here is your emoji: 🥳🎉🎈

For our app, Gemma needs to output just emojis. While you could try complex prompt engineering, the most reliable way to enforce a specific output format and teach the model new knowledge is fine-tuning it on example data. So, to teach the model to use specific emojis, you would train it on a dataset containing text and emoji examples.

Models learn better the more examples you provide, so you can easily make your dataset more robust by prompting AI to generate different text phrases for the same emoji output. For fun, we did this with emojis we associate with pop songs and fandoms:

[Figure: creating a dataset for fine-tuning]

If you want the model to memorize specific emoji, provide more examples in the dataset.
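A minimal sketch of how such a dataset could be assembled: several text phrases are mapped to the same emoji output and flattened into one training row per phrase. The `text` and `emoji` field names, the sample phrases, and the JSON Lines output are assumptions for illustration; the dataset template linked from this post may use a different layout.

```javascript
// Hypothetical record shape: { text, emoji }. The real template's column names may differ.
function buildDataset(phrasesByEmoji) {
  const rows = [];
  for (const [emoji, phrases] of Object.entries(phrasesByEmoji)) {
    for (const phrase of phrases) {
      // One row per phrase, so emojis with more phrases get more training examples.
      rows.push({ text: phrase.trim(), emoji });
    }
  }
  return rows;
}

// Serialize as JSON Lines: one JSON object per line.
function toJsonl(rows) {
  return rows.map((row) => JSON.stringify(row)).join('\n');
}

// Illustrative examples only, not the dataset used in this post.
const examples = {
  '🥳🎉🎈': ['what a fun party', 'best birthday ever', "let's celebrate tonight"],
  '☕😴': ['i need coffee', 'so tired this morning'],
};

const dataset = buildDataset(examples);
console.log(toJsonl(dataset));
```

Adding more phrases under one emoji key is the simplest way to weight that emoji more heavily in training.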

Fine-tuning a model used to require massive amounts of VRAM. However, with Quantized Low-Rank Adaptation (QLoRA), a Parameter-Efficient Fine-Tuning (PEFT) technique, we only update a small number of weights. This drastically reduces memory requirements, allowing you to fine-tune Gemma 3 270M in minutes when using no-cost T4 GPU acceleration in Google Colab.
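To make "a small number of weights" concrete, here is a back-of-the-envelope sketch of the low-rank idea behind LoRA: instead of updating a full d × k weight matrix, training touches two small factors of shapes d × r and r × k. The dimensions below are illustrative placeholders, not Gemma 3 270M's actual layer sizes. QLoRA additionally keeps the frozen base weights in quantized form, which this arithmetic does not cover.

```javascript
// Trainable values when a d x k matrix update is factored into
// B (d x r) and A (r x k): r * d + r * k.
function loraTrainableParams(d, k, r) {
  return r * (d + k);
}

// Share of the full matrix that is actually trained.
function trainableFraction(d, k, r) {
  return loraTrainableParams(d, k, r) / (d * k);
}

// Illustrative sizes only (assumed, not taken from the Gemma architecture).
const d = 1024;
const k = 1024;
const rank = 8;

console.log(loraTrainableParams(d, k, rank)); // 16384
console.log(trainableFraction(d, k, rank));   // 0.015625
```

With these placeholder numbers, the adapter trains about 1.6% of the values in that one matrix, which is why optimizer state and gradients need far less memory.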

Get started with an example dataset or populate the template with your own emojis. You can then run the fine-tuning notebook to load the dataset, train the model, and test your new model’s performance against the original.
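One simple way to compare the two models is to measure how often each one returns emojis and nothing else. The sketch below is an assumption about how you might score outputs, not part of the notebook; the `isEmojiOnly` and `emojiOnlyRate` helpers are hypothetical, and the regular expression is a rough approximation that does not cover every emoji sequence (keycap emojis, for example).

```javascript
// Characters allowed in an "emoji-only" reply: pictographs, skin-tone modifiers,
// flag letters, the zero-width joiner, the emoji variation selector, and whitespace.
const EMOJI_ONLY = /^(?:\p{Extended_Pictographic}|\p{Emoji_Modifier}|\p{Regional_Indicator}|\u200D|\uFE0F|\s)+$/u;
// At least one actual emoji must be present (whitespace alone does not count).
const HAS_EMOJI = /\p{Extended_Pictographic}|\p{Regional_Indicator}/u;

function isEmojiOnly(text) {
  return HAS_EMOJI.test(text) && EMOJI_ONLY.test(text);
}

// Fraction of model outputs that contain only emojis.
function emojiOnlyRate(outputs) {
  if (outputs.length === 0) return 0;
  return outputs.filter(isEmojiOnly).length / outputs.length;
}

console.log(isEmojiOnly('🥳🎉🎈'));                           // true
console.log(isEmojiOnly('Sure! Here is your emoji: 🥳🎉🎈')); // false
```

Running `emojiOnlyRate` over the same prompts for the base and fine-tuned models gives a quick format-compliance number to track.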

Step 2: Quantize and convert the model for the web

Now that you have a custom model, what can you do with it? Since we usually use emojis on mobile devices or computers, it makes sense to deploy your model in an on-device app.

The original model, while small, is still over 1GB. To ensure a fast-loading user experience, we need to make it smaller. We can do this using quantization, a process that reduces the precision of the model’s weights (e.g., from 16-bit floating-point numbers to 4-bit integers). This significantly shrinks the file size with minimal impact on performance for many tasks.
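As a toy illustration of what reducing precision means, the sketch below applies the simplest scheme, symmetric quantization with one scale for the whole tensor, to a handful of weights. This is an assumed teaching example, not the method used by the conversion notebooks, which likely apply finer-grained scales.

```javascript
// Map floats onto the signed 4-bit integer range [-8, 7] using a single scale.
function quantizeInt4(weights) {
  const maxAbs = Math.max(0, ...weights.map((w) => Math.abs(w)));
  const scale = maxAbs > 0 ? maxAbs / 7 : 1; // avoid dividing by zero for all-zero input
  const q = weights.map((w) => Math.max(-8, Math.min(7, Math.round(w / scale))));
  return { q, scale };
}

// Recover approximate floats from the stored integers and the scale.
function dequantize({ q, scale }) {
  return q.map((v) => v * scale);
}

const original = [7, -3, 1.2, 0];
const packed = quantizeInt4(original);
console.log(packed.q);           // [ 7, -3, 1, 0 ]
console.log(dequantize(packed)); // [ 7, -3, 1, 0 ]
```

The value 1.2 comes back as 1: that rounding error is the precision traded away for a smaller file.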

[Figure: Gemma quantization for on-device use]

Smaller models result in a faster-loading app and better experience for end users.
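A rough size estimate shows why bit width matters for download time. The sketch assumes roughly 270 million parameters (inferred from the model name) and counts only the raw weight bytes; real files can differ because some tensors may be kept at higher precision and the file also carries metadata.

```javascript
// Raw weight storage: parameters x bits per weight, converted to bytes.
function estimateWeightBytes(paramCount, bitsPerWeight) {
  return (paramCount * bitsPerWeight) / 8;
}

const PARAM_COUNT = 270e6; // assumption based on the "270M" in the model name

for (const bits of [32, 16, 4]) {
  const megabytes = estimateWeightBytes(PARAM_COUNT, bits) / 1e6;
  console.log(`${bits}-bit weights: ~${megabytes.toFixed(0)} MB`);
}
// 32-bit weights: ~1080 MB
// 16-bit weights: ~540 MB
// 4-bit weights: ~135 MB
```

Under these assumptions the 4-bit estimate lands well below the 300MB target mentioned earlier, leaving headroom for tensors stored at higher precision.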

To get your model ready for a web app, quantize and convert it in a single step using either the LiteRT conversion notebook for use with MediaPipe or the ONNX conversion notebook for use with Transformers.js. These frameworks make it possible to run LLMs client-side in the browser by leveraging WebGPU, a modern web API that gives apps access to a local device’s hardware for computation, eliminating the need for complex server setups and per-call inference costs.
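Because both paths rely on WebGPU, it can help to check for it before downloading a model. The following is a minimal sketch using the standard `navigator.gpu` entry point; the helper names are hypothetical, and how your app responds when WebGPU is missing (a message, a fallback backend) is left as an assumption.

```javascript
// Synchronous check: does this environment expose the WebGPU entry point at all?
function hasWebGpuApi(nav) {
  return Boolean(nav && typeof nav === 'object' && 'gpu' in nav && nav.gpu);
}

// Asynchronous check: the API can exist while no usable adapter is available,
// in which case requestAdapter() resolves to null.
async function canUseWebGpu(nav = globalThis.navigator) {
  if (!hasWebGpuApi(nav)) return false;
  const adapter = await nav.gpu.requestAdapter();
  return adapter != null;
}

// Example usage in a page or worker:
// if (!(await canUseWebGpu())) { showUnsupportedMessage(); } // hypothetical UI helper
```

Passing the navigator object in as a parameter keeps the check easy to exercise outside a browser.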

Step 3: Run the model in the browser

You can now run your customized model directly in the browser! Download our example web app and change one line of code to plug in your new model.

Both MediaPipe and Transformers.js make this straightforward. Here’s an example of the inference task running inside the MediaPipe worker:

// Import the MediaPipe GenAI task classes
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

// Initialize the MediaPipe Task
const genai = await FilesetResolver.forGenAiTasks('https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai@latest/wasm');
const llmInference = await LlmInference.createFromOptions(genai, {
    baseOptions: { modelAssetPath: 'path/to/yourmodel.task' }
});

// Format the prompt and generate a response
const prompt = `Translate this text to emoji: what a fun party!`;
const response = await llmInference.generateResponse(prompt);

Once the model is cached on the user’s device, subsequent requests run locally with low latency, user data remains completely private, and your app functions even when offline.
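The snippet above covers MediaPipe. For the Transformers.js path, a comparable sketch might look like the following. The model id is a placeholder, the prompt wording mirrors the MediaPipe example, and the `device` and `dtype` values are assumptions that depend on how your model was converted; check them against the ONNX conversion notebook and the demo code before relying on them.

```javascript
// Hypothetical model id: replace with the repo or path holding your converted ONNX model.
const MODEL_ID = 'your-username/your-emoji-model';

// Wrap the user's text in a single-turn chat message.
function buildMessages(text) {
  return [{ role: 'user', content: `Translate this text to emoji: ${text}` }];
}

// For chat-style input the pipeline returns the whole conversation;
// the model's reply is the last message.
function lastReply(output) {
  const generated = output[0].generated_text;
  return Array.isArray(generated) ? generated.at(-1).content : generated;
}

async function translateToEmoji(text) {
  // Loaded lazily; in a browser this bare specifier needs a bundler or an import map.
  const { pipeline } = await import('@huggingface/transformers');
  const generator = await pipeline('text-generation', MODEL_ID, {
    device: 'webgpu', // assumed: run on the GPU via WebGPU
    dtype: 'q4',      // assumed: matches a 4-bit quantized export
  });
  const output = await generator(buildMessages(text), { max_new_tokens: 16 });
  return lastReply(output);
}
```

In a real app you would create the pipeline once and reuse it across requests rather than rebuilding it on every call.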

Love your app? Share it by uploading it to Hugging Face Spaces (just like the demo).

What’s next

You don’t have to be an AI expert or data scientist to create a specialized AI model. You can enhance Gemma model performance using relatively small datasets, and it takes minutes, not hours.

We hope that you’re inspired to create your own model variations. By using these techniques, you can build powerful AI applications that are not only customized for your needs but also deliver a superior user experience: one that is fast, private, and accessible to anyone, anywhere.

The complete source code and resources for this project are available to help you get started:

  • Fine-tune Gemma efficiently with QLoRA in Colab
  • Convert Gemma 3 270M for use with MediaPipe LLM Inference API in Colab
  • Convert Gemma 3 270M for use with Transformers.js in Colab
  • Download the demo code on GitHub
  • Explore more web AI demos from the Gemma Cookbook and chrome.dev
  • Learn more about the Gemma 3 family of models and their on-device capabilities


