On-device small language models with multimodality, RAG, and Function Calling

By Josh
June 6, 2025
in Google Marketing


Last year Google AI Edge introduced support for on-device small language models (SLMs) with four initial models on Android, iOS, and Web. Today, we are excited to expand support to over a dozen models including the new Gemma 3 and Gemma 3n models, hosted on our new LiteRT Hugging Face community.

Gemma 3n, available via Google AI Edge as an early preview, is Gemma’s first multimodal on-device small language model supporting text, image, video, and audio inputs. Paired with our new Retrieval Augmented Generation (RAG) and Function Calling libraries, you have everything you need to prototype and build transformative AI features fully on the edge.

[Video: Let users control apps with on-device SLMs and our new function calling library]

Broader model support

You can find our growing list of models to choose from in the LiteRT Hugging Face Community. Download any of these models and easily run them on-device with just a few lines of code. The models are fully optimized and converted for mobile and web. Full instructions on how to run these models can be found in our documentation and on each model card on Hugging Face.
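To make "a few lines of code" concrete, here is a minimal Kotlin sketch of loading and prompting one of these models on Android through the MediaPipe LLM Inference API. The model filename and path are placeholder assumptions, and exact option setters can vary between library versions, so treat this as an illustrative sketch rather than copy-paste-ready code.

import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Minimal sketch: load a LiteRT model bundle downloaded from the
// LiteRT Hugging Face community and run a single prompt on-device.
// The model path below is a placeholder assumption for this example.
fun runOnDeviceSlm(context: Context): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma3-1b-it-int4.task") // assumed location
        .setMaxTokens(512)
        .build()

    val llm = LlmInference.createFromOptions(context, options)
    val answer = llm.generateResponse("Summarize what LiteRT does in one sentence.")
    llm.close() // release inference resources when done
    return answer
}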

To customize any of these models, you fine-tune the base model and then convert and quantize it using the appropriate AI Edge libraries. We have a Colab showing every step needed to fine-tune and then convert Gemma 3 1B.

With the latest release of our quantization tools, we have new quantization schemes that allow for much higher-quality int4 post-training quantization. Compared to bf16, the default data type for many models, int4 quantization can reduce the size of language models by a factor of 2.5-4x while significantly decreasing latency and peak memory consumption.
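As a rough back-of-the-envelope check: bf16 stores each weight in 16 bits (2 bytes) while int4 stores it in 4 bits (0.5 bytes), so weight storage alone shrinks by about 4x; the lower end of the quoted 2.5-4x range presumably reflects tensors (such as embeddings) kept at higher precision plus quantization metadata.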


Gemma 3 1B & Gemma 3n

Earlier this year, we introduced Gemma 3 1B. At only 529MB, this model can run up to 2,585 tokens per second pre-fill on the mobile GPU, allowing it to process up to a page of content in under a second. Gemma 3 1B’s small footprint allows it to support a wide range of devices and limits the size of files an end user would need to download in their application.
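For context, a page of text is roughly 500 words, or on the order of 700 tokens; at 2,585 tokens per second of prefill, that works out to roughly a quarter of a second, comfortably under the one-second figure above.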

Today, we are thrilled to add an early preview of Gemma 3n to our collection of supported models. The 2B and 4B parameter variants will both support native text, image, video, and audio inputs. The text and image modalities are available on Hugging Face with audio to follow shortly.

[Video: Gemma 3n analyzing images fully on-device]

Gemma 3n is a great fit for enterprise use cases where developers have the full resources of the device available to them, allowing for larger models on mobile. Field technicians with no service could snap a photo of a part and ask a question. Workers in a warehouse or a kitchen could update inventory by voice while their hands are full.

Bringing context to conversations: On-device Retrieval Augmented Generation (RAG)

One of the most exciting new capabilities we're bringing to Google AI Edge is robust support for on-device Retrieval Augmented Generation (RAG). RAG lets you augment your small language model with data specific to your application, without the need for fine-tuning. From 1,000 pages of information or 1,000 photos, RAG can surface just the few most relevant pieces of data to feed to your model.


The AI Edge RAG library works with any of our supported small language models. Furthermore, it offers the flexibility to swap out any part of the RAG pipeline, enabling custom databases, chunking methods, and retrieval functions. The AI Edge RAG library is available today on Android, with more platforms to follow. This means your on-device generative AI applications can now be grounded in specific, user-relevant information, unlocking a new class of intelligent features.
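To make the pipeline shape concrete, the sketch below shows the core RAG pattern in Kotlin: chunk and embed your data, retrieve the most relevant chunks for a query, and prepend them to the prompt for the on-device model. Every type here is a hypothetical stand-in written for illustration, not the AI Edge RAG library's actual API; the library's documentation covers the real classes and the customization points (databases, chunking, retrieval) mentioned above.

import kotlin.math.sqrt

// Hypothetical embedding interface; the real library lets you plug in
// your own embedding model and vector store.
fun interface Embedder {
    fun embed(text: String): FloatArray
}

class TinyVectorStore(private val embedder: Embedder) {
    private val chunks = mutableListOf<Pair<String, FloatArray>>()

    // Index a chunk of application data (a paragraph, a photo caption, etc.).
    fun add(chunk: String) {
        chunks.add(chunk to embedder.embed(chunk))
    }

    // Return the k chunks most similar to the query by cosine similarity.
    fun retrieve(query: String, k: Int = 3): List<String> {
        val q = embedder.embed(query)
        return chunks.sortedByDescending { (_, v) -> cosine(q, v) }.take(k).map { it.first }
    }

    private fun cosine(a: FloatArray, b: FloatArray): Float {
        var dot = 0f; var na = 0f; var nb = 0f
        for (i in a.indices) {
            dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]
        }
        return dot / (sqrt(na) * sqrt(nb) + 1e-8f)
    }
}

// Ground the on-device model in retrieved context instead of fine-tuning it.
fun buildAugmentedPrompt(store: TinyVectorStore, question: String): String {
    val context = store.retrieve(question).joinToString("\n- ", prefix = "- ")
    return "Answer using only this context:\n$context\n\nQuestion: $question"
}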


Enabling action: On-device function calling

To make on-device language models truly interactive, we’re introducing on-device function calling. The AI Edge Function Calling library is available on Android today with more platforms to follow. The library includes all of the utilities you need to integrate with an on-device language model, register your application functions, parse the response, and call your functions. Check out the documentation to try it yourself.

This powerful feature enables your language models to intelligently decide when to call predefined functions or APIs within your application. For example, in our sample app, we demonstrate how function calling can be used to fill out a form through natural language. In the context of a medical app asking for pre-appointment patient history, the user dictates their personal information. With our function calling library and an on-device language model, the app converts the voice to text, extracts the relevant information, and then calls application-specific functions to fill out the individual fields.
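As an illustration of that flow, the Kotlin sketch below registers a single form-filling function and then dispatches a structured call of the kind the on-device model might produce from the dictated text. The types and names here are hypothetical stand-ins for illustration, not the AI Edge Function Calling library's actual API; see its documentation for the real integration utilities.

// Declaration of a function the model is allowed to call (normally
// derived from your app's real APIs).
data class FunctionDeclaration(val name: String, val parameters: List<String>)

// A structured function call parsed out of the model's response.
data class FunctionCall(val name: String, val args: Map<String, String>)

class PatientHistoryForm {
    private val fields = mutableMapOf<String, String>()

    // Application-specific function invoked on the model's behalf.
    fun fillField(fieldName: String, value: String) {
        fields[fieldName] = value
        println("Filled '$fieldName' with '$value'")
    }

    // Route a parsed call from the language model to real app code.
    fun dispatch(call: FunctionCall) {
        when (call.name) {
            "fill_field" -> fillField(call.args.getValue("field_name"), call.args.getValue("value"))
            else -> println("Unknown function requested: ${call.name}")
        }
    }
}

fun main() {
    // Functions advertised to the on-device model alongside the prompt.
    val declarations = listOf(FunctionDeclaration("fill_field", listOf("field_name", "value")))
    println("Registered: ${declarations.map { it.name }}")

    // A call the model might emit after transcribing "my date of birth is July 12th, 1984".
    val form = PatientHistoryForm()
    form.dispatch(FunctionCall("fill_field", mapOf("field_name" to "date_of_birth", "value" to "1984-07-12")))
}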

The function calling library can also be paired with our Python tool simulation library. The tool simulation library helps you create a custom language model for your specific functions through synthetic data generation and evaluation, increasing the accuracy of on-device function calling.


What’s next

We will continue to support the latest and greatest small language models on the edge, including new modalities. Keep an eye on our LiteRT Hugging Face Community for new model releases. Our RAG and function calling libraries will continue to expand in functionality and supported platforms.

For more Google AI Edge news, read about the new LiteRT APIs and our new AI Edge Portal service for broad-coverage on-device benchmarking and evals.

Explore this announcement and all Google I/O 2025 updates on io.google starting May 22.


Acknowledgements

We also want to thank the following Googlers for their support in these launches: Advait Jain, Akshat Sharma, Alan Kelly, Andrei Kulik, Byungchul Kim, Chunlei Niu, Chun-nien Chan, Chuo-Ling Chang, Claudio Basile, Cormac Brick, Ekaterina Ignasheva, Eric Yang, Fengwu Yao, Frank Ban, Gerardo Carranza, Grant Jensen, Haoliang Zhang, Henry Wang, Ho Ko, Ivan Grishchenko, Jae Yoo, Jingjiang Li, Jiuqiang Tang, Juhyun Lee, Jun Jiang, Kris Tonthat, Lin Chen, Lu Wang, Marissa Ikonomidis, Matthew Soulanille, Matthias Grundmann, Milen Ferev, Mogan Shieh, Mohammadreza Heydary, Na Li, Pauline Sho, Pedro Gonnet, Ping Yu, Pulkit Bhuwalka, Quentin Khan, Ram Iyengar, Raman Sarokin, Rishika Sinha, Ronghui Zhu, Sachin Kotwani, Sebastian Schmidt, Steven Toribio, Suleman Shahid, T.J. Alumbaugh, Tenghui Zhu, Terry (Woncheol) Heo, Tyler Mullen, Vitalii Dziuba, Wai Hon Law, Weiyi Wang, Xu Chen, Yi-Chun Kuo, Yishuang Pang, Youchuan Hu, Yu-hui Chen, Zichuan Wei



