Friday, April 24, 2026
mGrowTech
Building real-world on-device AI with LiteRT and NPU

by Josh
April 24, 2026
in Google Marketing

Users benefit from instant AI features such as real-time video effects, automatic speech recognition (ASR), and motion capture in their mobile apps. For developers, however, running sophisticated models on-device means managing device thermals, preserving battery life, and preventing frame drops. To deliver fast, responsive AI experiences without compromising performance, LiteRT unlocks Neural Processing Units (NPUs), the hardware built specifically for these workloads.

LiteRT is a cross-platform, production-ready framework for on-device AI, offering CPU, GPU, and NPU acceleration across mobile, desktop, and IoT platforms. Designed for performance and scalability, LiteRT simplifies the deployment of high-speed AI features through a unified API. This API abstracts the complexity of integrating with multiple NPU SDKs, allowing developers to target diverse silicon without writing vendor-specific code.
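To make the idea of one code path over several backends concrete, here is a minimal, framework-agnostic sketch in Python of picking the best available accelerator, with an assumed preference order of NPU, then GPU, then CPU. The names `Accelerator` and `pick_accelerator` are hypothetical and are not LiteRT's API; the LiteRT documentation describes the actual calls.

```python
from enum import Enum


class Accelerator(Enum):
    """Hypothetical backend labels for illustration; not LiteRT's own enum."""
    NPU = "npu"
    GPU = "gpu"
    CPU = "cpu"


# Assumed preference order: most specialized hardware first.
PREFERENCE = (Accelerator.NPU, Accelerator.GPU, Accelerator.CPU)


def pick_accelerator(available):
    """Return the most preferred backend that the device reports as available.

    CPU is treated as the universal fallback, so it is returned even when the
    caller passes an empty collection.
    """
    available = set(available)
    for backend in PREFERENCE:
        if backend in available:
            return backend
    return Accelerator.CPU


if __name__ == "__main__":
    # A device that exposes only GPU and CPU falls back to the GPU.
    print(pick_accelerator([Accelerator.GPU, Accelerator.CPU]).value)
```

The application code calls a single function either way; only the reported set of backends differs from device to device.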

Translating NPU performance into meaningful experiences

LiteRT is already hardened across Google products, popular apps, and SDKs. It is used by industry leaders including Google Meet, Epic Games, and Argmax Inc. Here is what NPU acceleration looks like in real-world production apps.

Google Meet: By leveraging the mobile NPU, Google Meet successfully deployed an Ultra-HD segmentation model 25x larger than previous versions without sacrificing inference speed. Crucially, it maintains a consistent power footprint, creating the thermal headroom necessary to deliver higher-quality background replacement throughout a typical 20-30 minute session.

Epic Games, Inc: High-fidelity, real-time animation experiences demand exceptional efficiency. Epic’s Live Link Face (Beta) app for Android enables creators to capture performances from a single camera, then generate and stream real-time MetaHuman facial animation directly from their devices into Unreal Engine.

Real-time facial solving is computationally intensive and requires consistently low latency. By using LiteRT on the NPU, Epic unlocks dedicated on-device acceleration on supported Android devices, enabling up to 30 FPS performance for real-time MetaHuman animation.

Real-time MetaHuman facial animation in Unreal Engine with NPU
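The 30 FPS figure for real-time facial solving implies a fixed time budget per frame, which is why consistently low latency matters. The short Python sketch below works through that arithmetic. The stage timings in the example are hypothetical placeholders, not measurements from Epic or LiteRT.

```python
def frame_budget_ms(target_fps):
    """Time available per frame, in milliseconds, at the given frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def fits_budget(stage_latencies_ms, target_fps):
    """True if the summed per-frame stage latencies fit within the frame budget."""
    return sum(stage_latencies_ms) <= frame_budget_ms(target_fps)


if __name__ == "__main__":
    budget = frame_budget_ms(30)
    print(f"{budget:.1f} ms per frame at 30 FPS")
    # Hypothetical per-frame stages: capture, inference, streaming.
    print(fits_budget([8.0, 18.0, 5.0], 30))
```

At 30 FPS the whole pipeline has roughly 33 ms per frame, so any inference step that approaches that figure on its own leaves no room for capture and streaming.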

Argmax Inc recently launched the Argmax Pro SDK for Android for on-device speech recognition in collaboration with LiteRT. By using LiteRT and AI Pack feature delivery via Google Play, Argmax was able to deliver top-tier accuracy and real-time speed while respecting app size constraints on Android. Crucially, they leveraged LiteRT’s Ahead-Of-Time (AOT) compilation to eliminate costly on-device compilation steps, enabling frontier speech models like NVIDIA Parakeet TDT 0.6B v2 to run with industry-leading latency.

In performance testing across Google Tensor, MediaTek, and Qualcomm Technologies SoCs, the Argmax Pro SDK showed that upgrading from GPU to NPU delivers over a 2x speedup. Beyond the speedups, the power efficiency of NPUs enabled Argmax SDK Enterprise customers like Heidi Health to conduct reliable on-device live transcription for extended sessions while limiting the impact on battery life. Finally, because runtime libraries and models are offloaded to on-demand downloads via Play’s AI Packs, the device dynamically obtains the model that is optimized for its specific NPU.

Argmax’s Kotlin-first SDK brings top-tier accuracy and real-time speed to Android, with seamless NPU and GPU acceleration by Google LiteRT.
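The idea of a device obtaining the model built for its specific NPU can be pictured as a lookup from SoC family to a precompiled artifact, with a generic model as the fallback. The Python sketch below is only an illustration of that selection logic. The SoC keys, file names, and `select_model` function are invented for this example and do not describe how Play's AI Packs or the Argmax SDK implement delivery.

```python
# Hypothetical catalogue: SoC family -> AOT-compiled model artifact.
# Keys and file names are invented for illustration only.
AOT_VARIANTS = {
    "google_tensor": "asr_tensor_npu.tflite",
    "mediatek": "asr_mediatek_npu.tflite",
    "qualcomm": "asr_qualcomm_npu.tflite",
}

# Generic model used when no NPU-specific artifact exists for the device.
FALLBACK_MODEL = "asr_generic.tflite"


def select_model(soc_family):
    """Return (artifact_name, is_npu_optimized) for a device's SoC family."""
    key = soc_family.strip().lower()
    if key in AOT_VARIANTS:
        return AOT_VARIANTS[key], True
    return FALLBACK_MODEL, False


if __name__ == "__main__":
    print(select_model("Qualcomm"))
    print(select_model("unknown_vendor"))
```

Because each device downloads only the artifact it can use, the base app does not need to bundle every vendor-specific variant, which is the app size benefit the text describes.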

Google AI Edge Gallery App: To help developers test and validate the performance of NPU acceleration, we are happy to announce that the Google AI Edge Gallery App now features NPU support for select Gemma models and built-in benchmarking tools. Available on Android, AI Edge Gallery lets you quickly see the true potential of AI performance on mobile hardware. Developers can also access the Google AI Edge Gallery on GitHub to build their own experiences.

Explore various on-device LLM use cases with Google AI Edge Gallery

Scaling performance across the hardware spectrum

While the performance gains in speech, animation, and video are clear, the NPU has historically been difficult for developers to use because each vendor ships its own SDK with its own complexities. By providing a streamlined workflow and cross-platform support, LiteRT enables developers to deploy advanced models, from mobile phones to industrial IoT and AI PCs, without sacrificing performance or portability.

Cross-platform NPU support

As highlighted in the recent Google AI Edge Gemma 4 blog post, LiteRT extends NPU acceleration beyond mobile, allowing you to deploy your models across a range of hardware using a single framework. For the industrial edge, LiteRT supports platforms like the Qualcomm Dragonwing™ IQ8 Series, which also powers Arduino VENTUNO Q, enabling high-reliability use cases like robotics and smart manufacturing with models like Gemma 4. For desktop, LiteRT is preparing for AI PCs through OpenVINO™ integration with Intel® Core™ Ultra series 2 and 3 processors, delivering significant power savings and responsiveness for local GenAI workloads.

Performance validation at scale

Google AI Edge Portal provides a benchmark service across 100+ of the most popular mobile phones with insights on ML workloads across devices, accelerators and configurations. Developers can now make data-driven deployment decisions, such as whether to use AOT or JIT, that best suit their use cases and their target devices. To use the latest Portal NPU features, sign up for our private preview here.

Google AI Edge Portal Benchmarking Results
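As an illustration of a data-driven choice between AOT and JIT, the Python sketch below compares two benchmark rows by time to first result and steady-state latency. The `BenchmarkResult` fields, the selection rule, and the numbers are assumptions made for this example. They are not the Portal's output format or real measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One hypothetical benchmark row; the field names are assumptions."""
    mode: str            # "AOT" or "JIT"
    init_ms: float       # model load plus any on-device compilation
    inference_ms: float  # steady-state latency per inference


def time_to_first_result_ms(result):
    """Latency a user sees on the very first inference after launch."""
    return result.init_ms + result.inference_ms


def choose_mode(results, max_first_result_ms):
    """Pick the mode with the lowest steady-state latency among those whose
    first-result time fits the startup budget; return None if none fit."""
    eligible = [
        r for r in results
        if time_to_first_result_ms(r) <= max_first_result_ms
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.inference_ms).mode


if __name__ == "__main__":
    # Placeholder numbers: JIT pays a compilation cost at startup.
    rows = [
        BenchmarkResult("AOT", init_ms=150.0, inference_ms=12.0),
        BenchmarkResult("JIT", init_ms=2400.0, inference_ms=11.0),
    ]
    print(choose_mode(rows, max_first_result_ms=500.0))
```

With a tight startup budget the rule favors the precompiled path, while a relaxed budget lets the steady-state latency decide. Real decisions would also weigh download size and the number of target devices.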

Get started with your NPU journey

With our production-ready NPU integrations, LiteRT provides a unified workflow that abstracts away low-level complexities across both Just-In-Time (JIT) and Ahead-Of-Time (AOT) deployment.

Dive into our documentation and start your journey with NPU acceleration today.

Let us know your feedback and feature requests by opening an issue on our GitHub channel. We can’t wait to see what you build!

Acknowledgements

Google: Akshat Sharma, Alice Zheng, Andrew Zhang, Ashley Lin, Byungchul Kim, Changming Sun, Charlie Xu, Chenchen Tang, Chunlei Niu, Cormac Brick, Derek Bekebrede, Fabian Bergmark, Fengwu Yao, Gerardo Carranza, Gregory Karpiak, Jae Yoo, Jing Jin, Jingjiang Li, Julius Kammerl, Jun Jiang, Lu Wang, Maria Lyubimtseva, Mariana Quesada, Marissa Ikonomidis, Matt Kreileder, Matthias Grundmann, Meghna Johar, Na Li, Ping Yu, Renjie Wu, Rishika Sinha, Sachin Kotwani, Salil Tambe, Siargey Pisarchyk, Somdatta Banerjee, Steven Toribio, Suleman Shahid, Terry Heo, Wai Hon Law, Weiyi Wang, Xiaoming Hu

Partners: Alen Huang, Ankit Kapoor, Arda Atahan Ibis, Atila Orhon, Brian Keene, Chen Cen, Cheng-Dao Lee, Cheng-Yen Lin, Chun-Ting Lin (Graham), Code Lin, Deep Yap, Dylan Angus, Felix Baum, HungChun Liu, Jhih-Kuan Lin, Jiun-Kai Yang (Kelvin), Kedar Gharat, Ken Sieger, Laxmi Rayapudi, Lei Chen, Mike Tremaine, Ming-Che Lin (Vincent), Poyuan Jeng, MetaHuman Team, Vinesh Sukumar, Waimun Wong, Yi-Ru Chen, Yu-Ting Wan, Zach Nagengast


