Tuesday, June 9, 2026
mGrowTech

Google Releases Gemini 3.5 Live Translate, a Streaming Speech-to-Speech Audio Model Covering 70+ Languages Across Meet, Translate, and the Live API

by Josh
June 9, 2026


Google has announced Gemini 3.5 Live Translate, its latest audio model for live speech-to-speech translation: spoken audio goes in, and translated spoken audio comes out. The model automatically detects more than 70 languages and generates translated speech that preserves the speaker’s intonation, pacing, and pitch. Turn-by-turn systems wait for a speaker to finish before responding, whereas Gemini 3.5 Live Translate generates speech continuously. To do so, it balances waiting for more context, which improves quality, against translating immediately, which keeps the output in sync with the speaker. As a result, the translation stays a few seconds behind the speaker throughout a session.

Gemini 3.5 Live Translate

Gemini 3.5 Live Translate is a single audio model (gemini-3.5-live-translate-preview), not a chat assistant. It processes speech as the audio streams in, rather than waiting for a full sentence. It handles multilingual input without manual configuration. Its noise robustness lets applications run in loud, unpredictable environments.

The model is rolling out across three surfaces. Developers get it in public preview through the Gemini Live API and Google AI Studio. Enterprises get a private preview in Google Meet starting this month. Everyone else gets it through the Google Translate app on Android and iOS.

How the Continuous Streaming Works

The design difference matters for building real-time features. A conversational Live agent uses turn-based interactions. It relies on pauses, intent detection, and interruption handling. Live Translation uses continuous stream processing instead. It translates as the speaker talks, without waiting for turns to end.

To hold strict real-time latency thresholds, the translation path accepts audio input only. Text input is not supported in translation mode. The model also drops tool use and system instructions in this mode. That keeps it a focused translator pipeline rather than a general agent.

Building With the Live API

Developers configure translation inside the Live API session setup by adding a translationConfig block within the generationConfig. The targetLanguageCode field takes a BCP-47 code, such as "pl" or "es", and defaults to "en". BCP-47 is the standard format for language tags like en or pt-BR. The echoTargetLanguage boolean controls what happens when the input is already in the target language: when true, the model echoes that speech; when false, it stays silent. You can also enable inputAudioTranscription and outputAudioTranscription to receive text transcripts.
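
As a rough illustration of the configuration just described, the sketch below builds a setup message as a plain Python dictionary. Several details are assumptions rather than facts from the announcement: the "setup" envelope, the "models/" prefix, placing the two transcription fields at the top level of the setup as empty objects, and using false as the illustrative value for echoTargetLanguage. The helper name is hypothetical. Check the Live API reference before relying on this shape.

```python
import json


def build_translation_setup(
    target_language_code: str = "en",   # the article states "en" is the default
    echo_target_language: bool = False,  # illustrative value, default not stated
    transcripts: bool = False,
) -> dict:
    """Build a hypothetical Live API setup message for translation mode."""
    setup = {
        # Assumption: model names are addressed with a "models/" prefix.
        "model": "models/gemini-3.5-live-translate-preview",
        "generationConfig": {
            "translationConfig": {
                "targetLanguageCode": target_language_code,
                "echoTargetLanguage": echo_target_language,
            }
        },
    }
    if transcripts:
        # Assumption: transcription is enabled with empty config objects
        # placed next to generationConfig.
        setup["inputAudioTranscription"] = {}
        setup["outputAudioTranscription"] = {}
    return {"setup": setup}


if __name__ == "__main__":
    message = build_translation_setup("pl", transcripts=True)
    print(json.dumps(message, indent=2))
```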

Audio formats are fixed. Input is raw 16-bit PCM at 16kHz, mono, little-endian. Output is raw 16-bit PCM at 24kHz, mono, little-endian. PCM is uncompressed raw audio. You send audio in chunks of 100ms. For client-side apps, ephemeral tokens on the v1alpha endpoint avoid exposing your API key.
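
The audio formats above translate directly into byte counts: at 16kHz, 16-bit mono, a 100ms chunk is 1,600 samples or 3,200 bytes, and at 24kHz the same duration is 4,800 bytes. The sketch below shows that arithmetic and splits a raw PCM buffer into 100ms chunks. The helper names are hypothetical, and the sine tone is only a stand-in for microphone audio.

```python
import math
import struct

INPUT_RATE_HZ = 16_000   # input sample rate stated in the article
OUTPUT_RATE_HZ = 24_000  # output sample rate stated in the article
BYTES_PER_SAMPLE = 2     # 16-bit PCM, mono
CHUNK_MS = 100


def chunk_size_bytes(rate_hz: int, chunk_ms: int = CHUNK_MS) -> int:
    """Bytes of 16-bit mono PCM covering chunk_ms milliseconds."""
    return rate_hz * chunk_ms // 1000 * BYTES_PER_SAMPLE


def iter_chunks(pcm: bytes, rate_hz: int = INPUT_RATE_HZ, chunk_ms: int = CHUNK_MS):
    """Yield consecutive chunks of raw PCM; the last one may be shorter."""
    size = chunk_size_bytes(rate_hz, chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


def sine_pcm(freq_hz: float, seconds: float, rate_hz: int = INPUT_RATE_HZ) -> bytes:
    """Generate a test tone as little-endian signed 16-bit mono PCM."""
    n = int(seconds * rate_hz)
    samples = (
        int(32767 * 0.3 * math.sin(2 * math.pi * freq_hz * i / rate_hz))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)  # "<" = little-endian, "h" = int16


if __name__ == "__main__":
    one_second = sine_pcm(440.0, 1.0)
    chunks = list(iter_chunks(one_second))
    print(len(chunks), "chunks of", len(chunks[0]), "bytes")
    print("output chunk size:", chunk_size_bytes(OUTPUT_RATE_HZ), "bytes")
```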

| Dimension | Live Agent | Live Translation |
|---|---|---|
| Model role | Assistant that listens, reasons, and acts | Interpreter / real-time translator pipeline |
| Interaction | Turn-based, with interruption handling | Continuous stream processing, no turns |
| Tools | Function calling, Google Search, instructions | Translation only, no tools or instructions |
| Inputs | Text, audio, video, and image | Audio only, for strict latency |
| Configuration | Generation, speech, tools, instructions | targetLanguageCode and echoTargetLanguage |

Use Cases

The model targets live interpretation across several settings. Google lists multilingual calls, meetings, lessons, and broadcasts. Developer platforms reduce the integration work for real-time media. Agora, Fishjam, LiveKit, Pipecat, and Vision Agents already use the Live API. These platforms handle the complex real-time media streaming infrastructure. That lets developers focus on the user experience instead.

Google’s example app demonstrates dubbing and simultaneous multi-language translation. Grab is testing the model for driver-and-traveler communication at pickups. Grab users make over 10 million voice calls per month. CJ ENM, LiveKit, and others reported positive feedback on quality, accuracy, and low latency.

How It Changes Google Meet and Translate

According to Google’s official release, Google Meet will soon use 3.5 Live Translate for speech translation. The table shows the stated before-and-after for Meet.

| Capability | Previous Meet | With 3.5 Live Translate |
|---|---|---|
| Languages | 5 | 70+ |
| Combinations per meeting | Only to and from English | 2,000+ combinations |
| Access | Existing interface | Updated interface for instant access |
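
Google does not say how it counts combinations, so the following is only a sanity check under stated assumptions: exactly 70 languages (the floor of "70+") and any language paired with any other. Unordered pairs give a figure consistent with "2,000+", while counting each direction separately gives roughly double.

```python
from math import comb

languages = 70  # assumption: the lower bound of "70+"

# Language pairs where direction does not matter (e.g. pl<->es counted once).
unordered_pairs = comb(languages, 2)

# Pairs where each translation direction counts separately.
ordered_pairs = languages * (languages - 1)

print(unordered_pairs)  # 2415
print(ordered_pairs)    # 4830
```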

The Meet update is in private preview for select business Workspace customers this month. A broader rollout follows later this year. In the Translate app, the Live translate feature works with any connected headphones. It mirrors the speaker’s tone across 70+ languages. Android also gains a listening mode. You hold the phone to your ear like a regular call. The translated audio then streams through the earpiece, without others hearing.

Key Takeaways

  • Gemini 3.5 Live Translate is Google’s latest audio model for live speech-to-speech translation across 70+ languages.
  • It streams continuously instead of turn-by-turn, staying a few seconds behind the speaker.
  • Developers can configure it via the Live API using targetLanguageCode and echoTargetLanguage; audio-only, 16kHz in, 24kHz out.
  • It rolls out to the Gemini Live API, Google Meet (5→70+ languages), and the Translate app.
  • All generated audio carries an imperceptible SynthID watermark for detectability.
