
T5Gemma: A new collection of encoder-decoder Gemma models

by Josh
July 10, 2025
in Google Marketing


In the rapidly evolving landscape of large language models (LLMs), the spotlight has largely focused on the decoder-only architecture. While these models have shown impressive capabilities across a wide range of generation tasks, the classic encoder-decoder architecture, exemplified by T5 (Text-To-Text Transfer Transformer), remains a popular choice for many real-world applications. Encoder-decoder models often excel at summarization, translation, QA, and more thanks to their high inference efficiency, design flexibility, and richer encoder representations for understanding the input. Nevertheless, the encoder-decoder architecture has received relatively little attention.

Today, we revisit this architecture and introduce T5Gemma, a new collection of encoder-decoder LLMs developed by converting pretrained decoder-only models into the encoder-decoder architecture through a technique called adaptation. T5Gemma is based on the Gemma 2 framework, including adapted Gemma 2 2B and 9B models as well as a set of newly trained T5-sized models (Small, Base, Large and XL). We are excited to release pretrained and instruction-tuned T5Gemma models to the community to unlock new opportunities for research and development.

From decoder-only to encoder-decoder

In T5Gemma, we ask the following question: can we build top-tier encoder-decoder models based on pretrained decoder-only models? We answer this question by exploring a technique called model adaptation. The core idea is to initialize the parameters of an encoder-decoder model using the weights of an already pretrained decoder-only model, and then further adapt them via UL2 or PrefixLM-based pre-training.
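The initialization step can be sketched roughly as follows (a hypothetical, simplified illustration, not the actual T5Gemma code): the pretrained decoder-only weights are copied into both the encoder and decoder stacks, while the decoder's new cross-attention layers, which have no counterpart in the source model, start fresh and are learned during the UL2/PrefixLM adaptation stage.

```python
# Illustrative sketch of the adaptation idea. Parameter names and the
# zero placeholder init are made up for clarity; real implementations
# operate on framework-specific checkpoint structures.

def adapt_decoder_only(decoder_only_weights: dict) -> dict:
    """Build encoder-decoder parameters from decoder-only weights.

    Self-attention and MLP weights are reused in both stacks; the
    decoder's cross-attention is new, so it gets a fresh initialization
    and is trained during the adaptation pre-training stage.
    """
    enc_dec = {}
    for name, weight in decoder_only_weights.items():
        enc_dec[f"encoder.{name}"] = weight  # reused as the encoder
        enc_dec[f"decoder.{name}"] = weight  # reused as the decoder
    # Cross-attention has no pretrained counterpart to copy.
    enc_dec["decoder.cross_attention.q_proj"] = 0.0  # placeholder init
    return enc_dec

# Toy "checkpoint" with two scalar weights standing in for tensors.
pretrained = {"layer0.self_attn.q_proj": 1.25, "layer0.mlp.up_proj": -0.5}
adapted = adapt_decoder_only(pretrained)
print(sorted(adapted))
```

Because both stacks start from the same well-trained weights, only the cross-attention (and the shift in objective) needs to be learned from scratch, which is what makes adaptation far cheaper than pre-training an encoder-decoder model from nothing.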

An overview of our approach, showing how we initialize a new encoder-decoder model using the parameters from a pretrained decoder-only model.

This adaptation method is highly flexible, allowing for creative combinations of model sizes. For instance, we can pair a large encoder with a small decoder (e.g., a 9B encoder with a 2B decoder) to create an “unbalanced” model. This allows us to fine-tune the quality-efficiency trade-off for specific tasks, such as summarization, where a deep understanding of the input is more critical than the complexity of the generated output.
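The mix-and-match idea can be made concrete with a small sketch (the 2B/9B scales come from the post; the pairing logic and numbers below are our own illustration): generation cost is dominated by the decoder, which runs once per output token, so shrinking only the decoder keeps most of the encoder's understanding while cutting per-token cost.

```python
# Hypothetical illustration of "unbalanced" encoder-decoder pairings.
# Parameter counts are approximate, in billions.

SIZES_B = {"2B": 2.0, "9B": 9.0}

def pairing(encoder: str, decoder: str) -> dict:
    """Summarize a pairing: total size and decoder size, since the
    decoder dominates per-token generation cost."""
    return {
        "name": f"{encoder}-{decoder}",
        "total_params_b": SIZES_B[encoder] + SIZES_B[decoder],
        "decoder_params_b": SIZES_B[decoder],
    }

balanced = pairing("9B", "9B")
unbalanced = pairing("9B", "2B")  # deep understanding, cheap generation
print(unbalanced["name"], unbalanced["total_params_b"],
      unbalanced["decoder_params_b"])
```

For an input-heavy task like summarization, the 9B-2B pairing keeps the large encoder's representation quality while paying roughly a small model's price per generated token.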

Towards better quality-efficiency trade-off

How does T5Gemma perform?

In our experiments, T5Gemma models achieve comparable or better performance than their decoder-only Gemma counterparts, nearly dominating the quality-inference efficiency Pareto frontier across several benchmarks, such as SuperGLUE, which measures the quality of the learned representations.
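To make "dominating the Pareto frontier" concrete, here is a minimal sketch of Pareto dominance over quality-vs-latency points. The numbers are invented to mirror the trend described, not taken from the paper: a model sits on the frontier if no other model is at least as fast and at least as accurate.

```python
# Minimal Pareto-frontier sketch for quality-vs-efficiency points.
# Lower latency and higher accuracy are better; data is hypothetical.

def pareto_frontier(models: dict) -> set:
    """models maps name -> (latency, accuracy); return the set of
    models not dominated by any other model."""
    frontier = set()
    for a, (lat_a, acc_a) in models.items():
        dominated = any(
            lat_b <= lat_a and acc_b >= acc_a
            for b, (lat_b, acc_b) in models.items()
            if b != a
        )
        if not dominated:
            frontier.add(a)
    return frontier

# Hypothetical (latency, accuracy) points shaped like the trend in
# the post: adapted encoder-decoder models push the frontier outward.
points = {
    "decoder-only-2B": (1.0, 60.0),
    "decoder-only-9B": (3.0, 75.0),
    "encdec-9B-2B":    (1.1, 76.0),
    "encdec-9B-9B":    (3.1, 79.0),
}
print(sorted(pareto_frontier(points)))
```

In this toy example the 9B-2B point dominates the decoder-only 9B model, which is the shape of the result reported for the real benchmarks.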


Encoder-decoder models consistently offer better performance for a given level of inference compute, leading the quality-efficiency frontier across a range of benchmarks.

This performance advantage isn’t just theoretical; it translates to real-world quality and speed too. When measuring actual latency on GSM8K (math reasoning), T5Gemma provides a clear win. For example, T5Gemma 9B-9B achieves higher accuracy than Gemma 2 9B at similar latency. Even more impressively, T5Gemma 9B-2B delivers a significant accuracy boost over the 2B-2B model, yet its latency is nearly identical to that of the much smaller Gemma 2 2B model. Ultimately, these experiments show that encoder-decoder adaptation offers a flexible, powerful way to balance quality and inference speed.

Unlocking foundational and fine-tuned capabilities

Could encoder-decoder LLMs have similar capabilities to decoder-only models?

Yes, T5Gemma shows promising capabilities both before and after instruction tuning.

After pre-training, T5Gemma achieves impressive gains on complex tasks that require reasoning. For instance, T5Gemma 9B-9B scores over 9 points higher on GSM8K (math reasoning) and 4 points higher on DROP (reading comprehension) than the original Gemma 2 9B model. This pattern demonstrates that the encoder-decoder architecture, when initialized via adaptation, has the potential to create a more capable, performant foundational model.


Detailed results for pretrained models, illustrating how adapted models have significant gains on several reasoning-intensive benchmarks compared to decoder-only Gemma 2.

These foundational improvements from pre-training set the stage for even more dramatic gains after instruction tuning. For example, comparing Gemma 2 IT to T5Gemma IT, the performance gap widens significantly across the board. T5Gemma 2B-2B IT sees its MMLU score jump by nearly 12 points over Gemma 2 2B IT, and its GSM8K score increases from 58.0% to 70.7%. The adapted architecture not only provides a better starting point but also responds more effectively to instruction tuning, ultimately leading to a substantially more capable and helpful final model.


Detailed results for fine-tuned + RLHFed models, illustrating how post-training significantly amplifies the performance advantages of the encoder-decoder architecture.

Explore our models: Releasing T5Gemma checkpoints

We’re very excited to present this new method of building powerful, general-purpose encoder-decoder models by adapting from pretrained decoder-only LLMs like Gemma 2. To help accelerate further research and allow the community to build on this work, we are excited to release a suite of our T5Gemma checkpoints.

The release includes:

  • Multiple Sizes: Checkpoints for T5-sized models (Small, Base, Large, and XL), the Gemma 2-based models (2B and 9B), as well as an additional model in between T5 Large and T5 XL.
  • Multiple Variants: Pretrained and instruction-tuned models.
  • Flexible Configurations: A powerful and efficient unbalanced 9B-2B checkpoint to explore the trade-offs between encoder and decoder size.
  • Different Training Objectives: Models trained with either PrefixLM or UL2 objectives to provide either state-of-the-art generative performance or representation quality.

We hope these checkpoints will provide a valuable resource for investigating model architecture, efficiency, and performance.

Getting started with T5Gemma

We can’t wait to see what you build with T5Gemma. Please see the following links for more information:

  • Learn about the research behind this project by reading the paper.
  • Explore the models’ capabilities or fine-tune them for your own use cases with the Colab notebook.


