
Cohere Releases Tiny Aya: A 3B-Parameter Small Language Model that Supports 70 Languages and Runs Locally Even on a Phone

By Josh
February 18, 2026
In AI, Analytics and Automation


Cohere Labs has released Tiny Aya, a family of small language models (SLMs) built for broad multilingual coverage at a small footprint. While many models improve multilingual quality by scaling up parameters, Tiny Aya uses a 3.35B-parameter architecture to deliver state-of-the-art translation and generation across 70 languages.

The release includes 5 models: Tiny Aya Base (pretrained), Tiny Aya Global (balanced instruction-tuned), and three region-specific variants—Earth (Africa/West Asia), Fire (South Asia), and Water (Asia-Pacific/Europe).

Source: https://cohere.com/blog/cohere-labs-tiny-aya

The Architecture

Tiny Aya is built on a dense decoder-only Transformer architecture. Key specs, summarized in the config sketch after this list, include:

  • Parameters: 3.35B total (2.8B non-embedding)
  • Layers: 36
  • Vocabulary: 262k tokenizer designed for equitable language representation.
  • Attention: Interleaved sliding window and full attention (3:1 ratio) with Grouped Query Attention (GQA).
  • Context: 8192 tokens for input and output.
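
As a quick reference, here is a minimal sketch that collects the published specs into a plain Python configuration object. The field names are illustrative rather than Cohere's actual config keys, and the exact vocabulary size is an assumption based on the quoted "262k".

```python
from dataclasses import dataclass

@dataclass
class TinyAyaSpec:
    """Published Tiny Aya specs, gathered for reference (field names are illustrative)."""
    total_params: int = 3_350_000_000          # 3.35B total parameters
    non_embedding_params: int = 2_800_000_000  # 2.8B excluding embeddings
    num_layers: int = 36
    vocab_size: int = 262_000                  # "262k" tokenizer; exact size assumed
    context_length: int = 8192                 # shared input/output window
    sliding_to_full_attention: str = "3:1"     # interleaved sliding-window vs. full attention
    attention_variant: str = "GQA"             # grouped query attention
    activation: str = "SwiGLU"
    dense_bias: bool = False                   # biases removed from dense layers
```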

The model was pretrained on 6T tokens using a Warmup-Stable-Decay (WSD) learning-rate schedule. To maintain stability, the team used SwiGLU activations and removed all biases from dense layers.
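
A WSD schedule holds the learning rate at its peak for most of training and only decays it near the end. A minimal sketch of such a schedule follows; the peak rate and the warmup/decay fractions are illustrative defaults, not Cohere's actual hyperparameters.

```python
def wsd_lr(step: int, total_steps: int, peak_lr: float = 3e-4,
           warmup_frac: float = 0.01, decay_frac: float = 0.1) -> float:
    """Warmup-Stable-Decay: linear warmup, flat plateau, then linear decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    decay_steps = max(1, int(total_steps * decay_frac))
    stable_end = total_steps - decay_steps
    if step < warmup_steps:          # warmup phase
        return peak_lr * step / warmup_steps
    if step < stable_end:            # stable phase
        return peak_lr
    return peak_lr * max(0.0, (total_steps - step) / decay_steps)  # decay phase
```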

Advanced Post-training: FUSION and SimMerge

To bridge the gap in low-resource languages, Cohere used a synthetic data and model-merging pipeline; the Fusion-of-N step is sketched in code after the list.

  1. Fusion-of-N (FUSION): Prompts are sent to a ‘team of teachers’ (COMMAND A, GEMMA3-27B-IT, DEEPSEEK-V3). A judge LLM, the Fusor, extracts and aggregates the strongest components of their responses.
  2. Region Specialization: Models were finetuned on 5 regional clusters (e.g., South Asia, Africa).
  3. SimMerge: To prevent ‘catastrophic forgetting’ of global safety, regional checkpoints were merged with the global model using SimMerge, which selects the best merge operators based on similarity signals.
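
The Fusion-of-N step can be pictured as a small orchestration loop. The sketch below assumes generic `generate()` wrappers around the teacher models and the judge; the function name and the judge prompt are illustrative, not Cohere's actual pipeline code.

```python
def fusion_of_n(prompt: str, teachers: list, fusor) -> str:
    """Hypothetical Fusion-of-N: several teachers answer, a judge fuses the best parts."""
    # 1. Every teacher model answers the same prompt independently.
    candidates = [teacher.generate(prompt) for teacher in teachers]
    # 2. The judge ("Fusor") reads all candidates and merges their strongest components.
    judge_prompt = (
        "Combine the strongest parts of the candidate answers below into one "
        "high-quality response.\n\n" + "\n\n---\n\n".join(candidates)
    )
    return fusor.generate(judge_prompt)  # the fused answer becomes synthetic training data
```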

Performance Benchmarks

Tiny Aya Global consistently beats larger or same-scale competitors in multilingual tasks:

  • Translation: It outperforms GEMMA3-4B in 46 of 61 languages on WMT24++.
  • Reasoning: In the GlobalMGSM (math) benchmark for African languages, Tiny Aya achieved 39.2% accuracy, dwarfing GEMMA3-4B (17.6%) and QWEN3-4B (6.25%).
  • Safety: It holds the highest mean safe response rate (91.1%) on MultiJail.
  • Language Integrity: The model achieves 94% language accuracy, meaning it rarely switches to English when asked to reply in another language.

On-Device Deployment

Tiny Aya is optimized for edge computing. Using 4-bit quantization (Q4_K_M), the model fits in a 2.14 GB memory footprint.

  • iPhone 13: 10 tokens/s.
  • iPhone 17 Pro: 32 tokens/s.

This quantization scheme results in a minimal 1.4-point drop in generation quality, making it a viable solution for offline, private, and localized AI applications.
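
For context, local inference with a 4-bit build would look roughly like the following, using the open-source llama-cpp-python bindings. The GGUF file name is a placeholder, and the availability of such an export is an assumption; check Cohere's model card for the actual weights, formats, and license.

```python
from llama_cpp import Llama

# Load a hypothetical Q4_K_M GGUF export of Tiny Aya Global (~2 GB on disk).
llm = Llama(
    model_path="tiny-aya-global-q4_k_m.gguf",  # assumed local file name
    n_ctx=8192,                                # matches the model's 8192-token window
)

out = llm(
    "Translate to Swahili: Where is the nearest train station?",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```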

Key Takeaways

  • Efficient Multilingual Power: Tiny Aya is a 3.35B-parameter model family that delivers state-of-the-art translation and high-quality generation across 70 languages. It proves that massive scale is not required for strong multilingual performance if models are designed with intentional data curation.
  • Innovative Training Pipeline: The models were developed using a novel strategy involving Fusion-of-N (FUSION), where a ‘team of teachers’ (like Command A and DeepSeek-V3) generated synthetic data. A judge model then aggregated the strongest components to ensure high-quality training signals even for low-resource languages.
  • Regional Specialization via Merging: Cohere released specialized variants—Tiny Aya Earth, Fire, and Water—which are tuned for specific regions like Africa, South Asia, and the Asia-Pacific. These were created by merging regional fine-tuned models with a global model using SimMerge to preserve safety while boosting local language performance.
  • Superior Benchmark Performance: Tiny Aya Global outperforms competitors like Gemma3-4B in translation quality for 46 of 61 languages on WMT24++. It also significantly reduces disparities in mathematical reasoning for African languages, achieving 39.2% accuracy compared to Gemma3-4B’s 17.6%.
  • Optimized for On-Device Deployment: The model is highly portable and runs efficiently on edge devices; it achieves ~10 tokens/s on an iPhone 13 and 32 tokens/s on an iPhone 17 Pro using Q4_K_M quantization. This 4-bit quantization format maintains high quality with only a minimal 1.4-point degradation.

Check out the Technical details, Paper, Model Weights and Playground.



