• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Saturday, January 24, 2026
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Al, Analytics and Automation

Alibaba’s Qwen AI Releases Compact Dense Qwen3-VL 4B/8B (Instruct & Thinking) With FP8 Checkpoints

Josh by Josh
October 15, 2025
in Al, Analytics and Automation
0
Alibaba’s Qwen AI Releases Compact Dense Qwen3-VL 4B/8B (Instruct & Thinking) With FP8 Checkpoints
0
SHARES
2
VIEWS
Share on FacebookShare on Twitter


Do you actually need a giant VLM when dense Qwen3-VL 4B/8B (Instruct/Thinking) with FP8 runs in low VRAM yet retains 256K→1M context and the full capability surface? Alibaba’s Qwen team has expanded its multimodal lineup with dense Qwen3-VL models at 4B and 8B scales, each shipping in two task profiles—Instruct and Thinking—plus FP8-quantized checkpoints for low-VRAM deployment. The drop arrives as a smaller, edge-friendly complement to the previously released 30B (MoE) and 235B (MoE) tiers and keeps the same capability surface: image/video understanding, OCR, spatial grounding, and GUI/agent control.

https://github.com/QwenLM/Qwen3-VL/tree/main

What’s in the release?

SKUs and variants: The new additions comprise four dense models—Qwen3-VL-4B and Qwen3-VL-8B, each in Instruct and Thinking editions—alongside FP8 versions of the 4B/8B Instruct and Thinking checkpoints. The official announcement explicitly frames these as “compact, dense” models with lower VRAM usage and full Qwen3-VL capabilities retained.

Context length and capability surface: The model cards list native 256K context with expandability to 1M, and document the full feature set: long-document and video comprehension, 32-language OCR, 2D/3D spatial grounding, visual coding, and agentic GUI control on desktop and mobile. These attributes carry over to the new 4B/8B SKUs.

Architecture notes: Qwen3-VL highlights three core updates: Interleaved-MRoPE for robust positional encoding over time/width/height (long-horizon video), DeepStack for fusing multi-level ViT features and sharpening image–text alignment, and Text–Timestamp Alignment beyond T-RoPE for event localization in video. These design details appear in the new cards as well, signaling architectural continuity across sizes.

Project timeline: The Qwen3-VL GitHub “News” section records the publication of Qwen3-VL-4B (Instruct/Thinking) and Qwen3-VL-8B (Instruct/Thinking) on Oct 15, 2025, following earlier releases of the 30B MoE tier and organization-wide FP8 availability.

FP8: deployment-relevant details

Numerics and parity claim: The FP8 repositories state fine-grained FP8 quantization with block size 128, with performance metrics nearly identical to the original BF16 checkpoints. For teams evaluating precision trade-offs on multimodal stacks (vision encoders, cross-modal fusion, long-context attention), having vendor-produced FP8 weights reduces re-quantization and re-validation burden.

Tooling status: The 4B-Instruct-FP8 card notes that Transformers does not yet load these FP8 weights directly, and recommends vLLM or SGLang for serving; the card includes working launch snippets. Separately, the vLLM recipes guide recommends FP8 checkpoints for H100 memory efficiency. Together, these point to immediate, supported paths for low-VRAM inference.

Key Takeaways

  • Qwen released dense Qwen3-VL 4B and 8B models, each in Instruct and Thinking variants, with FP8 checkpoints.
  • FP8 uses fine-grained FP8 (block size 128) with near-BF16 metrics; Transformers loading is not yet supported—use vLLM/SGLang.
  • Capability surface is preserved: 256K→1M context, 32-language OCR, spatial grounding, video reasoning, and GUI/agent control.
  • Model Card-reported sizes: Qwen3-VL-4B ≈ 4.83B params; Qwen3-VL-8B-Instruct ≈ 8.77B params.

Qwen’s decision to ship dense Qwen3-VL 4B/8B in both Instruct and Thinking forms with FP8 checkpoints is the practical part of the story: lower-VRAM, deployment-ready weights (fine-grained FP8, block size 128) and explicit serving guidance (vLLM/SGLang) makes it easily deployable. The capability surface—256K context expandable to 1M, 32-language OCR, spatial grounding, video understanding, and agent control—remains intact at these smaller scales, which matters more than leaderboard rhetoric for teams targeting single-GPU or edge budgets.


Check out the Model on Hugging Face and GitHub Repo. Feel free to check out our GitHub Page for Tutorials, Codes and Notebooks. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well.


Asif Razzaq is the CEO of Marktechpost Media Inc.. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts of over 2 million monthly views, illustrating its popularity among audiences.

🙌 Follow MARKTECHPOST: Add us as a preferred source on Google.



Source_link

READ ALSO

Joi Chatbot Access, Pricing, and Feature Overview

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control

Related Posts

Joi Chatbot Access, Pricing, and Feature Overview
Al, Analytics and Automation

Joi Chatbot Access, Pricing, and Feature Overview

January 23, 2026
Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control
Al, Analytics and Automation

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control

January 23, 2026
Quality Data Annotation for Cardiovascular AI
Al, Analytics and Automation

Quality Data Annotation for Cardiovascular AI

January 23, 2026
A Missed Forecast, Frayed Nerves and a Long Trip Back
Al, Analytics and Automation

A Missed Forecast, Frayed Nerves and a Long Trip Back

January 23, 2026
Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass
Al, Analytics and Automation

Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass

January 23, 2026
Slow Down the Machines? Wall Street and Silicon Valley at Odds Over A.I.’s Nearest Future
Al, Analytics and Automation

Slow Down the Machines? Wall Street and Silicon Valley at Odds Over A.I.’s Nearest Future

January 22, 2026
Next Post
Samsung will introduce its Android XR headset at a Galaxy event on October 21

Samsung will introduce its Android XR headset at a Galaxy event on October 21

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
Google announced the next step in its nuclear energy plans 

Google announced the next step in its nuclear energy plans 

August 20, 2025

EDITOR'S PICK

Creative Testing Feature Problems – Jon Loomer Digital

Creative Testing Feature Problems – Jon Loomer Digital

October 24, 2025
We Made a Decision That Lowers Our MRR and ARR, Here’s The Story

We Made a Decision That Lowers Our MRR and ARR, Here’s The Story

September 17, 2025
4 Steps to Transform Omnichannel Banking Customer Experiences

4 Steps to Transform Omnichannel Banking Customer Experiences

August 15, 2025
Building Sustainable B2B Content with Devin Reed [+Video]

Building Sustainable B2B Content with Devin Reed [+Video]

June 28, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • PR Drives Blockchain Adoption in Institutions
  • US Judge Rules ICE Raids Require Judicial Warrants, Contradicting Secret ICE Memo
  • TikTok Oracle Deal: What the New US Entity Means for Advertisers in 2026
  • If You’ve Been Investing in SEO, You’re on the Right Track With GEO: An AMA With Lily Ray
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions

Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?