Alibaba launched Qwen3, the latest generation of its open-source large language model (LLM) family, on 29 April 2025.
The Qwen3 series features six dense models and two Mixture-of-Experts (MoE) models, offering developers flexibility to build next-generation applications across mobile devices, smart glasses, autonomous vehicles, robotics and beyond.
All Qwen3 models—including dense models (0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters) and MoE models (30B with 3B active and 235B with 22B active)—are now open-sourced and available worldwide.
Qwen3 purportedly marks Alibaba’s debut of hybrid reasoning models, combining traditional LLM capabilities with advanced, dynamic reasoning.
According to Alibaba, Qwen3 models can switch seamlessly between two modes: a thinking mode for complex tasks such as mathematics, coding, and logical inference, and a non-thinking mode for quick, general-purpose responses.
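In the open-weight releases, this switch is exposed through the chat template. Below is a minimal sketch following the usage pattern published in the Qwen3 model cards on Hugging Face; the `enable_thinking` flag and the `Qwen/Qwen3-0.6B` checkpoint name are drawn from that documentation rather than from this article.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # smallest dense checkpoint in the series
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many prime numbers are below 20?"}]
# enable_thinking=True makes the model emit a chain-of-thought block
# before its answer; set it to False for fast, non-thinking responses.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```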
For developers accessing Qwen3 through an API, the model is said to offer granular control over thinking duration (up to 38K tokens), allowing an optimised balance between intelligent performance and compute efficiency.
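How that budget is set depends on the serving stack. The sketch below assumes an OpenAI-compatible endpoint; the base URL, model name, and the `enable_thinking`/`thinking_budget` keys are illustrative assumptions, not confirmed parts of Alibaba's API.

```python
from openai import OpenAI

# Hypothetical sketch of capping the thinking budget via an
# OpenAI-compatible endpoint; the base_url, model name, and the
# extra_body keys below are assumptions for illustration.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://example-endpoint/compatible-mode/v1",
)

stream = client.chat.completions.create(
    model="qwen3-235b-a22b",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    stream=True,
    extra_body={
        "enable_thinking": True,
        "thinking_budget": 4096,  # cap on reasoning tokens (article cites up to 38K)
    },
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
```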
The Qwen3-235B-A22B MoE model is also claimed to cost significantly less to deploy than other state-of-the-art models, reinforcing Alibaba's commitment to accessible, high-performance AI.
The Qwen3 models are trained on a massive dataset of 36 trillion tokens, twice that of their predecessor, Qwen2.5. This upgrade brings major improvements in reasoning, instruction following, tool use, and multilingual tasks.
Alibaba highlights four key capabilities. First, multilingual mastery: Qwen3 supports 119 languages and dialects, with strong translation and multilingual instruction-following.
Second, advanced agent integration: Qwen3 natively supports the Model Context Protocol (MCP) and robust function-calling, reportedly leading open-source models in complex agent-based tasks; a function-calling sketch follows this overview.
Third, superior reasoning: Qwen3 reportedly surpasses previous Qwen models (QwQ in thinking mode and Qwen2.5 in non-thinking mode) on mathematics, coding, and logical-reasoning benchmarks.
Finally, enhanced human alignment: the model delivers stronger creative writing, role-playing, and multi-turn dialogue for more natural, engaging conversations.
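As a concrete illustration of the agent integration mentioned above, the sketch below sends a tool schema in the widely used OpenAI-compatible format to a self-hosted Qwen3 model; the endpoint URL, model name, and `get_weather` tool are hypothetical.

```python
import json
from openai import OpenAI

# Hypothetical function-calling sketch; the endpoint, model name,
# and get_weather tool are illustrative, not taken from the article.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen3-30B-A3B",
    messages=[{"role": "user", "content": "What's the weather in Singapore?"}],
    tools=tools,
)

# If the model decides to call the tool, its name and JSON arguments
# come back in the tool_calls field of the response message.
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```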
With improved model architecture, larger training datasets, and better training methods, Qwen3 models are claimed to deliver outstanding performance on industry benchmarks such as AIME25 (mathematical reasoning), LiveCodeBench (coding), BFCL (tool and function-calling), and Arena-Hard (evaluation of instruction-tuned LLMs).
Notably, the hybrid reasoning capability was developed through a four-stage training process: a long chain-of-thought (CoT) cold start, reasoning-based reinforcement learning (RL), thinking-mode fusion, and general RL.
Qwen3 models are now freely available for download on Hugging Face, GitHub and ModelScope, and can be explored on chat.qwen.ai.
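For local experimentation, the weights can be fetched directly; a minimal sketch, assuming the `Qwen/Qwen3-0.6B` repository id on Hugging Face:

```python
from huggingface_hub import snapshot_download

# Download a full model snapshot; the repo id is assumed to follow
# the naming of the dense checkpoints listed in this article.
local_dir = snapshot_download(repo_id="Qwen/Qwen3-0.6B")
print(f"Model files saved to: {local_dir}")
```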
API access will soon be available through Alibaba's AI model development platform, Model Studio. Qwen3 also powers Alibaba's flagship AI super assistant application, Quark.
Since its debut, the Qwen model family has purportedly attracted over 300 million downloads worldwide. Developers have apparently created more than 100,000 Qwen-based derivative models on Hugging Face.