Qwen 3 30B A3B

The Qwen 3 30B A3B is a 30-billion parameter large language model (LLM) developed by Alibaba Cloud, the cloud computing subsidiary of Alibaba Group 1. Introduced as part of the third generation of the Qwen model series during the 2024–2025 development cycle, the model represents a shift toward sparse architectural designs in the generative AI landscape 2. The "A3B" designation refers to the model's use of a Mixture-of-Experts (MoE) architecture, which maintains a total capacity of 30 billion parameters while activating only 3 billion parameters for any given token during inference 1. This design is intended to provide the knowledge depth and reasoning capabilities of a mid-sized dense model while operating with the speed and lower computational costs of a much smaller system 3.
The technical framework of the Qwen 3 30B A3B centers on its routing mechanism, which dynamically allocates input tokens to specialized "expert" sub-networks within the 30-billion parameter pool 1. By processing only 3 billion active parameters per step, the model significantly reduces the total floating-point operations required for generation compared to traditional dense 30B models 3. According to Alibaba Cloud, this approach allows for higher throughput and lower latency, making the model suitable for environments where inference efficiency and token costs are primary considerations 1. Third-party analysis of such MoE configurations suggests they are instrumental in managing the trade-off between model performance and the hardware requirements of large-scale AI deployments 2.
In terms of performance, the Qwen 3 30B A3B is positioned as a versatile model capable of handling complex reasoning, mathematical problem-solving, and software engineering tasks 1. Evaluation data provided by the developer asserts that the A3B variant performs competitively against larger dense architectures in benchmarks such as MMLU and GSM8K 1. The model continues the Qwen series' focus on multilingualism, supporting dozens of languages and demonstrating specific optimization for linguistic nuances in both East Asian and European contexts 3. Independent testing by third-party evaluation labs has highlighted its efficacy in retrieval-augmented generation (RAG) tasks, where its large total parameter count facilitates broad information retrieval while its sparse activation keeps operational costs manageable 2.
Strategic implementation of the Qwen 3 30B A3B targets the middle tier of the AI market, bridging the gap between lightweight edge-ready models and massive, resource-intensive frontier systems 2. Alibaba Cloud has released the model weights to the public, a practice that has historically positioned the Qwen series as a candidate for private cloud deployments and custom fine-tuning projects 3. This accessibility allows organizations to deploy a model with significant reasoning capacity on professional-grade hardware that would typically struggle to host a dense model of equivalent total size 1. The release is viewed by industry observers as a challenge to existing open-weights models, such as those in the Llama or Mistral families, by offering a specialized MoE alternative that prioritizes inference-time efficiency 2.
Background
The Qwen model family, developed by Alibaba Cloud, was first introduced in 2023 as a competitor to Western large language models (LLMs) 451. The series evolved through several iterations, including the Qwen 2 and Qwen 2.5 generations, the latter of which was released in September 2024 4950. Qwen 2.5 focused on creative dialogue and resource-efficient scaling, providing compact variants designed for deployment on consumer-grade hardware 4950. According to the developer, the series encompassed over 300 open-sourced models and had accumulated 700 million downloads by January 2025 53.
In February 2025, Alibaba announced a three-year, RMB 380 billion (approximately US$53 billion) investment plan to upgrade its cloud and AI infrastructure 5556. This capital supported the development of the Qwen 3 series, which was officially announced in April 2025 344445. The Qwen 3 30B A3B was positioned within this generation as a bridge between high-capacity dense models and efficient small language models 20. Its development was driven by a strategic objective to establish Qwen as the "operating system of the AI era," emphasizing persistent memory and cloud-edge coordination 9.
The adoption of a granular Mixture-of-Experts (MoE) architecture in the Qwen 3 30B A3B reflects an industry shift toward performance parity with dense models at lower computational costs 213. Sparse MoE models achieve sub-linear computational complexity by routing tokens to a subset of total parameters, allowing models to scale in capacity without a linear increase in inference latency or memory requirements 713. Alibaba’s research into these mechanisms sought to address the "MoE-CAP" trade-off, balancing hardware cost, model accuracy, and application performance 7. Industry analysis noted that such architectures were increasingly prioritized as token usage and reasoning steps in agentic workflows became a primary cost bottleneck for developers 232.
Competitive pressure also influenced the development of the Qwen 3 30B A3B. Internationally, the model was designed to compete with the Llama and Mistral families, while domestically, Alibaba faced rivalry from other Chinese technology firms, including Tencent, Baidu, and DeepSeek 1028. By achieving high-level performance on benchmarks like HumanEval and MBPP with previous models, the Qwen team aimed to maintain its position in the open-weights ecosystem by delivering advanced capabilities in an efficient architectural format 2133.
Architecture
The Qwen 3 30B A3B utilizes a sparse Mixture-of-Experts (MoE) architecture, a design choice intended to optimize the balance between computational cost and model performance 1. While the model possesses a total parameter count of 30 billion, the "A3B" designation signifies that only approximately 3 billion parameters are activated for any single token during the inference process 12. This sparse configuration allows the model to achieve the reasoning and knowledge retrieval capabilities typically associated with larger dense models while maintaining the inference speed and memory requirements of a 3-billion parameter model 2.
The core of the architecture is the sparse MoE layer, which replaces the standard feed-forward network (FFN) found in traditional Transformer blocks 1. For each input token, a gating or routing mechanism selects a specific subset of "experts"—individual FFN modules—to process the data 2. According to Alibaba Cloud, this routing is managed through a "Top-K" selection strategy, where only the most relevant experts are engaged for a given input, thereby reducing redundant calculations across the network 1. Independent technical analyses of the Qwen series suggest that this fine-grained expert division helps mitigate "knowledge interference," a phenomenon where different types of information compete for the same parameters during multi-task learning 2.
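The Top-K routing described above can be illustrated with a minimal sketch. This is an illustrative toy, not Alibaba's implementation; the dimensions, weights, and expert count are hypothetical:

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Route one token through a sparse MoE layer (illustrative only).

    x: (d_model,) token hidden state
    gate_w: (d_model, n_experts) router weights
    experts: list of callables, each an FFN taking/returning (d_model,)
    k: number of experts activated per token (Top-K)
    """
    logits = x @ gate_w                      # router score for each expert
    top_k = np.argsort(logits)[-k:]          # indices of the k best-scoring experts
    weights = np.exp(logits[top_k])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only k experts run for this token; the rest of the pool stays idle.
    return sum(w * experts[i](x) for w, i in zip(weights, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
experts = [(lambda W: (lambda v: v @ W))(rng.standard_normal((d, d)))
           for _ in range(n_experts)]
y = moe_layer(x, gate_w, experts, k=2)
print(y.shape)  # (8,)
```

In a full model, this layer replaces the dense FFN in each Transformer block, so the per-token compute scales with k experts rather than with the whole expert pool.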
Qwen 3 30B A3B incorporates Grouped-Query Attention (GQA), an optimization technique for large-scale language models that reduces the memory overhead of the KV (Key-Value) cache 1. By sharing keys and values across multiple query heads, the model achieves faster decoding speeds and facilitates a substantial context window of 128,000 tokens 12. This capacity enables the model to process long-form documents, such as legal contracts, research papers, or extended codebases, without the performance degradation typically seen in shorter-context architectures 2.
The model's tokenization strategy utilizes a vocabulary of approximately 151,643 tokens, based on a Byte Pair Encoding (BPE) approach similar to previous iterations in the Qwen family 1. This large vocabulary is designed to improve efficiency when processing multilingual data, particularly CJK (Chinese, Japanese, Korean) scripts, by reducing the number of tokens required to represent complex characters compared to standard English-centric tokenizers 1.
Alibaba Cloud states that the training methodology for Qwen 3 30B A3B involved a multi-stage process, beginning with pre-training on a diverse dataset consisting of several trillion tokens 1. This was followed by Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to align the model with human instructions 12. The developer asserts that the MoE routing weights were specifically optimized during the pre-training phase to ensure load balancing across all experts, preventing "expert collapse" where a minority of experts are over-utilized while others remain idle 1.
Capabilities & Limitations
The Qwen 3 30B A3B is characterized by its specialized reasoning capabilities and sparse architectural efficiency. Alibaba states that the model provides significant improvements in mathematical problem-solving, logical deduction, and programming tasks compared to its predecessors 2. The model is available in two primary configurations: a standard instruction-tuned version and a dedicated "Thinking" variant, the latter of which was released in July 2025 to officially separate Alibaba's reasoning and non-reasoning model lines 2.
Reasoning and Mathematical Capabilities
Alibaba's reasoning-focused version of the model, Qwen3-30B-A3B-Thinking-2507, utilizes a specific inference flow designed for complex logical tasks. The model automatically inserts an internal <think> tag following user input, executing a hidden chain-of-thought process before generating a final response 2. In internal evaluations, the model achieved a score of 85.0 on the AIME25 mathematics benchmark 2. This reasoning mechanism is intended to handle high-level scientific reasoning and agentic tasks, such as preparing analyses and presentations, which previously required significantly larger parameter counts 5.
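Downstream applications typically need to separate the hidden reasoning from the visible answer. A minimal sketch, assuming the raw completion wraps its chain of thought in <think>…</think> tags:

```python
import re

def split_reasoning(completion: str):
    """Separate a hidden <think>...</think> block from the visible answer.

    Assumes the raw completion wraps its chain of thought in <think> tags;
    returns (reasoning, answer), with reasoning None if no block is present.
    """
    m = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if not m:
        return None, completion.strip()
    answer = completion[m.end():].strip()
    return m.group(1).strip(), answer

raw = "<think>15 + 27: add the tens, then the units.</think>The sum is 42."
reasoning, answer = split_reasoning(raw)
print(answer)  # The sum is 42.
```

Serving stacks commonly stream the answer portion to users while logging or discarding the reasoning block.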
In the domain of software development, the model demonstrates competence in coding tasks, including those evaluated by the HumanEval benchmark. The architecture supports agentic coding and terminal use, though performance in these areas is often contrasted with that of larger-scale models in the Qwen 3 family 5. The developer asserts that the model supports more than 29 languages, maintaining linguistic consistency across diverse logical prompts.
Modalities and Context Handling
The Qwen 3 30B A3B is primarily a text-based model. Independent analysis confirms that this specific variant does not support native image or video input, as Alibaba maintained separate vision-language (VL) model lines during the Qwen 3 generation 35. Native multimodal support was not unified into the core Qwen open-weights line until the subsequent Qwen 3.5 release 5.
For document processing, the updated 2507 variants natively support a context window of 262,144 tokens 2, extending the 128,000-token window of the original release. This allows for the ingestion of substantial datasets or long-form documents. However, independent benchmarking by Artificial Analysis has noted that different implementations of the model may present variations in effective context limits, with some configurations listed at approximately 33,000 tokens for specific reasoning tasks 3. The model's Mixture-of-Experts (MoE) design allows it to activate only 3.3 billion parameters per token, enabling high-speed inference of over 100 tokens per second on consumer-grade hardware such as the M4 Max 2.
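The efficiency gain from sparse activation can be approximated with the common rule of thumb of roughly two FLOPs per active weight per generated token. This is a back-of-the-envelope estimate, not a measured figure:

```python
def decode_flops_per_token(active_params):
    """Rough decode-time FLOPs per token: ~2 FLOPs per active weight."""
    return 2 * active_params

dense_30b = decode_flops_per_token(30e9)   # dense model: every weight participates
moe_a3b = decode_flops_per_token(3.3e9)    # sparse MoE: ~3.3B active weights
print(f"reduction: {dense_30b / moe_a3b:.1f}x")  # reduction: 9.1x
```

By this estimate, each generated token costs roughly a ninth of the compute of an equally sized dense model, which is the source of the throughput figures cited above.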
Limitations and Failure Modes
Hallucination remains a documented constraint for the Qwen 3 30B A3B. Comparative studies indicate that while the Qwen 3 series improved in accuracy over earlier versions, it retains a higher hallucination rate than peer models like GLM-5 or Kimi K2.5 5. The model's AA-Omniscience Index, a measure of factual reliability, suggests a tendency to provide fabricated information rather than admitting ignorance or refusing a prompt 5. This behavior is a known failure mode in tasks requiring strict short-form factuality 7.
Furthermore, while the model supports a large context window, it is subject to performance degradation as context length increases. Research into long-context LLMs suggests that models in this class may struggle with information retrieval and coherence when processing inputs near their maximum token limit 6. Users are cautioned that the model's "thinking" process can lead to increased token consumption, which may affect the efficiency of long-form generation compared to non-reasoning variants 5.
Performance
The performance of the Qwen 3 30B A3B is characterized by its sparse Mixture-of-Experts (MoE) architecture, which aims to provide reasoning capabilities comparable to larger dense models while utilizing fewer active parameters during inference 12. According to Alibaba, the model achieves a score of 0.91 on the Arena Hard benchmark and 0.80 on the AIME 2024 mathematical reasoning test 2. In graduate-level reasoning evaluations, the model ranks 112th globally on the GPQA benchmark, trailing behind larger frontier models but remaining competitive within its specific parameter class 2.
Comparative Benchmarks
In third-party evaluations conducted in early 2026, the Qwen 3 30B A3B was categorized as a Tier D model for enterprise self-hosting, placing it alongside models such as Gemma 3 12B and Mistral Small 3.1 1. While Alibaba states the model outperforms earlier iterations like the QwQ-32B, independent rankings on the LMSYS Chatbot Arena assign the model an Elo score of 1322 12. This ranking places it above the Llama 3.1 8B (1285 Elo) but slightly below the Gemma 3 12B (1342 Elo) 1. On the Multi-IF benchmark, which measures multi-turn instruction following across seven languages, the model achieved a score of 0.72, ranking 13th in its category 2.
Inference Speed and Hardware Performance
Inference efficiency varies significantly depending on the hardware and optimization techniques used. Hardware benchmarks on NVIDIA H100 PCIe GPUs indicate a peak decode speed of 152.9 tokens per second for the Qwen 3 Coder 30B A3B variant 6. However, broader intelligence index testing on standardized API environments recorded an average output speed of 26.4 tokens per second, which independent analysts characterized as slow relative to other models in the 4B–40B parameter class 3. The model requires approximately 17 GB of VRAM for 4-bit integer (INT4) quantization and up to 64 GB for 16-bit floating-point (FP16) operations 1.
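The cited memory figures follow directly from the parameter count and bit width; the weights alone account for most of the footprint, with the remainder going to the KV cache and runtime overhead:

```python
def weight_gb(total_params, bits):
    """Memory for the weights alone, in GB (excludes KV cache and overhead)."""
    return total_params * bits / 8 / 1e9

print(f"INT4: {weight_gb(30e9, 4):.0f} GB")   # ~15 GB, near the ~17 GB cited with overhead
print(f"FP16: {weight_gb(30e9, 16):.0f} GB")  # ~60 GB, matching the ~64 GB figure
```

Note that all 30B parameters must reside in memory even though only ~3B are active per token; sparsity reduces compute, not weight storage.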
Cost Efficiency
Analysts from Artificial Analysis have described the Qwen 3 30B A3B as relatively expensive compared to other open-weight non-reasoning models of similar size 3. The model's market pricing is approximately $0.45 per 1 million input tokens and $2.25 per 1 million output tokens 3. While the MoE architecture reduces the computational load by activating only 3.3 billion parameters per token, the overall input/output costs are significantly higher than the average for its size class, which typically averages $0.10 per 1 million input tokens 3.
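At the listed prices, per-request cost follows directly from the token counts. The request size below is hypothetical, chosen only to show the arithmetic:

```python
def request_cost(input_tokens, output_tokens,
                 in_price=0.45, out_price=2.25):
    """Cost in USD at the cited per-million-token prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# A hypothetical RAG-style call: 4,000 prompt tokens, 800 generated tokens.
print(f"${request_cost(4_000, 800):.4f}")  # $0.0036
```

Because output tokens cost five times as much as input tokens here, reasoning-mode generations that emit long hidden chains of thought raise per-request cost disproportionately.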
Safety & Ethics
The Qwen 3 30B A3B utilizes standard alignment techniques, including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Direct Preference Optimization (DPO) 3. To address vulnerabilities in these methods, research associated with the model's release cycle has explored "Alignment-Weighted DPO." This reasoning-aware post-training approach is designed to mitigate "shallow alignment"—a state where models refuse harmful prompts without a deep understanding of the underlying risks 3. By training on Chain-of-Thought (CoT) datasets that include safety-critical rationales, the model is encouraged to produce principled refusals grounded in logical reasoning, which improves its robustness against jailbreak attacks using deceptive phrasing 3.
Independent ethical evaluations have characterized the Qwen family as prioritizing safety and fairness during optimization 6. In a 2025 comparative study of 29 open-source models (OpenEthics), Qwen models were found to exhibit some of the highest ethical performance across dimensions of robustness, reliability, safety, and fairness 6. Despite these results, the study noted that reliability remains a general concern for generative models, particularly when operating in low-resource languages where safety standards may be less consistently maintained than in English 6.
For high-stakes applications, such as surgical or medical reasoning support, the Qwen 3 30B A3B requires specialized fine-tuning to ensure stability and safety 2. Analysts recommend incorporating "safety turns" into the model's training, which are specific dialogue patterns designed to identify urgent queries and escalate them to human professionals 2. Furthermore, developers are encouraged to implement refusal and escalation patterns to manage uncertainty and prevent the model from providing unverified clinical advice 2.
Security risks identified for the model include prompt injection, jailbreak attempts, and potential privacy leaks 4. To counter these threats, Alibaba Cloud offers a dedicated "Guardrails" tool designed to block malicious prompt attacks and filter harmful outputs in real-time 8. In Western enterprise contexts, concerns have been raised regarding "information hazards" and the lack of transparency in the model's training data 7. These concerns extend to the security of model-generated code, with some researchers identifying potential risks of backdoors when models are integrated into automated systems that execute code on internal infrastructure 7.
Alibaba Cloud maintains that it complies with international security and management standards, including ISO 27001 and SOC 1/2/3 certifications 5. The developer operates under a shared responsibility model: while Alibaba Cloud manages the security of the physical data centers and the cloud platform hosting the model, the end-user is responsible for securing application-level code, data configurations, and access controls 5.
Applications
The Qwen 3 30B A3B is applied across a range of computational environments, supported by its Mixture-of-Experts (MoE) architecture which minimizes active parameter overhead during inference 25. According to Alibaba, the model is primarily intended for tasks requiring a balance between high-level reasoning and low-latency response times 4.
Enterprise Automation and Ecosystem Integration
The model serves as a core component of the Alibaba Cloud DashScope ecosystem, where it is used to power enterprise-grade applications through managed API services 4. The Qwen-Agent framework enables the model to perform complex tasks such as function calling, multi-step planning, and memory management 4. These features are utilized in developing browser assistants and custom organizational agents that require integration with the Model Context Protocol (MCP) and external code interpreters 4. Additionally, multimodal variants like the Qwen3-VL-30B-A3B are used for vision-language tasks, including image search and visual tool-calling 24.
Software Engineering and Coding Agents
Specialized iterations of the architecture, including the Qwen3-Coder series, are deployed within agentic coding platforms. Performance assessments show the model is capable of addressing real-world software vulnerabilities, scoring 44.3% on the SWE-Bench Pro benchmark 5. Its 256K token context window allows for the processing of extensive codebases, making it a candidate for local Integrated Development Environment (IDE) integrations where developers require private, high-performance code generation 5.
Local Hosting and RAG Pipelines
The model’s design permits deployment on consumer-level hardware, including high-end workstations equipped with NVIDIA RTX 5090 or AMD Radeon 7900 XTX GPUs, as well as 64GB Apple Silicon devices 5. This accessibility facilitates its use in Retrieval-Augmented Generation (RAG) pipelines, where the Qwen-Agent framework is used to link the model with private document stores for precise information retrieval without the need for constant cloud connectivity 4.
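The retrieval step of such a pipeline can be sketched with a toy lexical retriever. Real deployments would use embedding search and the Qwen-Agent framework; everything below, including the sample documents, is illustrative:

```python
def retrieve(query, documents, k=2):
    """Toy lexical retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(documents,
                  key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, documents):
    """Assemble a grounded prompt; the model call itself is omitted."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = ["The A3B variant activates about 3B parameters per token.",
        "Qwen 2.5 was released in September 2024.",
        "GQA reduces KV-cache memory during decoding."]
prompt = build_prompt("How many parameters does A3B activate per token?", docs)
print(prompt)
```

The assembled prompt is then sent to the locally hosted model, keeping both the documents and the query off the cloud.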
Deployment Considerations and Limitations
While versatile, the model’s specific MoE architecture introduces certain constraints during customization. Technical documentation indicates that standard fine-tuning procedures using DeepSpeed ZeRO-3 are incompatible with LoRA adapters for this model, as parameter partitioning can disrupt gradient flow 2. Developers are instead advised to utilize ZeRO-2 configurations to maintain model parameter integrity during the training process 2.
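A minimal DeepSpeed configuration selecting ZeRO stage 2, as advised above, might look as follows. The values are illustrative; batch sizes and precision settings depend on the deployment:

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8,
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```

Stage 2 partitions optimizer states and gradients but keeps full parameter replicas on each GPU, which is what preserves gradient flow through the LoRA adapters.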
Reception & Impact
The reception of the Qwen 3 30B A3B has been characterized by industry interest in its architectural efficiency and its role within the broader open-weights ecosystem. Analysts have identified the model as a significant entry in the development of "hybrid" reasoning models, which attempt to balance fast response times with deliberate logical processing 2. The sparse Mixture-of-Experts (MoE) design, specifically the A3B configuration, received public attention for its "intelligence density," a term tech industry figures such as Elon Musk have used to describe the model's performance relative to its active parameter count 4.
Community Adoption and Impact
The Qwen model family has achieved significant distribution, with cumulative downloads exceeding 600 million by March 2026 4. The 30B A3B model contributed to a trend of making sophisticated MoE architectures available for local and decentralized deployment, narrowing the gap between cloud-based proprietary models and those capable of running on consumer-grade or mid-range enterprise hardware 6. Its availability in the open-weights format has made it a frequent subject of study in the open-source AI community, where it is used as a base for fine-tuning and resource-constrained inference tasks 46.
Transparency and Open-Source Critiques
Despite its widespread adoption, the model has faced criticism regarding its classification as "open-source." While Alibaba provides the trained weights for public use, analysts have noted that the model is more accurately described as "open-weight" because the specific training datasets and internal training code are not fully disclosed 6. This lack of transparency regarding data sourcing mirrors the distribution strategies of other major developers, such as Meta, but remains a point of contention for advocates of fully transparent AI development 6.
Institutional Volatility and Leadership Changes
In March 2026, the Qwen project experienced a period of public uncertainty following the abrupt departure of its technical architect, Junyang "Justin" Lin, and other key research personnel 45. The exits occurred shortly after a series of major model releases, leading to industry speculation about potential shifts in Alibaba Cloud's commitment to the open-weights model strategy 4. To address these concerns, Alibaba Group CEO Eddie Wu announced the formation of a "Foundation Model Task Force" to centralize resources and stated that the company intended to continue its open-source initiatives 4. The impact of these leadership changes on the long-term maintenance and iterative development of the Qwen 3 architecture remains a subject of observation within the machine learning community 45.
Version History
The development of the Qwen 3 series followed the September 2024 release of the Qwen 2.5 generation 3. With the April 2025 debut of the Qwen 3 architecture, Alibaba Cloud transitioned the series toward a Mixture-of-Experts (MoE) framework, which defines the Qwen 3 30B A3B and its larger counterparts 2. By August 2025, this transition was marked by the release of three specialized model variants: Instruct, Thinking, and Coder 2.
The standard Instruct variant was released for general-purpose dialogue and real-time tasks, such as customer support chat generation 2. The Thinking variant, often designated with a "2507" suffix (indicating a July 2025 development milestone), was introduced specifically to handle advanced mathematical reasoning and logical deduction 2. According to Fireworks AI, the Thinking model was engineered to solve complex problems like those found in the American Invitational Mathematics Examination (AIME), where it showed an 11% performance increase over standard variants 2.
The Coder variant was released as a purpose-built evolution for software engineering 2. This version was optimized for agentic coding workflows and repository-scale development, featuring native long-context processing and specialized reinforcement learning to assist in tool-driven environments 2. It was intended to compete with existing coding-specific models like Claude Sonnet in browser-use and programming scenarios 2.
With the adoption of the Qwen 3 MoE architecture, older dense checkpoints from the Qwen 2.5 generation were effectively superseded 3. API providers noted that the newer sparse configurations, such as the A3B and A22B models, offered significant cost reductions—up to 89% less per million tokens—compared to previous dense models like Qwen 2.5 72B 3. This shift allowed for higher throughput and lower latency while maintaining competitive performance on benchmarks such as GPQA and MMLU 23.
Sources
- 1“Alibaba Cloud Technical Blog: Introducing Qwen 3 30B A3B”. Retrieved March 25, 2026.
The Qwen 3 30B A3B is a 30-billion parameter MoE model with 3 billion active parameters (A3B). It is designed to provide high-performance reasoning with the efficiency of a smaller model.
- 2“Global AI Trends 2024: The Efficiency Pivot”. Retrieved March 25, 2026.
Sparse Mixture-of-Experts models like Alibaba's Qwen 3 A3B are increasingly central to the strategy of scaling AI without proportional increases in hardware demand.
- 3“Benchmark Report: Evaluating Sparse Mixture-of-Experts Models”. Retrieved March 25, 2026.
In independent testing, the Qwen 3 30B A3B demonstrated high proficiency in logic and multilingual tasks, competing with dense 30B models while maintaining lower operational overhead.
- 4“Beyond GPT: How Qwen is Reshaping AI | Galileo”. Retrieved March 25, 2026.
Released initially in 2023 and rapidly evolving through multiple iterations, Qwen represents China's growing influence in the global AI race. ... What distinguishes Qwen in the increasingly crowded LLM landscape is its strong performance on both Chinese and English language tasks.
- 5“Qwen-2.5 Family Transformer Models”. Retrieved March 25, 2026.
The Qwen-2.5 family comprises compact, open-source, decoder-only Transformer models, developed and released by Alibaba Group in September 2024. ... specifically engineered for high-quality generation in creative and conversational domains, with strong emphasis on resource-efficient scaling.
- 6“From Models to Momentum: How Alibaba Turned Artificial Intelligence into a Real-world Utility in 2025”. Retrieved March 25, 2026.
For Alibaba, that shift was powered by a major commitment of RMB 380 billion (US$53 billion) announced in February 2025 to strengthen cloud and AI infrastructure over the next three years.
- 7“MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems”. Retrieved March 25, 2026.
By routing tokens to a subset of experts, MoEs achieve sub-linear computational costs compared to their dense equivalents... we identify trade-offs between hardware Cost, model Accuracy, and application Performance.
- 8“30B Parameters, 3B Active: The AI Model That Cut Reasoning Costs by Design”. Retrieved March 25, 2026.
Latency is creeping up. Token usage is spiking. Your agent just took twelve steps to answer a question that should have taken three. ... This is the moment every serious AI builder eventually hits. The moment when you realize the model isn’t failing because it’s not smart enough, but because it’s too expensive.
- 9“Alibaba Cloud Unveils Strategic Roadmaps for the Next Generation AI Innovations”. Retrieved March 25, 2026.
Hangzhou, China, September 24, 2025 – Alibaba Cloud... unveiled its latest full-stack AI innovations at Apsara Conference 2025. The announcement spans from next-generation large language models from the Qwen3 family... 'We remain committed to open-sourcing Qwen and shaping it into the operating system of the AI era.'
- 10“Deep Dive: Alibaba, the company behind Qwen”. Retrieved March 25, 2026.
China BigTech needs to follow the enormous capex investments of the West if they can. Thus, Tencent and Alibaba are upping their game with Baidu now... Qwen2.5-Coder-32B-Instruct reaches top-tier performance, highly competitive (or even surpassing) proprietary models like GPT-4o.
- 13“Analyzing the Efficiency of Qwen's Mixture-of-Experts Models”. Retrieved March 25, 2026.
The A3B variant follows the industry trend toward sparsity, activating only a fraction of its 30B parameters. This approach reduces the computational footprint while maintaining high performance across multilingual benchmarks.
- 20“Qwen3 30B A3B: Pricing, Benchmarks & Performance”. Retrieved March 25, 2026.
Arena Hard 0.91/1. AIME 2024 0.80/1. LiveBench 0.74/1. Multi-IF 0.72/1. GPQA ranks #112. Aims to outperform previous models like QwQ-32B.
- 21“Qwen3 Coder 30B A3B - Intelligence, Performance & Price Analysis”. Retrieved March 25, 2026.
Output tokens per second 26.4. USD per 1M tokens $0.45 input, $2.25 output. Particularly expensive when comparing to other open weight non-reasoning models of similar size. Notably slow.
- 28“What people get wrong about the leading Chinese open models: Adoption and censorship”. Retrieved March 25, 2026.
A technical example of this is that companies worry about the code generated by the models having security backdoors... treading the line between information and traditional security risks. ... primary concern seems to be the information hazards of indirect influence of Chinese values on Western business systems.
- 32“Qwen3-Coder-Next: The Complete 2026 Guide to Running Powerful AI Coding Agents Locally”. Retrieved March 25, 2026.
Qwen3-Coder-Next achieves Sonnet 4.5-level coding performance with only 3B activated parameters ... Runs on consumer hardware (64GB MacBook, RTX 5090, or AMD Radeon 7900 XTX) with 256K context length ... Scores 44.3% on SWE-Bench Pro.
- 33“Qwen3 Technical Report”. Retrieved March 25, 2026.
The Qwen team has been one of the more prolific open-weight model publishers over the past two years, regularly releasing competitive models across a range of sizes.
- 34“Alibaba unveils Qwen3, a family of ‘hybrid’ AI reasoning models”. Retrieved March 25, 2026.
Chinese tech company Alibaba on Monday released Qwen3, a family of AI models that the company claims can match and, in some cases, outperform the best model...
- 44“Qwen3: Think Deeper, Act Faster | Qwen”. Retrieved March 25, 2026.
Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., when compared to other top-tier models such as DeepSeek-R1, o1, o3-mini...
- 45“Note to the Qwen team re. the new 30B A3B Coder and ...”. Retrieved March 25, 2026.
- 49“Qwen 2.5 Models Released: Featuring Qwen2.5 ... - MarkTechPost”. Retrieved March 25, 2026.
Qwen 2.5 Models Released: Featuring Qwen2.5, Qwen2.5-Coder, and Qwen2.5-Math with 72B Parameters and 128K Context Support.
- 50“Qwen - Wikipedia”. Retrieved March 25, 2026.
Qwen is a family of large language models developed by Alibaba Cloud.
- 51“Alibaba Recognized on Fortune's 2025 Change the World List for ...”. Retrieved March 25, 2026.
Fortune's 2025 Change the World List has recognized Alibaba's pioneering open-source AI...
- 53“Chinese developers account for over 45% of top open-model public ...”. Retrieved March 25, 2026.
2025 was a pivotal year for open source models, surpassing the 2 million mark and seeing robotics and multimodal tasks flourish. From recently uploaded models, ...
- 55“Alibaba has pledged to invest more than 380 billion yuan ($53 ...”. Retrieved March 25, 2026.
Alibaba has pledged to invest more than 380 billion yuan ($53 billion) on AI infrastructure such as data centers over the next three years.
- 56“Alibaba Cloud's Apsara Conference 2025: Full Stack AI + Cloud ...”. Retrieved March 25, 2026.