Qwen3 Coder 480B A35B Instruct

Qwen3  Coder 480B A35B Instruct is the flagship release in Alibaba’s latest coding‑oriented model family. It uses a Mixture‑of‑Experts architecture with a colossal 480 billion parameters while activating only 35 billion per token, giving top‑tier quality at a realistic inference cost. Native context length reaches 256 K tokens and can be stretched to roughly one million tokens through YaRN‑style extrapolation, so the model comfortably ingests entire repositories, lengthy pull requests, and complex trace logs. Benchmark results place it at state‑of‑the‑art among open models for agentic coding, browser use, and tool use, roughly matching proprietary leaders such as Claude Sonnet 4.

The team scaled every stage of learning to turn raw capacity into reliable skill. Pre‑training swallowed 7.5 trillion tokens, about seventy percent of which is curated code, preserving the reasoning strengths of the broader Qwen3 backbone while specialising in software engineering. Post‑training adds two reinforcement‑learning phases. “Code RL” expands unit‑test suites automatically, rewarding solutions that compile and execute, which sharply raises pass rates. A second “Agent RL” phase teaches long‑horizon decision‑making on tasks like SWE‑Bench by running twenty thousand parallel environments on Alibaba Cloud, letting the model plan, call tools, receive feedback, and iterate until bugs are fixed, yielding state‑of‑the‑art scores without any test‑time tricks.

Developers can adopt Qwen3 Coder immediately. The open‑source Qwen Code CLI (a fork of Gemini Code) wraps tailored prompts and function calls to unlock agentic workflows, and the model also slots into Claude Code or any OpenAI‑compatible client by pointing to DashScope’s endpoint. An API key from Alibaba Cloud Model Studio is all that is required. More model sizes are promised, along with research into self‑improving agents, signalling a practical step toward autonomous coding assistance at industrial scale.