
Qwen3-32B

qwen/qwen3-32b

Qwen3-32B is a dense 32.8-billion-parameter model, positioned as the high-accuracy single-expert counterpart to Qwen3's MoE line. It stacks 64 transformer layers and uses grouped-query attention (GQA) with 64 query heads and 8 key/value heads, with a native 32,768-token context window (extendable to 131,072 tokens via YaRN). Because every parameter is active on each forward pass, it excels at deterministic generation, agentic tool calling, and creative writing, where dense representations can outperform similarly sized MoE peers. It is drop-in compatible with Hugging Face Transformers ≥ 4.51, vLLM, SGLang, and common GGUF/MLX-LM ports.
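Since the model supports function calling through OpenAI-compatible serving stacks such as vLLM and SGLang, a request can expose tools in the standard chat-completions format. The sketch below only builds the request body; the `get_weather` tool is a hypothetical example, not part of any official schema.

```python
import json

def build_tool_call_request(user_message: str) -> dict:
    """Build an OpenAI-style chat-completions payload exposing one tool."""
    return {
        "model": "qwen/qwen3-32b",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool for illustration
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "max_tokens": 8192,  # matches the model's max output tokens
    }

payload = build_tool_call_request("What's the weather in Hangzhou?")
print(json.dumps(payload, indent=2))
```

When the model decides to use the tool, the response carries a `tool_calls` entry with JSON arguments instead of plain text, which the caller executes and feeds back as a `tool` message.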

Tools

Function Calling

Context Window

32,768

Max Output Tokens

8,192

| Provider  | Input Token Price      | Output Token Price     |
|-----------|------------------------|------------------------|
| DashScope | $0.30 / Million Tokens | $3.00 / Million Tokens |
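The listed rates make per-request cost estimation straightforward. A minimal sketch, using only the DashScope prices above (the function name is an illustrative assumption):

```python
# DashScope rates from the table above.
INPUT_PRICE_PER_M = 0.30   # USD per million input tokens
OUTPUT_PRICE_PER_M = 3.00  # USD per million output tokens

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# A request filling the full 32,768-token context and emitting the
# maximum 8,192 output tokens:
print(round(estimate_cost_usd(32_768, 8_192), 4))  # → 0.0344
```

Because output tokens cost 10x input tokens here, long completions dominate the bill even when the prompt uses the whole context window.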