Qwen
Qwen3-30B-A3B
qwen/qwen3-30b-a3b
Qwen3-30B-A3B is the “Pro” MoE tier, designed to balance cost against quality. It contains 30.5 billion total parameters but activates only ~3.3 billion per token at inference, yielding roughly GPT-3.5-class quality at a fraction of the compute of a comparably capable dense model. The MoE configuration mirrors the 235B flagship (128 experts, 8 activated per token), but the model runs 48 transformer layers with grouped-query attention (32 query heads, 4 key/value heads). It retains the 32K-token native context window, YaRN support for context extension, and the same controllable thinking switch that lets developers trade raw reasoning traces for lower latency.
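The thinking switch is toggled per request. A minimal sketch below uses Hugging Face transformers with the `enable_thinking` flag, following the usage documented in the Qwen3 model card; the prompt is illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-30B-A3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are below 100?"}]

# enable_thinking=True makes the model emit a <think>...</think> reasoning
# trace before its answer; False skips the trace for lower latency.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

Hosted providers may expose the same switch differently (e.g., a request-level parameter), so check the endpoint's documentation before relying on a specific field name.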
Tools: Function Calling
Context Window: 32,768 tokens
Max Output Tokens: 8,192
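Function calling works through the standard OpenAI-compatible `tools` parameter when the model is served behind such an endpoint. The sketch below is a hedged example: the base URL, API key, and `get_weather` tool schema are illustrative assumptions, not part of this listing.

```python
from openai import OpenAI

# Hypothetical OpenAI-compatible endpoint; substitute your provider's values.
client = OpenAI(base_url="https://example-provider.com/v1", api_key="YOUR_KEY")

# Illustrative tool definition using the standard JSON Schema format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="qwen/qwen3-30b-a3b",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)

# If the model decides to call the tool, the arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```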