Kimi K2 Instruct | ModelBox

Kimi K2 Instruct is Moonshot AI’s flagship open-source large-language model, unveiled in July 2025. It employs a Mixture-of-Experts design that activates 32 billion parameters out of a one-trillion-parameter pool, achieving GPT-4-class capacity while retaining efficiency for fine-tuning and serving on multi-GPU nodes. Trained on 15.5 trillion tokens with the Muon optimizer and then rigorously instruction-tuned, the model excels at knowledge work, multi-step reasoning, coding, and autonomous tool use. Early benchmarks place K2 at or above GPT-4 on several enterprise-relevant tasks and comfortably ahead of other open-source peers such as DeepSeek-V3. Moonshot positions this release as a strategic move to bolster its standing in China’s competitive AI landscape.

A standout feature is the 128 000 token context window, enabling the ingestion of entire codebases, lengthy research papers, or days-long chat histories in a single request. The architecture includes 384 experts, eight of which are selected per token, and leverages Multi-Head Linear Attention to keep latency manageable even with long sequences. These choices allow K2 to maintain strong performance across both English and Chinese benchmarks while remaining computationally tractable.

Developers can experiment immediately via the Hugging Face repository, OpenRouter, or Moonshot’s native API. Deployment guides detail setups ranging from eight-GPU clusters to a single AMD MI300X server using eight-way tensor parallelism. The Apache-2.0 licence permits commercial use, making Kimi K2 Instruct an accessible foundation for building high-capacity agentic applications, autonomous code assistants, and retrieval-augmented systems.

Provider	Input Token Price	Output Token Price
Groq	$1.00/Million Tokens	$3/Million Tokens
Moonshot	$0.60/Million Tokens	$2.5/Million Tokens
novita	$0.57/Million Tokens	$2.3/Million Tokens