DeepSeek

DeepSeek Reasoner (R1)

deepseek/deepseek-reasoner

The first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1, were introduced to advance reasoning capabilities. DeepSeek-R1-Zero, developed using large-scale reinforcement learning (RL) without prior supervised fine-tuning (SFT), displayed impressive reasoning performance. Through RL, it naturally acquired a range of powerful and intriguing reasoning behaviors. However, DeepSeek-R1-Zero faced challenges such as repetitive outputs, poor readability, and language mixing. To address these limitations and further improve reasoning capabilities, DeepSeek-R1 was developed, incorporating cold-start data before RL. DeepSeek-R1 demonstrated performance on par with OpenAI-o1 across tasks involving mathematics, coding, and reasoning. To foster progress within the research community, DeepSeek-R1-Zero, DeepSeek-R1, and six distilled dense models based on Llama and Qwen were open-sourced. Among them, DeepSeek-R1-Distill-Qwen-32B surpassed OpenAI-o1-mini on various benchmarks, setting new performance standards for dense models.

Community

Open Source

Context Window

128,000

Max Output Tokens

8,000

Using DeepSeek Reasoner (R1) with the Python API

Using DeepSeek Reasoner (R1) with an OpenAI-compatible API

import openai

client = openai.OpenAI(
    api_key="{your_api_key}",
    base_url="https://api.model.box/v1",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-reasoner",
    messages=[
        {
            "role": "user",
            "content": "introduce yourself",
        },
    ],
)
print(response.choices[0].message.content)