OpenAI
OpenAI

OpenAI GPT5 Nano

openai/gpt-5-nano

GPT-5-Nano pushes optimisation further for situations where every millisecond and cent count—voice assistants, real-time fraud filters, IoT gateways, or background content moderation. It retains the same enormous 400 k context window and tool-calling APIs but runs an aggressively quantised expert mix that triples tokens-per-second versus the flagship. Despite its size, Nano still scores 85.2 % on AIME 2025 and 75.6 % on MMMU multimodal reasoning, handily beating GPT-4.1-mini in code-generation and logic while using a fraction of the compute. Developers can combine Nano with the new minimal reasoning mode for sub-150 ms round-trips, yet fall back to higher effort on the same endpoint if deeper thought is occasionally required. At US $0.05 / M input tokens and US $0.40 / M output tokens, Nano unlocks billion-call workloads and on-device inference pilots without budget strain, all while benefiting from GPT-5’s upgraded safety guardrails and custom grammar-bound tool calls.

Capability

Vision Support

Tools

Function Calling

Context Window

128,000

Max Output Tokens

32,768

Using OpenAI GPT5 Nano with Python API

Using OpenAI GPT5 Nano with OpenAI compatible API

import openai

client = openai.Client(
  api_key= '{your_api_key}',
  base_url="https://api.model.box/v1",
)
response = client.chat.completions.create(
model="openai/gpt-5-nano",
messages: [
  {
    role: 'user',
    content:
      'introduce your self',
    },
  ]
)
print(response)