OpenAI GPT5 Nano
GPT-5-Nano pushes optimisation further for situations where every millisecond and cent count—voice assistants, real-time fraud filters, IoT gateways, or background content moderation. It retains the same enormous 400 k context window and tool-calling APIs but runs an aggressively quantised expert mix that triples tokens-per-second versus the flagship. Despite its size, Nano still scores 85.2 % on AIME 2025 and 75.6 % on MMMU multimodal reasoning, handily beating GPT-4.1-mini in code-generation and logic while using a fraction of the compute. Developers can combine Nano with the new minimal reasoning mode for sub-150 ms round-trips, yet fall back to higher effort on the same endpoint if deeper thought is occasionally required. At US $0.05 / M input tokens and US $0.40 / M output tokens, Nano unlocks billion-call workloads and on-device inference pilots without budget strain, all while benefiting from GPT-5’s upgraded safety guardrails and custom grammar-bound tool calls.
Capability
Vision Support
Tools
Function Calling
Context Window
128,000
Max Output Tokens
32,768
Using OpenAI GPT5 Nano with Python API
Using OpenAI GPT5 Nano with OpenAI compatible API
import openai
client = openai.Client(
api_key= '{your_api_key}',
base_url="https://api.model.box/v1",
)
response = client.chat.completions.create(
model="openai/gpt-5-nano",
messages: [
{
role: 'user',
content:
'introduce your self',
},
]
)
print(response)