OpenAI GPT-5 Nano

GPT-5 Nano pushes optimisation further for situations where every millisecond and cent counts: voice assistants, real-time fraud filters, IoT gateways, and background content moderation. It retains the same 400k-token context window and tool-calling APIs but runs an aggressively quantised expert mix that triples tokens-per-second versus the flagship. Despite its small size, Nano still scores 85.2% on AIME 2025 and 75.6% on MMMU multimodal reasoning, handily beating GPT-4.1-mini in code generation and logic while using a fraction of the compute. Developers can combine Nano with the new minimal reasoning mode for sub-150 ms round-trips, yet fall back to higher reasoning effort on the same endpoint when deeper thought is occasionally required. At US $0.05 per million input tokens and US $0.40 per million output tokens, Nano makes billion-call workloads and on-device inference pilots affordable, while benefiting from GPT-5’s upgraded safety guardrails and custom grammar-bound tool calls.
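To make the pricing concrete, the following is a minimal sketch of a cost estimate built only from the two per-million-token rates quoted above. The function name and the example traffic figures (calls per month, tokens per call) are illustrative assumptions, not measurements.

```python
# Rates quoted in the text, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 0.05
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(calls: int, input_tokens_per_call: int, output_tokens_per_call: int) -> float:
    """Estimate total spend for a workload (hypothetical helper, not part of any SDK)."""
    input_millions = calls * input_tokens_per_call / 1_000_000
    output_millions = calls * output_tokens_per_call / 1_000_000
    return input_millions * INPUT_USD_PER_MILLION + output_millions * OUTPUT_USD_PER_MILLION


if __name__ == "__main__":
    # Assumed workload: one billion calls, 200 input and 50 output tokens each.
    total = estimate_cost_usd(1_000_000_000, 200, 50)
    print(f"Estimated cost: US ${total:,.2f}")
```

Under those assumed figures the estimate comes to roughly US $30,000, with output tokens accounting for about two thirds of the bill despite being a quarter of the input volume. Real costs depend on actual token counts and on any pricing changes.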

Using OpenAI GPT-5 Nano with the Python API
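As a starting point, here is a minimal sketch of calling the model through the official `openai` Python package and its Responses API. It assumes the model identifier is `gpt-5-nano`, that the `reasoning` parameter accepts the effort levels listed in the code, and that an `OPENAI_API_KEY` environment variable is set. The helper names `build_request` and `ask_nano` are hypothetical and exist only for illustration.

```python
import os

# Assumed effort levels; "minimal" corresponds to the low-latency mode described above.
ALLOWED_EFFORTS = ("minimal", "low", "medium", "high")


def build_request(prompt: str, effort: str = "minimal") -> dict:
    """Build keyword arguments for a Responses API call (hypothetical helper)."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")
    return {
        "model": "gpt-5-nano",  # assumed model identifier
        "input": prompt,
        "reasoning": {"effort": effort},
    }


def ask_nano(prompt: str, effort: str = "minimal") -> str:
    """Send one prompt and return the text of the reply."""
    # Imported lazily so the request builder can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.responses.create(**build_request(prompt, effort))
    return response.output_text


if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    # Fast path first; the same endpoint accepts a higher effort for harder prompts.
    print(ask_nano("Classify this message as spam or not spam: 'You won a prize!'"))
    print(ask_nano("Explain why the message above might be spam.", effort="medium"))
```

Keeping the request construction separate from the network call makes the effort level easy to switch per request, which mirrors the fallback pattern described earlier: default to minimal reasoning and raise the effort only for prompts that need it.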