Claude 4.5 Haiku
Claude Haiku 4.5 is Anthropic’s newest small model designed for speed, efficiency, and high practical accuracy. Anthropic positions it as delivering performance similar to the mid-tier Claude Sonnet 4 from May 2025, while cutting cost to roughly one third and more than doubling throughput, which makes it attractive for large scale deployments and real time work. It is available broadly in the Claude app, on the web, and via API, making the entry point simple for teams that want fast agents and chat. Pricing is aggressively low at about 1 dollar per million input tokens and 5 dollars per million output tokens, so developers can prototype and ship without heavy inference bills. Beyond raw speed, Haiku 4.5 supports long context use cases and modern agent features like extended thinking, computer use, and better context awareness, which help it handle multi step tasks, code assistance, and document synthesis with low latency. Early coverage highlights that it rivals larger 4.5 family models for everyday coding, research, and workflow automation, and that it is accessible to general users, which should broaden adoption. Overall, Haiku 4.5 is a pragmatic choice when you need near frontier quality, very fast responses, and predictable costs for production scale workloads.
Capability
Vision Support
Tools
Function Calling
Context Window
200,000
Max Output Tokens
8,192
Using Claude 4.5 Haiku with Python API
Using Claude 4.5 Haiku with OpenAI compatible API
import openai
client = openai.Client(
api_key= '{your_api_key}',
base_url="https://api.model.box/v1",
)
response = client.chat.completions.create(
model="anthropic/claude-4-5-haiku",
messages: [
{
role: 'user',
content:
'introduce your self',
},
]
)
print(response)