Analytics

Requests per minute

Inference requests per minute

TTFT

Time-to-first-token latency

TPS

Completion tokens per second

Duration

Total duration of the inference request
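
As a rough illustration of how the four timing metrics above relate, here is a minimal sketch that derives them from per-request timestamps. The `RequestRecord` type and its field names are hypothetical, not part of any documented schema. The sketch also assumes TPS is completion tokens divided by the full request duration; a dashboard could just as well divide by the time after the first token.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestRecord:
    # Hypothetical record layout; all timestamps are in seconds.
    started_at: float
    first_token_at: float
    finished_at: float
    completion_tokens: int


def ttft(r: RequestRecord) -> float:
    """Time to first token: request start until the first token arrives."""
    return r.first_token_at - r.started_at


def duration(r: RequestRecord) -> float:
    """Inference duration: request start until the response is complete."""
    return r.finished_at - r.started_at


def tps(r: RequestRecord) -> float:
    """Completion tokens per second (assumed: tokens / full duration)."""
    d = duration(r)
    return r.completion_tokens / d if d > 0 else 0.0


def requests_per_minute(records: List[RequestRecord], window_seconds: float) -> float:
    """Request count in the window, normalized to a one-minute rate."""
    return len(records) * 60.0 / window_seconds


records = [
    RequestRecord(started_at=0.0, first_token_at=0.5, finished_at=2.5, completion_tokens=100),
    RequestRecord(started_at=10.0, first_token_at=10.25, finished_at=14.0, completion_tokens=200),
]
```

With these two sample records, the first request has a TTFT of 0.5 s and a duration of 2.5 s, so its TPS under the stated assumption is 100 / 2.5 = 40.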

Cache Creation Input Tokens

Tokens written to the cache when creating a new entry

Cache Read Input Tokens

Tokens retrieved from the cache for this request
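
One way to read the two cache counters together is as a share of input tokens served from the cache. The sketch below is an illustration only: the function name and the `uncached_input_tokens` parameter are hypothetical, and it assumes the three counts are disjoint and together make up the whole prompt.

```python
def cache_read_share(
    cache_creation_input_tokens: int,
    cache_read_input_tokens: int,
    uncached_input_tokens: int,
) -> float:
    """Fraction of all input tokens that were read from the cache.

    Assumes the three counts are disjoint and sum to the full prompt size.
    Returns 0.0 when there are no input tokens at all.
    """
    total = (
        cache_creation_input_tokens
        + cache_read_input_tokens
        + uncached_input_tokens
    )
    if total == 0:
        return 0.0
    return cache_read_input_tokens / total


# A request that wrote 200 tokens to the cache, read 600 from it,
# and sent 200 tokens that were neither written nor read.
share = cache_read_share(200, 600, 200)
```

A first request against a fresh cache would typically show a large creation count and a read count of zero; later requests that reuse the same prefix would shift toward reads.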

Success Rate

Percentage of successful requests
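
For illustration, the success rate can be computed from a list of response status codes as sketched below. Treating any HTTP 2xx status as a success, and reporting 0.0 for an empty window, are both assumptions; the source does not say how a successful request is defined.

```python
from typing import Iterable


def success_rate(status_codes: Iterable[int]) -> float:
    """Percentage of requests whose status code is in the 2xx range.

    Assumption: success means HTTP 2xx. Returns 0.0 for an empty input.
    """
    codes = list(status_codes)
    if not codes:
        return 0.0
    successes = sum(1 for code in codes if 200 <= code < 300)
    return 100.0 * successes / len(codes)


# Two successes, one server error, one rate-limit response.
rate = success_rate([200, 200, 500, 429])
```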

Requests by Model

Requests by Model Provider