Analytics
Requests per minute
Inference requests per minute
TTFT
Time to first token: latency from sending the request until the first token is returned
TPS
Completion tokens per second
Duration
Total time an inference request takes, from start to completion
Cache Creation Input Tokens
Tokens written to the cache when creating a new entry
Cache Read Input Tokens
Tokens retrieved from the cache for this request
Success Rate
Percentage of successful requests
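
As a rough illustration of how the metrics above relate to one another, the sketch below computes them from a list of per-request records. It is not the dashboard's actual implementation: the RequestRecord type, its field names, and the summarize function are all hypothetical. It also assumes that TPS divides completion tokens by total request duration, and that TTFT, TPS, and Duration are averaged over every request in the window; the source does not specify either choice.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestRecord:
    # Hypothetical record shape; every field name here is an assumption.
    start_s: float                      # time the request was sent (seconds)
    first_token_s: float                # time the first token arrived
    end_s: float                        # time the response completed
    completion_tokens: int              # tokens generated for this request
    cache_creation_input_tokens: int    # tokens written to the cache
    cache_read_input_tokens: int        # tokens read from the cache
    ok: bool                            # whether the request succeeded


def summarize(records: List[RequestRecord], window_s: float) -> Dict[str, float]:
    """Aggregate the metrics over a time window of window_s seconds."""
    if not records:
        raise ValueError("records must not be empty")
    if window_s <= 0:
        raise ValueError("window_s must be positive")

    n = len(records)
    durations = [r.end_s - r.start_s for r in records]
    total_duration = sum(durations)
    total_completion = sum(r.completion_tokens for r in records)

    return {
        # Requests divided by the window length expressed in minutes.
        "requests_per_minute": n / (window_s / 60.0),
        # Mean latency from request start to the first token.
        "ttft_s": sum(r.first_token_s - r.start_s for r in records) / n,
        # Assumption: completion tokens over total request duration.
        "tps": total_completion / total_duration if total_duration > 0 else 0.0,
        # Mean end-to-end request duration.
        "duration_s": total_duration / n,
        "cache_creation_input_tokens": float(
            sum(r.cache_creation_input_tokens for r in records)
        ),
        "cache_read_input_tokens": float(
            sum(r.cache_read_input_tokens for r in records)
        ),
        # Share of requests that succeeded, as a percentage.
        "success_rate_pct": 100.0 * sum(1 for r in records if r.ok) / n,
    }
```

A dashboard might instead measure TPS only over the generation phase (after the first token) or report percentiles rather than means, so treat the formulas above as one plausible reading of the definitions.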