Insights

Insights in TokenVue show live operational signals from your workspace's LLM gateway traffic.

The Insights page combines TokenVue usage logs with public model market data so teams can understand request volume, token usage, latency, failures, guardrail activity, provider performance, and estimated model cost.

What Insights Shows

Insights helps teams understand how their LLM traffic is performing.

You can use it to review:

Total requests
Total tokens processed
Estimated spend
P95 latency
Model traffic share
Provider performance
Guardrail blocks
Service-level hotspots
Idle LLM routes
Public model pricing and context data

Data Sources

TokenVue Insights uses two main data sources:

Source	Description
Workspace usage logs	Live request, token, latency, provider, model, status, and guardrail data from TokenVue traffic.
Public model market data	Public pricing and context information used to estimate spend and compare active models.

If public market data is unavailable, insights derived from workspace usage logs still work, but pricing and context-length comparisons may be limited.

Time Windows

Insights can be filtered by time window.

Available views include:

This helps teams compare recent activity against longer-term usage patterns.

Generated Insights

TokenVue generates operational insight cards from live gateway activity.

These cards can highlight:

Latency hotspots
Failure-heavy services
Models driving the largest estimated spend
Guardrail block patterns
Idle configured routes
Models that may be over-provisioned for current prompt sizes

Provider Performance

The Provider Performance table shows how each provider is behaving in the selected time window.

It includes:

Request count
Success rate
P95 latency
Total tokens
Estimated spend

This helps teams compare providers by reliability, speed, cost, and usage.
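As a rough illustration of how such a table could be derived from usage logs, the sketch below groups records by provider and computes request count, success rate, nearest-rank P95 latency, and total tokens. The record fields (`provider`, `status`, `latency_ms`, `total_tokens`) and the provider names are assumptions made for this example, not TokenVue's actual log schema, and estimated spend is left out for brevity.

```python
import math
from collections import defaultdict

# Hypothetical log records; field names are assumptions, not TokenVue's schema.
logs = [
    {"provider": "provider-a", "status": "success", "latency_ms": 100, "total_tokens": 500},
    {"provider": "provider-a", "status": "success", "latency_ms": 200, "total_tokens": 500},
    {"provider": "provider-a", "status": "success", "latency_ms": 300, "total_tokens": 500},
    {"provider": "provider-a", "status": "error", "latency_ms": 900, "total_tokens": 500},
    {"provider": "provider-b", "status": "success", "latency_ms": 150, "total_tokens": 300},
]


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def provider_performance(records):
    """Aggregate per-provider metrics from a list of log records."""
    groups = defaultdict(list)
    for record in records:
        groups[record["provider"]].append(record)

    table = {}
    for provider, rows in groups.items():
        succeeded = sum(1 for r in rows if r["status"] == "success")
        table[provider] = {
            "requests": len(rows),
            "success_rate": succeeded / len(rows),
            "p95_latency_ms": p95([r["latency_ms"] for r in rows]),
            "total_tokens": sum(r["total_tokens"] for r in rows),
        }
    return table


performance = provider_performance(logs)
```

In this sample, one slow failed request is enough to set the P95 for `provider-a`, which is why the P95 column is more sensitive to tail latency than an average would be.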

Market-Matched Active Models

TokenVue can match active workspace models against public model catalog data.

When a match is available, Insights can show:

Context length
Input cost per 1K tokens
Output cost per 1K tokens
P95 latency
Estimated spend

This helps teams understand whether current model choices are cost-effective for actual usage.
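One plausible way to turn per-1K token prices into an estimated spend figure is sketched below. The catalog structure, the model name, and the prices are made-up illustration values, and the formula is an assumption about how such an estimate could be computed, not a description of TokenVue's internal calculation.

```python
from typing import Optional

# Hypothetical catalog; the model name and prices are illustration values only.
CATALOG = {
    "example-model": {"input_per_1k": 0.0005, "output_per_1k": 0.0015},
}


def estimate_spend(model: str, input_tokens: int, output_tokens: int,
                   catalog: dict = CATALOG) -> Optional[float]:
    """Estimate spend from token counts and per-1K-token prices.

    Returns None when the model has no catalog match, so callers can
    show the usage data without a cost figure.
    """
    price = catalog.get(model)
    if price is None:
        return None
    input_cost = (input_tokens / 1000) * price["input_per_1k"]
    output_cost = (output_tokens / 1000) * price["output_per_1k"]
    return input_cost + output_cost


matched = estimate_spend("example-model", 120_000, 30_000)
unmatched = estimate_spend("unlisted-model", 120_000, 30_000)
```

With these sample prices, 120,000 input tokens and 30,000 output tokens come to roughly 0.105 in the catalog's currency, while the unmatched model yields no estimate.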

Service Hotspots

Service Hotspots show per-key load across virtual keys, so you can identify which services or workloads account for the most traffic, cost, failures, blocked requests, or latency.

It includes:

Virtual key
Model
Request count
Input tokens
Output tokens
Total tokens
Estimated cost
Failures
Blocked requests
P95 latency
Last seen time
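The sketch below shows one way a per-key view like this could be assembled: records are grouped by virtual key and model, then ranked by a chosen column. The field names, status values, and key names are assumptions for illustration rather than TokenVue's schema; estimated cost and P95 latency are omitted to keep the example short.

```python
from collections import defaultdict

# Hypothetical records; timestamps are ISO 8601 strings in one UTC format,
# so comparing them as strings matches chronological order.
records = [
    {"virtual_key": "checkout-svc", "model": "example-model", "input_tokens": 800,
     "output_tokens": 200, "status": "success", "timestamp": "2024-01-01T10:00:00Z"},
    {"virtual_key": "checkout-svc", "model": "example-model", "input_tokens": 400,
     "output_tokens": 0, "status": "error", "timestamp": "2024-01-01T10:05:00Z"},
    {"virtual_key": "search-svc", "model": "example-model", "input_tokens": 100,
     "output_tokens": 50, "status": "blocked", "timestamp": "2024-01-01T09:00:00Z"},
]


def service_hotspots(rows, sort_by="requests"):
    """Aggregate per (virtual key, model) and rank by the chosen column."""
    stats = defaultdict(lambda: {"requests": 0, "input_tokens": 0, "output_tokens": 0,
                                 "failures": 0, "blocked": 0, "last_seen": ""})
    for r in rows:
        s = stats[(r["virtual_key"], r["model"])]
        s["requests"] += 1
        s["input_tokens"] += r["input_tokens"]
        s["output_tokens"] += r["output_tokens"]
        s["failures"] += r["status"] == "error"
        s["blocked"] += r["status"] == "blocked"
        s["last_seen"] = max(s["last_seen"], r["timestamp"])

    ranked = sorted(stats.items(), key=lambda kv: kv[1][sort_by], reverse=True)
    return [
        {"virtual_key": key[0], "model": key[1],
         "total_tokens": s["input_tokens"] + s["output_tokens"], **s}
        for key, s in ranked
    ]


hotspots = service_hotspots(records)
```

Passing `sort_by="failures"` or `sort_by="blocked"` ranks the same rows by a different kind of pressure.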

Guardrail Breakdown

The Guardrail Breakdown section shows blocked requests grouped by detected guardrail signature.

Examples include:

Hard Budget Cap
PII Scrubbing
Toxicity Filter
Injection Detection
Keywords
Gateway Block

This helps teams understand which protections are being triggered most often.
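Grouping of this kind can be sketched as a simple count over blocked requests. The `status` and `guardrail` fields below are assumed names for illustration; the signature labels are taken from the examples above.

```python
from collections import Counter

# Hypothetical records; field names are assumptions, not TokenVue's schema.
requests_log = [
    {"status": "blocked", "guardrail": "PII Scrubbing"},
    {"status": "success", "guardrail": None},
    {"status": "blocked", "guardrail": "Hard Budget Cap"},
    {"status": "blocked", "guardrail": "PII Scrubbing"},
]


def guardrail_breakdown(rows):
    """Count blocked requests per guardrail signature, most frequent first."""
    counts = Counter(r["guardrail"] for r in rows if r["status"] == "blocked")
    return counts.most_common()


breakdown = guardrail_breakdown(requests_log)
```

In this sample, PII Scrubbing accounts for two of the three blocks, which would make it the first policy to review.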

Idle LLM Routes

Idle LLM Routes show configured provider/model routes that have no traffic in the selected time window.

This is useful for cleanup, cost reviews, and checking whether expected routes are actually being used.
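Conceptually, idle routes are the configured routes minus the routes that appear in traffic during the window. A minimal sketch of that set difference follows; the route tuples, record fields, and provider and model names are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical configured routes as (provider, model) pairs.
configured_routes = [
    ("provider-a", "model-x"),
    ("provider-a", "model-y"),
    ("provider-b", "model-z"),
]

# Hypothetical traffic records; field names are assumptions.
traffic = [
    {"provider": "provider-a", "model": "model-x",
     "timestamp": datetime(2024, 1, 2, tzinfo=timezone.utc)},
    {"provider": "provider-b", "model": "model-z",
     "timestamp": datetime(2023, 12, 1, tzinfo=timezone.utc)},
]


def idle_routes(configured, rows, window_start, window_end):
    """Return configured routes with no traffic in [window_start, window_end)."""
    active = {
        (r["provider"], r["model"])
        for r in rows
        if window_start <= r["timestamp"] < window_end
    }
    return sorted(set(configured) - active)


idle = idle_routes(
    configured_routes,
    traffic,
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 8, tzinfo=timezone.utc),
)
```

Here `provider-b` / `model-z` counts as idle because its only request falls before the window starts, which shows why the result depends on the selected time window.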

Best Practices

Review Insights after connecting production traffic.
Use time windows to compare short-term issues with longer-term trends.
Watch P95 latency rather than average latency alone, since averages can hide slow outliers.
Check Service Hotspots to find noisy or expensive virtual keys.
Use Guardrail Breakdown to tune workspace policy.
Review idle routes regularly to remove unused configurations.
Compare estimated spend against provider billing for validation.

In Short

Insights is the analytics layer of TokenVue.

It helps teams understand how LLM traffic behaves across virtual keys, models, providers, costs, latency, failures, and guardrail activity.
