Lab 61
Secure & Observable AI

Metrics, Logs & Traces for an AI Agent

This lab follows one agent run through all three observability lenses. Metrics (token counts, LLM latency, failed tool calls, and a faithfulness eval) tell you that something is wrong; logs tagged with a correlation id let you filter one request's journey out of the noise; and the trace waterfall shows which span consumed the time. The headline: a hallucinated answer still returns HTTP 200 with healthy latency, and only the faithfulness eval catches it. For agents, success ≠ correctness.
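To make the logs and trace lenses concrete, here is a minimal sketch using only the Python standard library. It is not the lab's implementation: the names `correlation_id`, `span`, `run_agent`, and the span names are hypothetical, and the `time.sleep` calls stand in for a real tool call and a real LLM call. It assumes one correlation id per request, stamped on every log line and every span.

```python
import contextvars
import logging
import time
import uuid
from contextlib import contextmanager

# Hypothetical names throughout; this only illustrates the idea.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Attach the current correlation id to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


logger = logging.getLogger("agent")
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(correlation_id)s %(levelname)s %(message)s")
)
handler.addFilter(CorrelationFilter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Collected spans as (correlation_id, span_name, duration_seconds).
spans = []


@contextmanager
def span(name):
    """Time a block of work and record it under the current correlation id."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append((correlation_id.get(), name, time.perf_counter() - start))


def run_agent(question):
    """Run one fake agent request and return its correlation id."""
    token = correlation_id.set(uuid.uuid4().hex[:8])
    try:
        with span("agent.run"):
            logger.info("received question: %s", question)
            with span("tool.retrieve"):
                time.sleep(0.01)  # stand-in for a retrieval tool call
            with span("llm.generate"):
                time.sleep(0.05)  # stand-in for the LLM call
            logger.info("answer ready")
        return correlation_id.get()
    finally:
        correlation_id.reset(token)


def slowest_child_span(cid):
    """Return the name of the slowest span for one request, excluding the root."""
    children = [s for s in spans if s[0] == cid and s[1] != "agent.run"]
    return max(children, key=lambda s: s[2])[0:2][1]


cid = run_agent("Which clauses carry termination risk?")
print(cid, "slowest span:", slowest_child_span(cid))
```

Filtering log output on the printed id isolates one request's journey, and `slowest_child_span` is a crude version of reading a waterfall: it answers "which span ate the time" for that request.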

Take one run of the contract-risk agent through all three lenses: inject a problem, run the agent, then read the metrics, then the logs (with the correlation id turned on), then the trace. Note that a hallucination still returns HTTP 200; only the faithfulness metric catches it.
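The point that a hallucination stays HTTP 200 can be sketched as follows. This is an illustration under stated assumptions, not the lab's eval: the `faithfulness` function is a toy word-overlap score (real faithfulness evals usually rely on an LLM judge or an entailment model), and `RunMetrics`, `alerts`, the thresholds, and the sample contract text are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Hypothetical per-run metrics record."""
    http_status: int
    llm_latency_ms: float
    total_tokens: int
    failed_tools: int
    faithfulness: float


def _words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def faithfulness(answer, context, min_overlap=0.6):
    """Toy score: share of answer sentences whose words mostly occur in context."""
    ctx = _words(context)
    sentences = [s for s in re.split(r"[.!?]\s*", answer) if s.strip()]
    if not sentences:
        return 0.0
    supported = 0
    for sentence in sentences:
        words = _words(sentence)
        if words and len(words & ctx) / len(words) >= min_overlap:
            supported += 1
    return supported / len(sentences)


def alerts(m, min_faithfulness=0.8, max_latency_ms=2000):
    """Return the names of the checks this run fails (thresholds are assumed)."""
    failed = []
    if m.http_status >= 500:
        failed.append("http_error")
    if m.llm_latency_ms > max_latency_ms:
        failed.append("slow_llm")
    if m.failed_tools > 0:
        failed.append("tool_failure")
    if m.faithfulness < min_faithfulness:
        failed.append("low_faithfulness")
    return failed


# Made-up sample data for illustration only.
context = (
    "The contract renews automatically each year. "
    "Either party may terminate with 30 days notice."
)
hallucinated = "The supplier owes a penalty of 2 million dollars for late delivery."

run = RunMetrics(
    http_status=200,
    llm_latency_ms=800.0,
    total_tokens=412,
    failed_tools=0,
    faithfulness=faithfulness(hallucinated, context),
)
print(alerts(run))
```

Every operational check passes for this run; the faithfulness score is the only signal that flags the answer as unsupported by the retrieved contract text.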
What just happened