A booking request is slow. See the same incident through a metric (latency spiked), then logs (booking errors), then a trace (an 8s payment span). Why you need metrics, logs AND traces, and how they hand off.
One slow appointment-booking request, seen through each of the three pillars. Click between them and watch the picture sharpen from "something's wrong" to "it's the payment service."
Booking latency (p99, per minute)
p99 latency: 8.1 s
Error rate: 6.2%
Something is wrong: latency and errors spiked. But a metric can't tell you which service.
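To make the metric view concrete, here is a minimal sketch of how a p99 latency and an error rate could be computed for a one-minute window. The sample window, the nearest-rank percentile method, and the choice to count 5xx responses as errors are all assumptions for illustration. The numbers are not the ones behind the panel above.

```python
def p99(latencies_s):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_s)
    # ceil(0.99 * n) in integer arithmetic; this is the 1-based nearest rank.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def error_rate(status_codes):
    """Fraction of requests that returned a 5xx status."""
    errors = sum(1 for code in status_codes if code >= 500)
    return errors / len(status_codes)

# Hypothetical one-minute window: 97 fast requests and 3 slow, failed ones.
window = [(0.2, 200)] * 97 + [(8.1, 504)] * 3
latencies = [latency for latency, _ in window]
codes = [code for _, code in window]

print(f"p99 latency: {p99(latencies):.1f} s")
print(f"error rate: {error_rate(codes):.1%}")
```

Both results are aggregates over every request in the window. Nothing in them names a service, which is why the next step is the logs.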
The full story: the metric said "booking is slow," the logs said "payment timed out in the appointment service," and the trace showed the Payment span took 8s. One pillar alone leaves you guessing; together they pinpoint it.
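The handoff from metric to logs to trace can be sketched as a filter followed by a lookup. This assumes the logs are structured and each record carries a trace ID. The field names (`ts`, `level`, `service`, `msg`, `trace_id`), the timestamps, and the records themselves are hypothetical.

```python
from collections import Counter

# Hypothetical structured log records. The field names are assumptions
# for illustration, not a real logging schema.
LOGS = [
    {"ts": 95, "level": "INFO", "service": "appointment", "msg": "booking created", "trace_id": "t-001"},
    {"ts": 130, "level": "ERROR", "service": "appointment", "msg": "payment timed out", "trace_id": "t-002"},
    {"ts": 142, "level": "ERROR", "service": "appointment", "msg": "payment timed out", "trace_id": "t-003"},
    {"ts": 150, "level": "ERROR", "service": "notification", "msg": "email queue full", "trace_id": "t-004"},
    {"ts": 200, "level": "INFO", "service": "appointment", "msg": "booking created", "trace_id": "t-005"},
]

def errors_in_window(records, start, end):
    """Return ERROR records with start <= ts < end."""
    return [r for r in records if r["level"] == "ERROR" and start <= r["ts"] < end]

def dominant_error(records):
    """Return the most frequent (service, msg) pair and the trace IDs that hit it."""
    counts = Counter((r["service"], r["msg"]) for r in records)
    key, _ = counts.most_common(1)[0]
    trace_ids = [r["trace_id"] for r in records if (r["service"], r["msg"]) == key]
    return key, trace_ids

# The metric spike supplies the time window; the logs supply service and error.
spike_errors = errors_in_window(LOGS, start=120, end=180)
(service, message), trace_ids = dominant_error(spike_errors)
print(service, "|", message, "|", trace_ids)
```

The trace IDs returned here are what you would look up in the tracing backend. Without a shared ID on the log records, the jump from logs to a specific trace becomes guesswork.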
What just happened
Observability stands on three pillars. Metrics are numbers over time (latency, error rate): they tell you SOMETHING is wrong. Logs are detailed event records: they tell you WHICH service and WHAT error. Traces follow one request across services: they tell you WHERE the time went.
No single pillar is enough. A metric spike doesn't say which service; a log line doesn't show the whole journey; a trace alone won't tell you it's happening to thousands of users. You triangulate with all three.
Here, the same slow booking shows up as a latency spike (metric), a 'payment timed out' error (log), and an 8-second Payment span (trace). Together they take you from 'something's slow' to 'it's the payment dependency' in seconds.
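As a rough illustration of how a trace shows where the time went, the sketch below finds the span with the most self time, meaning its own duration minus the time spent in its children. Only the 8-second Payment span comes from the scenario. The other span names, the durations, and the `Span` type are hypothetical, and the subtraction assumes child spans run one after another rather than in parallel.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    name: str
    parent: Optional[str]  # name of the parent span, None for the root
    duration_s: float

# Hypothetical spans for one slow booking request.
TRACE = [
    Span("Booking request", None, 8.3),
    Span("Appointment", "Booking request", 8.2),
    Span("Payment", "Appointment", 8.0),
    Span("Database", "Appointment", 0.1),
]

def self_time(span, spans):
    """Duration of a span minus the durations of its direct children.

    Assumes the children run sequentially; overlapping children would
    make this an underestimate.
    """
    children = sum(s.duration_s for s in spans if s.parent == span.name)
    return span.duration_s - children

def slowest_span(spans):
    """Return the span that spent the most time in its own work."""
    return max(spans, key=lambda s: self_time(s, spans))

culprit = slowest_span(TRACE)
print(f"{culprit.name}: {culprit.duration_s:.0f} s")
```

Ranking by self time rather than total duration matters here: the root and Appointment spans are also about 8 seconds long, but almost all of that is time spent waiting on Payment.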