What PostgreSQL Metrics Tell Us and What They Don’t
A metric is, at its simplest, a number.
More specifically, it’s a numeric measurement collected over time that represents an aspect of a system’s behavior. That number might increase, decrease, reset, or be sampled periodically, but it is always a reduction of system behavior into a measurable value.
In modern monitoring systems—particularly those built around Prometheus—metrics fall into a few broad categories:
- Counters: numbers that only increase (e.g., requests served)
- Gauges: values that move up and down (e.g., connections in use)
- Histograms: distributions of observed values (e.g., latency)
These types describe how a number behaves over time, not what it means. For a detailed explanation of metric types and their semantics, the Prometheus documentation is the authoritative reference.
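To ground these types in PostgreSQL terms, the sketch below reads one counter-like and one gauge-like value from the standard pg_stat_database view; the column aliases are only illustrative.

```sql
-- Counter-like: xact_commit only ever increases (until statistics are reset).
-- Gauge-like: numbackends moves up and down with connection activity.
-- PostgreSQL exposes no histogram here; latency distributions usually come
-- from extensions or from the exporter layer.
SELECT datname,
       xact_commit AS commits_since_stats_reset,
       numbackends AS connections_in_use
FROM pg_stat_database
WHERE datname = current_database();
```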
When talking about observability, it’s often useful to borrow a concept from systems thinking: the distinction between known knowns, known unknowns, and unknown unknowns. In this framing, known knowns are things we already expect to measure — we know what they represent, what “healthy” looks like, and we care when they change. Known unknowns are questions we might want answers to but don’t yet have predefined signals for, and unknown unknowns are behaviors that surprise us until they happen.
For a deeper discussion of how this framework applies to observability, Charity Majors explores these ideas in her writing on structured events and signal design, notably in "Live Your Best Life with Structured Events."
This post is part of an ongoing series built around a small PostgreSQL observability lab. The lab is a runnable, local environment that brings together PostgreSQL, common observability tooling, and intentionally simple workloads to make assumptions about monitoring and reliability visible.
The repository is available at github.com/mdbdba/pgs_obs_demo. Running the lab isn’t required to follow the series, but it provides a concrete reference point as the discussion moves from ideas to observable behavior.
Metrics are strongest at “known knowns”
One of the most useful ways to think about metrics is that they excel at known knowns.
Metrics are extremely effective when:
- You know what behavior matters
- You know how it should look when healthy
- You want to detect when it deviates
Throughput dropping, error rates climbing, saturation increasing—these are all things we expect might happen, and metrics are well suited to telling us when they do.
Where metrics struggle is when we ask them to explain behavior we didn’t anticipate. When the problem isn’t that something changed, but why it changed, metrics often give us signals without narratives. They confirm that the system is different, but not how it arrived there.
This doesn’t make metrics weak. It makes them specific. Understanding that specificity is key to using PostgreSQL metrics well.
Where PostgreSQL metrics come from (and what kind of truth they represent)
In this lab, PostgreSQL metrics are exposed via postgres_exporter and collected by Prometheus. This architecture is common, well understood, and intentionally simple.
It’s also important to understand what kind of truth this setup provides.
The PostgreSQL server maintains a large amount of internal state: counters, gauges, and statistics reflecting query execution, concurrency, memory usage, I/O behavior, and background activity. postgres_exporter reads a subset of that state and translates it into numeric time series. Prometheus then scrapes those numbers at regular intervals and stores them for querying and alerting.
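As a rough sketch of that pattern (not postgres_exporter's exact query), an exporter periodically runs something like the following and republishes each column as a labeled time series; the metric names in the comment reflect the exporter's usual naming convention.

```sql
-- Each column becomes a time series, labeled by database name, e.g.
-- pg_stat_database_xact_commit or pg_stat_database_numbackends.
SELECT datname,
       xact_commit,    -- committed transactions (cumulative counter)
       xact_rollback,  -- rolled-back transactions (cumulative counter)
       blks_read,      -- blocks read from disk (cumulative counter)
       blks_hit,       -- blocks served from shared buffers (cumulative counter)
       numbackends     -- connections currently open (gauge)
FROM pg_stat_database
WHERE datname NOT IN ('template0', 'template1');
```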
This process is inherently reductive: rich internal behavior is collapsed into numbers sampled over time. Prometheus doesn’t observe events directly; it observes snapshots of accumulated state. That design choice is what makes metrics efficient, cheap to store, and easy to reason about at scale — and it’s also what shapes their limitations.
The goal of this section isn’t to critique this model, but to be explicit about it. Understanding how metrics are produced makes it much easier to understand what they can — and can’t — tell us later.
Orientation vs explanation, applied to PostgreSQL metrics
In the previous post, I introduced a distinction between signals that help us orient and signals that help us explain. PostgreSQL metrics make this distinction especially clear.
Some metrics move early when system behavior changes. They tell us that something is different, even if they don’t yet explain why. Others provide detailed insight into what’s happening inside the database, but only once attention is already focused and context is available.
The important distinction here isn’t between “good” and “bad” metrics; it’s between early signals and explanatory signals, and knowing which of those roles a given metric tends to play.
In the lab dashboards, this separation is intentional. High-level activity and pressure signals are surfaced first, while more detailed breakdowns are available once you know where to look. This mirrors how metrics behave in real systems, whether we design for it or not.
Before diving deeper, it helps to make the metric data flow explicit. In this lab, PostgreSQL metrics move through a simple and common pipeline: internal database statistics are exposed by an exporter, scraped and stored as time series, and then queried for visualization and alerting.
PostgreSQL
↓ is queried by
postgres_exporter
↓ is scraped by
Prometheus
↓ is queried by
Grafana
Metrics that tend to orient well
Some PostgreSQL metrics are particularly good at helping us orient when something changes.
These tend to be metrics that reflect overall activity or pressure: how much work the system is doing, how many things are happening concurrently, or whether resources are becoming constrained. When these metrics drift, they often do so before users complain or alerts fire elsewhere.
What makes these metrics useful is not precision, but sensitivity. They trade detail for early signal. They raise simple questions:
- Is load increasing?
- Is concurrency higher than usual?
- Is the system spending more time waiting than before?
In the lab, these metrics appear early in the Prometheus → Grafana path and are intentionally unspecific. Their job isn’t to explain behavior, but to tell us that the system has entered a different operating state.
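The same orientation questions can be asked directly of pg_stat_activity. The query below is a sketch, not a panel definition from the lab, but it mirrors the kind of coarse concurrency-and-pressure view those dashboards surface first.

```sql
-- Coarse orientation: how many client backends exist, what they are doing,
-- and how many are currently waiting (wait_event_type is NULL when not waiting).
SELECT state,
       wait_event_type,
       count(*) AS backends
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state, wait_event_type
ORDER BY backends DESC;
```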
Metrics that explain well — but usually too late
Other PostgreSQL metrics are invaluable for explanation, but rarely help with early detection.
Metrics related to lock contention, wait events, and query-level behavior can provide high-fidelity insight into what the database is doing. When you already know which subsystem is under stress, these metrics can be decisive.
The tradeoff is timing and context. These metrics often become meaningful only after attention is already focused and hypotheses are being tested. Without orientation, they can be overwhelming or misleading.
In the lab, many of these metrics are exposed through postgres_exporter and are available for inspection — but they only become useful once you have a reason to look at them. This isn’t a flaw; it’s a reflection of their role.
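As an example of this explanatory layer, the sketch below leans on pg_stat_activity and pg_blocking_pids() (available since PostgreSQL 9.6) to show which sessions are blocked and by whom. It is decisive when lock contention is already suspected, and mostly noise when it isn't.

```sql
-- Explanatory detail: blocked sessions, what they are waiting on, and the
-- PIDs blocking them. Only meaningful once contention is the working theory.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       now() - query_start   AS waiting_for,
       left(query, 60)       AS query_snippet
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```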
Metrics that routinely mislead (without being wrong)
Some PostgreSQL metrics are accurate and well-defined, yet still lead investigations astray.
Common patterns include averages that hide variance, global metrics that look healthy while localized pain exists, or counters interpreted without accounting for resets and sampling behavior. In each case, the metric itself is not lying — it’s being asked to answer a question it wasn’t designed to answer.
A useful rule of thumb is this: a metric can be correct and still be misleading.
For example, in the lab it’s possible to observe periods where average query latency remains stable while concurrency and tail latency increase. The metric isn’t wrong—it accurately reflects the average—but it hides the fact that a growing subset of queries is slowing down under load.
Rather than collapsing these signals into a single “health” indicator, the lab makes this visible by showing multiple perspectives side by side.
A walkthrough of this scenario is documented in the lab repository, showing how these signals can appear contradictory at first glance.
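One way to see this gap without a dashboard at all is pg_stat_statements, assuming the extension is installed; the column names below are the PostgreSQL 13+ ones (older versions use mean_time and friends). A stable mean sitting next to a large max or stddev is the same story the side-by-side panels tell.

```sql
-- Compare the average with the spread: statements whose max or stddev sits
-- far above the mean are exactly what an "average latency" panel hides.
SELECT left(query, 60)                     AS query_snippet,
       calls,
       round(mean_exec_time::numeric, 2)   AS mean_ms,
       round(stddev_exec_time::numeric, 2) AS stddev_ms,
       round(max_exec_time::numeric, 2)    AS max_ms
FROM pg_stat_statements
ORDER BY max_exec_time DESC
LIMIT 10;
```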
Why metrics rarely form a narrative on their own
Metrics excel at compressing behavior over time. That compression is what makes them scalable, but it also makes narrative reconstruction difficult.
Databases, in particular, exhibit phase changes: a system can move smoothly from healthy to degraded without any single metric clearly announcing the transition. Cause and effect are often indirect, and changes propagate through layers of caching, scheduling, and contention.
This is where the idea of known knowns becomes useful. Metrics are excellent at confirming expected deviations — the things we already expect to watch. When the system surprises us, metrics often tell us that something changed without telling us what happened.
This doesn’t make metrics bad. It makes them specific.
What this means for observability design (not implementation)
If we accept these characteristics, a few design implications follow.
Metrics are most effective when treated as early warning signals and hypothesis generators. They help us notice change, narrow the search space, and decide where to look next. They are less effective when asked to explain unexpected behavior in isolation.
In this lab, metrics are intentionally treated as one slice of a broader observability story. The goal isn’t completeness, but usefulness — and an honest accounting of tradeoffs between signal fidelity, cost, and operational complexity.
Where this goes next
In the next post, I’ll focus on how metrics behave over time and under change — what drifts early, what spikes late, and how signals interact when the system crosses thresholds rather than failing outright.
From there, we’ll look at logs and structured events as complementary signals, and at alerting and SLOs as decision systems built on top of imperfect information.
As with the lab itself, the goal isn’t to arrive at final answers, but to make assumptions visible.
Tags: postgresql observability sre databases