Domain-Oriented Observability
Domain-oriented observability treats business-meaningful events (cart abandoned, payment declined, signup completed) as first-class instrumentation, alongside or instead of low-level technical telemetry.
Understand This First
- Observability – the general idea of inferring internal state from external outputs.
- Metric – the measurement primitive that domain signals are expressed in.
- Logging – one of the plumbing layers that domain probes hide.
What It Is
Most production systems are instrumented from the bottom up. Requests per second, CPU load, p99 latency, error rate, log lines per minute. These signals answer one question very well: is the software running? They answer a different question badly: is the software doing its job?
Domain-oriented observability reframes instrumentation around the second question. Instead of counting requests, you count carts abandoned at checkout. Instead of tracking error rate, you track the rate at which payments are declined by the gateway. Instead of logging POST /api/v2/orders 200 OK in 312ms, you record order.placed(customer_tier=premium, line_items=4, currency=JPY, total=18400). The events are named in the language the business speaks. A product manager can read the dashboard without a translator. An on-call engineer can tell at a glance whether a deploy broke revenue, not just whether it broke a process.
The implementation pattern is the Domain Probe. A probe is a small, high-level object the domain code calls directly (cart.abandoned(reason), payment.declined(gateway, code), signup.completed(channel)), hiding whatever telemetry plumbing happens underneath. The application writes to the probe; the probe fans the event out to logs, metrics, traces, analytics, or any combination, without leaking any of that plumbing into the business logic. Pete Hodgson wrote up the pattern on martinfowler.com in 2019; that article is still the reference most practitioners cite.
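A minimal sketch of the probe seam in Python. The class and sink names here (CheckoutProbe, TelemetrySink, ListSink) are illustrative inventions, not from the article; the point is the shape: domain code calls business-named methods, and the probe fans out to whatever backends are configured.

```python
from dataclasses import dataclass, field
from typing import Protocol


class TelemetrySink(Protocol):
    """One backend a probe fans out to: logs, metrics, traces, analytics."""
    def emit(self, event: str, fields: dict) -> None: ...


@dataclass
class CheckoutProbe:
    """Domain probe: business code calls these methods and never sees the sinks."""
    sinks: list[TelemetrySink] = field(default_factory=list)

    def _record(self, event: str, **fields) -> None:
        for sink in self.sinks:  # fan out to every configured backend
            sink.emit(event, fields)

    # The public API speaks the domain's language, not the transport's.
    def cart_abandoned(self, reason: str) -> None:
        self._record("cart.abandoned", reason=reason)

    def payment_declined(self, gateway: str, code: str) -> None:
        self._record("payment.declined", gateway=gateway, code=code)

    def order_placed(self, customer_tier: str, line_items: int, total: int) -> None:
        self._record("order.placed", customer_tier=customer_tier,
                     line_items=line_items, total=total)


class ListSink:
    """Trivial in-memory sink for the sketch; a real one would write
    structured logs, increment metrics, or attach span events."""
    def __init__(self) -> None:
        self.events: list[tuple[str, dict]] = []

    def emit(self, event: str, fields: dict) -> None:
        self.events.append((event, fields))
```

Swapping the logging framework or metrics backend means writing a new sink; the checkout code itself never changes.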
Why It Matters
Traditional observability tells you what is broken. Domain-oriented observability tells you whether the thing is working. The difference matters most when something is technically fine and substantively wrong.
Consider a checkout flow that silently drops a 10% discount code because of a serialization bug. The endpoint returns 200 OK. Latency is normal. Error rate is zero. Every technical signal says the system is healthy. Only a domain-level metric, such as average applied-discount per cart or coupon-redemption rate by campaign, catches the problem. This class of failure is everywhere: the system is green, the business is bleeding, and nobody notices for a week.
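The kind of check that catches this failure can be sketched in a few lines. The field name (discount_applied) and the baseline-versus-current comparison are assumptions for illustration, not a prescribed implementation:

```python
def avg_discount(carts: list[dict]) -> float:
    """Average applied discount per cart: a domain-level signal that technical
    telemetry (status codes, latency, error rate) never sees."""
    return sum(cart["discount_applied"] for cart in carts) / len(carts)


def discount_regressed(baseline_carts: list[dict],
                       current_carts: list[dict],
                       floor: float = 0.5) -> bool:
    """True if the current discount rate has fallen below `floor` of the
    baseline, even though every request still returned 200 OK."""
    return avg_discount(current_carts) < avg_discount(baseline_carts) * floor
```

A serialization bug that zeroes out every coupon trips this check immediately, while leaving every infrastructure dashboard green.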
The distinction has become more pressing as agents take on more production code. An agent can refactor a checkout routine, keep every test passing, and quietly change the rounding rule that applies to yen-denominated orders. Only a metric that knows what “average order value in JPY” should look like will catch it. Agent-generated code routinely passes technical tests; preserving business intent is a separate bar, and only domain-level signals measure it.
The industry has also started giving this capability a name. IBM, Grafana Labs, and several vendor roadmaps in 2026 list “business observability” or “domain-driven observability” as a distinct category, separate from infrastructure observability. Mainstream platforms are shipping it as a feature — Datadog Experiments, launched in April 2026, embeds product experimentation directly into the observability stack and connects product changes to business outcomes in one place. The market is catching up to what Hodgson wrote down seven years ago.
There’s also a language-and-clarity argument. When your probes are named in domain terms, your instrumentation code reads like the rest of your domain code. Nobody has to translate http_request_duration_seconds_bucket{route="/api/v2/orders",status="200",le="0.5"} into “did a customer successfully place an order.” The name is order.placed. The signal is the thing.
How to Recognize It
A few signs mark a system that has this discipline. The instrumentation vocabulary matches the business vocabulary: events have names like invoice.generated rather than POST /invoice 201. The probe is an explicit seam in the code, distinct from the telemetry backend it writes to, so you can swap logging frameworks or metrics systems without touching domain logic. The dashboards a product owner cares about (conversion rate, time to first value, failed-payment rate) are derived from the same probes the engineers use to debug, not from a parallel analytics pipeline that drifts out of sync.
You can also recognize the pattern by what it is not. A dashboard that reports CPU and request count is pure infrastructure observability. A dashboard that reports pageviews through a third-party analytics tag is marketing analytics. Neither gives you a single source of truth for “is the software fulfilling its purpose,” owned by the same team that writes the code.
How It Plays Out
A team running an insurance quote system notices that quote-to-bind conversion has fallen three points, but every technical dashboard is green. They built their instrumentation the old way: request counts, error rates, database latencies. There is no single signal that says “fewer people are buying.” They spend three days tailing logs and pulling analytics reports before they find the cause: a new validation rule is rejecting policies with ZIP codes in Puerto Rico as malformed. The next quarter, they introduce domain probes (quote.requested, quote.priced, quote.rejected(reason), policy.bound) and wire them into dashboards keyed to the same funnel a product manager uses. The next time conversion drops, they see within minutes that quote.rejected(reason="invalid_zip") has spiked for a specific state. The loop between “something is wrong” and “here is what” collapses from days to one dashboard click.
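The funnel arithmetic the team ends up with is simple; the event counts below are invented to mirror the scenario, and the event names follow the probes described above:

```python
from collections import Counter

# Invented event stream standing in for one day of quote-funnel probe output.
events = (["quote.requested"] * 1000
          + ["quote.priced"] * 800
          + ["quote.rejected:invalid_zip"] * 150
          + ["policy.bound"] * 300)

counts = Counter(events)

# The same numbers serve the product manager's funnel and the on-call triage.
quote_to_bind = counts["policy.bound"] / counts["quote.requested"]  # 0.30
rejected_invalid_zip = counts["quote.rejected:invalid_zip"]         # 150
```

Because the dashboard and the debugging view derive from the same events, a conversion drop and its cause (the spike in invalid_zip rejections) appear side by side.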
In an agentic coding workflow, an agent is given ownership of a checkout service and a continuous task: keep the system green. If its only signals are technical, the agent optimizes what it can see (latency, error rate, test pass rate) and misses that its own refactor silently broke coupon handling. Now give the agent access to domain probes: cart.abandoned, coupon.applied, order.value_usd. After each change, it checks that the post-deploy distributions match pre-deploy. When coupon-application rate halves, the agent rolls back without waiting for a human to notice revenue has dropped. Domain observability becomes the agent’s test oracle for changes that no unit test can cover.
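One way such a guardrail might look, as a sketch: the event names, the rate-based comparison, and the 50%-drop threshold are all assumptions, not a prescribed policy.

```python
from collections import Counter


def event_rates(events: list[str]) -> dict[str, float]:
    """Per-event share of the total event stream for one deploy window."""
    counts = Counter(events)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def should_roll_back(pre: list[str], post: list[str],
                     guarded: str = "coupon.applied",
                     max_drop: float = 0.5) -> bool:
    """Roll back when a guarded domain event's post-deploy rate falls below
    max_drop of its pre-deploy rate: the agent's test oracle for changes
    that no unit test covers."""
    pre_rate = event_rates(pre).get(guarded, 0.0)
    post_rate = event_rates(post).get(guarded, 0.0)
    if pre_rate == 0.0:
        return False  # no baseline to compare against
    return post_rate < pre_rate * max_drop
```

A real deployment would compare full distributions rather than single rates, but the structure is the same: pre-deploy baseline in, post-deploy sample in, rollback decision out.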
Write probes first, sinks second. Design the domain-level API (cart.abandoned(reason), payment.declined(gateway, code)) before you decide whether each event becomes a log line, a metric, a trace span, or all three. Calling code shouldn’t care which backend is used today or tomorrow.
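A sketch of probe-first design: the interface is defined, and usable via a no-op implementation, before any telemetry backend exists. All names here (PaymentProbe, NullPaymentProbe, charge) are hypothetical.

```python
from typing import Protocol


class PaymentProbe(Protocol):
    """Design this API first; decide later whether each event becomes
    a log line, a metric, a trace span, or all three."""
    def payment_declined(self, gateway: str, code: str) -> None: ...
    def payment_settled(self, gateway: str, amount: int) -> None: ...


class NullPaymentProbe:
    """No-op probe: lets domain code and its tests run before any
    telemetry backend has been chosen."""
    def payment_declined(self, gateway: str, code: str) -> None:
        pass

    def payment_settled(self, gateway: str, amount: int) -> None:
        pass


def charge(probe: PaymentProbe, gateway: str, amount: int) -> bool:
    """Domain code depends only on the probe's interface, never on a backend."""
    declined = amount <= 0  # stand-in for a real gateway call
    if declined:
        probe.payment_declined(gateway, code="invalid_amount")
        return False
    probe.payment_settled(gateway, amount)
    return True
```

When the team later picks sinks, they implement PaymentProbe once; none of the call sites move.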
Consequences
Domain-oriented observability gives you signals that correspond to outcomes the business actually cares about. Debugging gets faster because the dashboard already speaks the language of the problem. Product, engineering, and on-call share one source of truth instead of reconciling three. Agents operating inside the system get a better feedback loop, because their probes now watch the thing that matters, not just the thing that’s easy to instrument.
The costs are real. Domain probes add a layer of abstraction that new engineers have to learn, and a poorly designed probe can duplicate information that already exists in logs or metrics. Teams often run two vocabularies for a while (the old infra signals and the new domain probes), and it takes discipline to pick one per question and stick with it. There is also a governance burden: because domain events carry business-meaningful data, they are more likely to contain personally identifiable information, so the same care that applies to databases now applies to the observability pipeline. And probes require design: a probe named thing.happened with no structured payload is worse than a well-written log line, because it encodes the illusion of understanding without the substance.
The biggest trap is probe drift. When the business changes (new tiers, new flows, new currencies), the probes have to move with it. A probe called checkout.completed that stopped firing three months ago because the checkout code was reorganized is not an observability gap the infrastructure team will catch. Treat probes as part of the domain model they serve, subject to the same reviews as the code around them.
Related Patterns
- Refines: Observability – domain-oriented observability is a specific discipline inside the broader practice of making systems observable.
- Uses: Metric – domain events are expressed as metrics, traces, or structured logs.
- Uses: Logging – logs are one of the sinks a domain probe writes to.
- Enables: Service Level Objective – SLOs defined on domain events are more meaningful than SLOs on raw technical signals.
- Enables: Feedback Loop – business-keyed signals are the richest input a closed loop can act on.
- Complements: Performance Envelope – the envelope describes how the system behaves technically; domain observability describes whether it’s delivering value.
- Depends on: Domain Model – probe names are borrowed from the domain’s language; without a shared model, probes drift.
- Related: Ubiquitous Language – the naming discipline that keeps probes, code, and conversation aligned.
- Related: Silent Failure – domain observability is the primary defense against silent failures at the business layer, where technical signals stay green.
- Related: Eval – both measure outcomes an agent produces; evals score individual runs, domain probes watch populations in production.
Sources
Pete Hodgson defined the pattern in Domain-Oriented Observability (martinfowler.com, 2019), introducing the Domain Probe as the core implementation seam.
The practice of treating business-meaningful events as primary telemetry has roots in Gregor Hohpe and Bobby Woolf’s Enterprise Integration Patterns (2003), which argued for message events named in the language of the business rather than the transport.
Charity Majors, Liz Fong-Jones, and George Miranda’s Observability Engineering (O’Reilly, 2022) popularized wide events with high cardinality as the unit of observability, a prerequisite for carrying domain-rich payloads without dashboards collapsing under cost.
Eric Evans’s Domain-Driven Design (2003) gave the industry the habit of pinning code to a ubiquitous language; domain-oriented observability extends the same habit to instrumentation.