Thinnest Viable Platform

Pattern

Build the smallest platform that lets stream-aligned teams deliver autonomously, then grow it only in response to real demand.

Also known as: TVP, Minimum Viable Platform

Understand This First

Platform as a Product – TVP is the sizing principle for product-oriented platforms. Without the product mindset, “thinnest” degenerates into “cheapest.”
Team Cognitive Load – the platform’s purpose is to absorb complexity that would otherwise overflow every team’s cognitive budget. TVP asks: what’s the minimum surface that achieves that?
Stream-Aligned Team – the teams the platform serves. Their autonomy is the measure of whether the platform is thick enough.

You’ve decided to treat internal infrastructure as a product. A platform team exists. It has a mandate to build shared capabilities. The question is no longer whether to build a platform but how much platform to build.

The temptation is to build too much. Platform teams with strong engineers and organizational support tend to imagine the ideal state: a complete internal PaaS with self-service everything, golden paths for every workflow, and a polished developer portal. That vision isn’t wrong. The problem is trying to get there before the stream-aligned teams need it. A platform built ahead of demand accrues maintenance cost without delivering value. Capabilities nobody asked for sit unused while the capability teams actually need doesn’t exist yet.

Problem

How do you decide what to include in an internal platform and what to leave out? Build too little and teams solve the same problems independently, wasting effort and creating inconsistency. Build too much and the platform team spends its capacity maintaining features that don’t get used while ignoring the friction that matters most.

Forces

Stream-aligned teams need to ship without waiting for the platform team. Every capability the platform lacks is a gap they’ll fill with their own improvised solution.
Every capability the platform offers is a promise: it will keep working, handle edge cases, and evolve with changing needs. Promises carry maintenance cost.
Platform teams face the same cognitive load constraints as any other team. A sprawling platform overwhelms the team that owns it.
Demand is hard to predict in advance. What teams say they need and what they actually adopt are often different.
Unused capabilities aren’t free. They consume engineering time, create false expectations, and clutter the platform’s surface area.

Solution

Start with the single capability that causes the most friction across teams, make it self-service, and stop. Don’t plan the second capability until the first is adopted and stable. Grow the platform one proven capability at a time, driven by observed demand rather than anticipated need.

Skelton and Pais coined the term thinnest viable platform in Team Topologies (2019) to counter the instinct that more platform is always better. “Thinnest” means you include only what teams can’t reasonably do themselves. “Viable” means it actually works: reliable, documented, and self-service. A half-built capability that requires filing a ticket to use isn’t viable, no matter how thin.

The sizing test is straightforward: can stream-aligned teams deliver their work end-to-end without waiting for another team? If yes, the platform is thick enough. If they’re blocked waiting for infrastructure changes, provisioning, or access grants, the platform has a gap. If they’re ignoring a platform capability and building their own version, either the capability doesn’t fit their needs or they don’t know it exists. Both are product problems worth investigating.

In practice, TVP means the platform team maintains a short list of supported capabilities rather than a long one. Each capability on the list gets full product treatment: self-service interface, documentation, monitoring, and an owner responsible for keeping it healthy. Anything not on the list is explicitly out of scope. Teams that need unsupported capabilities build their own and accept the maintenance burden. Some of those ad-hoc solutions will eventually become platform candidates if enough teams converge on the same need.

The principle extends beyond infrastructure tooling. In organizations adopting AI agents, the platform might include a standard instruction file template, a shared memory configuration, or pre-built hooks for security scanning. The TVP question applies identically: which of these creates enough friction across enough teams to justify centralized support? Start there.

How It Plays Out

A company of six stream-aligned teams decides to build an internal platform. The platform team surveys every team about pain points. The list is long: deployment configuration, database provisioning, secret management, log aggregation, CI pipeline templates, and SSL certificate rotation. A traditional approach would roadmap all six, estimate timelines, and start building.

The TVP approach is different. The platform team looks at where teams are actually losing time. Deployment configuration is the answer. Every team has its own Kubernetes manifests, copied from another team’s repo and half-understood. Two recent outages trace back to misconfigured health checks in copied configs. The platform team builds a single thing: a service.yaml schema that generates correct Kubernetes configs from a handful of inputs (service name, port, resource limits, health check endpoint). One command, five minutes, correct every time.

They ship it, announce it, and watch. Within three weeks, four of six teams have switched. One team finds an edge case (a service that needs a custom sidecar) and the platform team adds support for it. The sixth team runs a different orchestration stack entirely. The platform team notes the gap and doesn’t force migration.

Only after the deployment tool is stable and adopted does the platform team tackle the second capability: database provisioning. They chose it because two teams asked for it independently and a third team’s developer mentioned it in a retro. Secret management, which seemed equally urgent on the original survey, turns out to be less painful in practice. Teams found an open-source solution that works well enough. The platform team doesn’t duplicate it.

A startup building AI-powered features across several product teams faces a different version of the same problem. Each team configures its AI agents differently. One team has a meticulous verification loop that catches most issues. Another team runs agents with minimal guardrails and regularly ships code that breaks in staging. The platform team could build a comprehensive agent governance framework. Instead, they apply TVP: what’s the thinnest agent infrastructure that makes the biggest difference?

They identify the gap as pre-commit verification. The team with the verification loop already solved it; the platform team packages that solution as a shared hook that runs linting, type checking, and a quick test suite before any agent-generated commit lands. One capability, easy to adopt, immediately visible in reduced staging failures. The comprehensive governance framework stays on the wish list until enough teams are ready for it.

Consequences

TVP keeps the platform team focused. A short list of capabilities means each one gets the attention it needs: real documentation, real monitoring, real maintenance. The platform team isn’t spread across a dozen half-finished tools. It owns a few things well.

Stream-aligned teams get reliability instead of breadth. A platform that does three things excellently is more useful than one that does ten things unreliably. Teams learn to trust the supported capabilities because they work. That trust is hard to build and easy to lose. A single broken capability that stays broken for weeks undermines confidence in the entire platform.

The discipline of waiting for demand means some teams will solve problems independently that the platform could have solved centrally. That’s an acceptable cost. The alternative, building capabilities before demand exists, produces worse outcomes: unused features that clutter the platform, maintenance burden that slows down work on things teams actually need, and a false sense of coverage that masks real gaps.

TVP requires saying no. Teams will request capabilities the platform team isn’t ready to support. Product managers will argue for building ahead of demand to “be ready.” The platform team needs organizational backing to defer capabilities until evidence supports building them. Without that backing, the platform grows beyond the team’s capacity to maintain it, and quality drops across the board.

There’s a startup-stage caveat. An organization with two teams might not need a platform at all. The coordination overhead of maintaining shared tooling exceeds the benefit when the total number of consumers is small. TVP’s lower bound isn’t “thin.” It’s zero. At that scale, a wiki page describing how each team sets things up provides more value than a platform team.

Constrains: Platform as a Product – TVP is the sizing discipline that prevents a product-oriented platform from becoming an overbuilt internal PaaS.
Serves: Stream-Aligned Team – the definition of “viable” is whether stream-aligned teams can deliver autonomously.
Reduces: Team Cognitive Load – each platform capability absorbs complexity from every team that adopts it. TVP keeps the absorption focused on the complexity that matters most.
Uses: Feedback Loop – adoption signals (who uses what, who builds their own alternative) are the feedback that determines what the platform builds next.
Complements: Enabling Team – an enabling team teaches a practice; the platform encodes it in tooling. TVP determines which practices merit encoding versus which are better taught and left to teams.
Informed by: YAGNI – the same principle at organizational scale: don’t build platform capabilities you don’t have evidence teams need.
Contrasts with: Big Ball of Mud – a platform that grows without discipline becomes its own ball of mud, with tangled dependencies and unclear ownership of each capability.

Sources

Matthew Skelton and Manuel Pais coined the term “thinnest viable platform” in Team Topologies: Organizing Business and Technology Teams for Fast Flow (2019). Their framing: the platform should be the smallest set of self-service APIs, tools, and services that lets stream-aligned teams deliver autonomously. They deliberately chose “thinnest” over “minimum” to emphasize that the platform isn’t a compromise. It’s a deliberate constraint.

Evan Bottcher’s “What I Talk About When I Talk About Platforms” (2018, on Martin Fowler’s website) established the foundational definition: an internal platform is “a foundation of self-service APIs, tools, services, knowledge, and support which are arranged as a compelling internal product.” TVP refines Bottcher’s definition by adding the sizing constraint.

The CNCF Platforms Working Group formalized the practice in their Platforms Definition whitepaper (2023), describing platforms as curated collections that reduce cognitive load on stream-aligned teams. The whitepaper’s emphasis on curation over comprehensiveness aligns with TVP’s core principle: what you leave out matters as much as what you include.

Encyclopedia of Agentic Coding Patterns