Heightened visibility & deeper control with a monitoring control plane

This post was originally published in TFIR.

Until a few years ago, if you did any kind of searching for control planes, you would have found results related to traditional networking concepts. With the advent of cloud computing — including hybrid cloud, multi-cloud, and cloud-native — we’re seeing a lot of tools starting to adopt a “control plane for ‘X’” terminology. We’ve heard this term applied to — among other things — Kubernetes. More on that later.

Photo by Nacho Rochon on Unsplash

At a high level, a telecommunications architecture contains three basic components: the control plane, the data plane, and the management plane. According to TechTarget, “The control plane is the part of a network that carries signaling traffic and is responsible for routing. Control packets originate from or are destined for a router. Functions of the control plane include system configuration and management” (emphasis mine). They go on to describe the management plane as a subset of the control plane, and to explain that all three planes, working in tandem, enable programmatic access and, therefore, more flexibility in your organization. From that, we can derive the five important attributes of a control plane:

  • Routing
  • Configuration
  • Management
  • Programmatic access
  • Flexibility

The control plane is a centralized management interface. For a technology to be deemed a control plane, it must manifest all five of these attributes. In this post, I’ll present an argument for the role of a control plane in monitoring, telling the story of how our users’ need for deep visibility into their applications and control over their monitoring led us to create a product that provides a monitoring control plane. But first, let’s take a look at one of the best modern examples of a control plane.

The Kubernetes control plane

We’ve long been relying on microservice-based architecture to ship software faster and more safely. Containerization was the natural next step in that evolution and, in conjunction with Kubernetes for container orchestration, has reshaped how we build and deploy applications. Kubernetes has made it possible for teams to manage their containerized infrastructure, acting as a common, powerful platform for deploying your applications wherever they run. But that kind of power comes with its own challenges (not just when it comes to monitoring): container orchestration is an inherently complex problem, and it requires a control plane like Kubernetes to operate.

According to Kubernetes.io, the “various parts of the Kubernetes control plane, such as the Kubernetes Master and kubelet processes, govern how Kubernetes communicates with your cluster.” As the name implies, it controls how Kubernetes interacts with your application; it’s responsible for managing the worker nodes, making scheduling decisions and making necessary changes to ensure the cluster gets to a desired state. I first heard Kubernetes referred to as a control plane from Kelsey Hightower, and it immediately clicked — Kubernetes is that source of truth for your entire containerized infrastructure.
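To make “ensure the cluster gets to a desired state” concrete, here is a minimal sketch of a reconciliation step in Python. It is an illustration of the general idea only, not Kubernetes code: the `Action` type, the `reconcile` and `apply` functions, and the workload names are all hypothetical, and real scheduling is replaced by simple arithmetic on replica counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str       # "start" or "stop" (hypothetical action names)
    workload: str
    count: int


def reconcile(desired: dict[str, int], observed: dict[str, int]) -> list[Action]:
    """Compare desired and observed replica counts and return the needed actions."""
    actions = []
    for name in sorted(desired.keys() | observed.keys()):
        want = desired.get(name, 0)
        have = observed.get(name, 0)
        if want > have:
            actions.append(Action("start", name, want - have))
        elif want < have:
            actions.append(Action("stop", name, have - want))
    return actions


def apply(observed: dict[str, int], actions: list[Action]) -> dict[str, int]:
    """Apply actions to a copy of the observed state (a stand-in for real scheduling)."""
    state = dict(observed)
    for action in actions:
        delta = action.count if action.kind == "start" else -action.count
        state[action.workload] = state.get(action.workload, 0) + delta
    # Workloads with no remaining replicas drop out of the state entirely.
    return {name: count for name, count in state.items() if count > 0}


desired = {"web": 3, "worker": 2}
observed = {"web": 1, "cron": 1}

plan = reconcile(desired, observed)
converged = apply(observed, plan)
```

Running `reconcile` again on `converged` returns an empty plan, which is the point: you declare the state you want, and the control plane works out the difference.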

A control plane allows you to get your arms around a very complex system. There is a trending debate regarding the complexity of Kubernetes (which I won’t comment on here), but regardless of which side you take in that discussion, we must acknowledge that this complexity is relative to the systems we’re building. Yes, there’s a lot to learn in Kubernetes, but once you learn it, it gives you the tools to codify a very complex system in a portable configuration template that you can then deploy automatically, over and over again.

For the same reason that control planes worked so well as a traditional network concept, and now work as a compelling approach to managing complex microservice architectures, we are also seeing control planes applied to other problem spaces. For example, Armon Dadgar, CTO and co-founder of HashiCorp, describes service meshes (such as Consul) as the solution to facilitate communication between disparate components of a system, with the control plane as the central management interface and a key layer within the service mesh.

As our systems and applications continue to increase in complexity, it’s even more critical that we apply this concept of the control plane to the other toolsets and practices in our systems.

The monitoring control plane

When Sean Porter (Sensu Creator, Co-Founder, and CTO) and I were reimagining Sensu, it became clear that our users needed not just another monitoring tool, but a way to consolidate their existing toolset behind a holistic centralized management interface. They also needed a flexible solution for monitoring ephemeral infrastructure, including not only Kubernetes but also public cloud platforms like AWS (EC2, ECS, EKS), Google Cloud Platform (GCE, GKE), Azure (AVM, AKS), Heroku (Dynos), and more. With those needs in mind, we deliberately architected the next version of our product to emulate the Kubernetes approach. Said another way, we took the model of Kubernetes as a control plane and applied it to monitoring.

Here’s how it works: a monitoring control plane is the orchestration layer for your monitoring solution. It includes a data layer for collecting and transporting monitoring data and a management layer for deciding what to do with incoming monitoring data, in the same way a network routing control plane routes network traffic. Monitoring control plane functions include managing the data collection process, routing incoming monitoring data to the corresponding processor via compatible interfaces, and discarding monitoring data that is of little or no operational value. A monitoring control plane also provides programmatic access to configure and manage the system via documented APIs, which greatly increases the flexibility of the solution, allowing you to customize monitoring to your team’s needs.
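The routing and discarding behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Sensu’s implementation or API: the `Route` and `ControlPlane` classes, the event fields (`check`, `status`, `value`), and the “pager” and “tsdb” processors are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

Event = dict  # hypothetical event shape, e.g. {"check": "cpu", "status": 2}


@dataclass
class Route:
    name: str
    keep: Callable[[Event], bool]     # rule: should this processor see the event?
    handle: Callable[[Event], None]   # processor: what to do with the event


@dataclass
class ControlPlane:
    routes: list[Route] = field(default_factory=list)
    discarded: int = 0

    def register(self, route: Route) -> None:
        """Programmatic access: routes are added through a function call."""
        self.routes.append(route)

    def process(self, event: Event) -> list[str]:
        """Deliver the event to every route whose rule accepts it."""
        delivered = []
        for route in self.routes:
            if route.keep(event):
                route.handle(event)
                delivered.append(route.name)
        if not delivered:
            self.discarded += 1  # no processor wanted it, so it is dropped
        return delivered


pages, metrics = [], []
plane = ControlPlane()
plane.register(Route("pager", lambda e: e["status"] >= 2, pages.append))
plane.register(Route("tsdb", lambda e: "value" in e, metrics.append))

results = [
    plane.process(event)
    for event in [
        {"check": "cpu", "status": 2, "value": 97.0},
        {"check": "disk", "status": 0, "value": 41.5},
        {"check": "heartbeat", "status": 0},
    ]
]
```

In this sketch the critical CPU event reaches both processors, the healthy disk event is stored as a metric only, and the heartbeat with no value is discarded. Adding a new destination is a matter of registering another route, which is where the flexibility comes from.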

Internally, a monitoring control plane is configured via declarative templates that define and register the various components of your monitoring infrastructure, including collectors (in Sensu, we call these “checks”), available processors (in Sensu, we call these “handlers”), remediation actions, and rules or policies (which we call “filters”) that control the routing of data to those processors. Templates can be simple definitions for individual components of the monitoring control plane or more complex definitions that configure end-to-end monitoring workflows.
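As a rough illustration of what “declarative templates that define and register components” might look like, the Python sketch below loads a small JSON template into a registry and resolves a check into its end-to-end workflow. The field names (`type`, `name`, `handlers`, `filters`, `min_status`) and the `load` and `workflow` helpers are assumptions made for this example; they are not Sensu’s actual resource schema.

```python
import json

# A hypothetical template: one filter, one handler, and one check.
TEMPLATE = json.loads("""
[
  {"type": "filter",  "name": "only_failures", "min_status": 1},
  {"type": "handler", "name": "slack", "filters": ["only_failures"]},
  {"type": "check",   "name": "cpu", "command": "check-cpu",
   "interval": 60, "handlers": ["slack"]}
]
""")


def load(definitions):
    """Register components by type and name, then verify cross-references."""
    registry = {"check": {}, "handler": {}, "filter": {}}
    for definition in definitions:
        kind = definition["type"]
        if kind not in registry:
            raise ValueError(f"unknown component type: {kind}")
        registry[kind][definition["name"]] = definition
    for check in registry["check"].values():
        for handler_name in check.get("handlers", []):
            if handler_name not in registry["handler"]:
                raise ValueError(f"unknown handler: {handler_name}")
    for handler in registry["handler"].values():
        for filter_name in handler.get("filters", []):
            if filter_name not in registry["filter"]:
                raise ValueError(f"unknown filter: {filter_name}")
    return registry


def workflow(registry, check_name):
    """Resolve a check into (check, filters, handler) paths."""
    check = registry["check"][check_name]
    return [
        (check_name, tuple(registry["handler"][h].get("filters", [])), h)
        for h in check.get("handlers", [])
    ]


registry = load(TEMPLATE)
cpu_workflow = workflow(registry, "cpu")
```

Because the template is plain data, it can be versioned, reviewed, and applied repeatedly, and a broken reference (a check pointing at a handler that does not exist) is rejected at load time rather than discovered during an incident.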

So, let’s return to our five important attributes of a control plane, and see how they line up with a monitoring control plane:

  • Routing: monitoring data transported over the data layer, processed by the control plane
  • Configuration: declarative templates
  • Management: scheduler & data processing layer
  • Programmatic access: API
  • Flexibility: extensibility & custom workflows

The monitoring control plane in practice

We’ve seen the need arise, particularly among enterprises with sprawling, multi-generational infrastructure, for a comprehensive approach to monitoring. At enterprise scale, it’s often not possible to achieve this via a single monitoring tool or product, so the solution is all about composability: using best-in-class tools for each individual component of the system. But this creates a new problem: how do you provide a unified interface for management of such a complex system? We think the answer is a monitoring control plane. We have seen companies spend years building custom observability pipelines with rudimentary management interfaces, typically resulting in a technically impressive conference talk or blog post. While they are successful in solving their unique challenges, the results are rarely reusable outside of the teams that invented them.

Here are just a few interesting examples from some of our favorite conference talks. If we had more time, we could build a gallery of hundreds or thousands of examples. But we probably don’t need to. You’ve seen these architecture diagrams before, and you’ll see them again. And if you look closely, you’ll see the pattern: they all manifest the attributes of a control plane (routing, configuration, management, programmatic access, and flexibility).

We’ve also seen our customers use Sensu as a centralized management interface, long before we acknowledged this use case and began developing first-class features to position Sensu as a comprehensive “monitoring control plane” solution. For example, check out this diagram from a talk by David Beaurpere, Principal Software Engineer @ Workday:

For more information, check out David’s talk from the 2018 Sensu Summit, in which he discusses Workday’s Sensu implementation, leveraging the data layer (monitoring event pipeline) and management layer to build their own monitoring control plane.

So, what did we learn? Control planes are becoming increasingly common as a means to manage the growing complexity of modern software systems. We looked at the example of Kubernetes as a control plane for container orchestration: it provides a centralized management interface for highly complex distributed systems (AKA the new normal of modern infrastructure). If we apply these principles to monitoring, we can also manage highly complex observability solutions in an automated and repeatable fashion.