Fireside chat with Kelsey Hightower, part one: on Kubernetes, legacy tech, and the future of monitoring

At Sensu Summit 2018, Sensu CEO Caleb Hailey and CTO Sean Porter sat down with Kelsey Hightower, Staff Developer Advocate at Google Cloud Platform (GCP), for a fireside chat on a variety of topics, including the evolution of the monitoring space, Kubernetes best practices, their opinions on an open core business model, how operators’ jobs are changing, and more.

Kelsey, Sean, and Caleb discussing all things Kubernetes and open source at Sensu Summit 2018

If you’re familiar with Kelsey, then you already know that he’s an open source champion and Kubernetes expert (see kubernetes-the-hard-way) who takes a refreshing stance on what it means to be an “expert”: he won’t talk about technology that he isn’t actively using, and he maintains that if a new technology isn’t right for you, you simply shouldn’t use it.

Their discussion is worth a watch (find it here, along with lots of other great talks from Sensu Summit 2018), and since they covered a lot of ground, I’d like to share some of the key takeaways in this two-part series.

One of the primary motivations that connects Kelsey, Sean, and Caleb is their passion for making operators’ lives easier. The right tools allow you to do more than keep up with the rapid pace of change in technology — they empower you to come up with realistic (and actionable) solutions to complex problems, and to bring real value to both your company and career.

Let’s dive in.

The evolution of monitoring — and addressing the challenge of legacy systems

They opened the discussion by talking about the evolution of monitoring tools and how Kubernetes and Sensu fit together.

Monitoring has come a long way since the introduction of traditional solutions like Nagios. Gone are the days of black box monitoring where you just run a check from the outside and expect a complete picture of application health.

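Now, lots of tools have monitoring built into them. For example, Kubernetes has liveness and readiness probes: a failing liveness probe causes a container to be restarted, and a failing readiness probe keeps a pod out of service traffic until it is ready, which helps protect application health while a deployment is being updated.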
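To make the probe idea concrete, here is a minimal sketch of the application side: a tiny HTTP handler that a liveness probe and a readiness probe could call. The paths /healthz and /ready, the STATE flag, and the port are assumptions chosen for illustration; nothing here comes from the talk itself.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state; a real service would flip this once
# its dependencies (database, caches, and so on) are available.
STATE = {"ready": False}


def probe_status(path, ready):
    """Return (status_code, body) for a probe request."""
    if path == "/healthz":
        # Liveness: the process is up and able to answer at all.
        return 200, "ok"
    if path == "/ready":
        # Readiness: report success only once the app can take traffic.
        return (200, "ready") if ready else (503, "not ready")
    return 404, "not found"


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        code, body = probe_status(self.path, STATE["ready"])
        payload = body.encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Block and serve the probe endpoints on the given port."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

Calling serve() starts the listener; a pod spec would then point its HTTP probes at these two paths. Keeping the two checks separate lets a slow-starting app stay alive without receiving traffic too early.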

Kelsey explained that using Sensu to monitor Kubernetes enables a more sophisticated level of monitoring. Using the sidecar pattern, you can pass metrics about everything running in Kubernetes to Sensu while keeping the same execution context across both; alternatively, you can use the push-pull method via Sensu’s API. (For a deep dive on monitoring Kubernetes, check out Sean’s series.)

Pulling this data out of Kubernetes can be useful on its own, but the real value lies in what you do with it next, like gaining deeper visibility into your infrastructure and applications.

Kelsey mentioned that monitoring tools can be used like policy engines — to handle alerts and dictate an automated response — which is something we’ve seen from the Sensu Community, too (see this post from Community Maintainer Ben Abrams on automating triage and remediation to cut down on alert fatigue).

Auto-remediation, or policy through configuration, is one of the ways to make use of Sensu workflows. Getting data passed from legacy tools to modern tools in a standard format is a major part of creating a monitoring workflow. (For more on workflow automation for monitoring, check out Caleb’s blog post.)
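As a rough illustration of "policy through configuration," the sketch below maps incoming alerts to automated responses with a plain lookup table. The event fields (check, severity), the action names, and the POLICY table are all hypothetical; this is not Sensu's event format or handler API, just the general shape of the idea.

```python
# Hypothetical policy table: (check name, severity) -> automated response.
POLICY = {
    ("nginx", "critical"): "restart_service",
    ("disk", "warning"): "open_ticket",
}


def decide(event, policy=POLICY, default="notify_oncall"):
    """Pick a response for an alert; fall back to paging a human."""
    key = (event["check"], event["severity"])
    return policy.get(key, default)
```

For example, decide({"check": "nginx", "severity": "critical"}) selects the restart action, while any alert without a matching rule falls through to the default. Because the policy lives in data rather than code, changing the response to an alert is a configuration change.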

Later on, they returned to the issue of integrating emerging technologies with legacy infrastructure. A community member whose company recently initiated a DevOps transformation asked how to approach monitoring legacy systems, given that his team uses Prometheus endpoints to monitor their modern infrastructure.

Sean responded by mentioning the Sensu Prometheus Collector, a Sensu plugin that automatically scrapes HTTP endpoints, filters out the data you need, puts it through the pipeline, and stores it in the TSDB of your choosing.
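The scrape-and-filter step can be sketched in a few lines. The example below parses text in the Prometheus exposition format and keeps only the metrics whose names start with a given prefix. It is a simplified illustration and not the Sensu Prometheus Collector's implementation; the function names and sample data are made up, and the parser ignores details such as escaped label values.

```python
SAMPLE = """\
# HELP http_requests_total Total requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027
go_goroutines 42
"""


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name, rest = line.split("{", 1)
            labels, rest = rest.rsplit("}", 1)
        else:
            name, rest = line.split(None, 1)
            labels = ""
        # The value is the first token after the name and labels;
        # an optional timestamp may follow and is ignored here.
        value = float(rest.split()[0])
        samples.append((name, labels, value))
    return samples


def filter_samples(samples, prefix):
    """Keep only samples whose metric name starts with the prefix."""
    return [s for s in samples if s[0].startswith(prefix)]
```

In a real pipeline the text would come from an HTTP GET against a metrics endpoint, and the filtered samples would be forwarded to a time-series database.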

Where this starts to get really exciting is when we think about taking that pattern of workflow automation for monitoring and extending it across the entire stack. Because Sensu is scalable and extensible, it doesn’t matter if you’re dealing with legacy systems or newer technologies — Sensu is designed to collect and filter data from the underlying infrastructure, and the apps and services running on it, all into the same pipeline.

🤔 Containerize all the things (?)

Earlier, Caleb had asked Kelsey to share his perspective on best practices for managing infrastructure in a containerized world. Given the ubiquity of container- and cloud-based infrastructure, alongside old tech that isn’t going away, have any best practices for Kubernetes emerged?

Kelsey wisely noted that every platform has a set of constraints and tradeoffs; best practices rise out of limitations as much as they do out of feature sets.

For example, Kubernetes has a set of primitives that you can string together to express your compute pattern. Kubernetes takes the patterns that become standard and turns them into defaults; that’s where namespaces and volumes come from. When the project learns something from the community, the maintainers roll it back into Kubernetes. In that sense, best practices begin to emerge.

Still, as Kelsey said, “You can’t solve every problem reliably for everyone in the same way.”

And it’s probably not worth your time to try to solve every problem using Kubernetes. He gave the example of running a database on Kubernetes. Kubernetes actually works a lot like a hypervisor, except it doesn’t allow static IPs. Kelsey said you’d have to unwind some of the things Kubernetes does. You’d tell it: “Don’t reschedule this particular container anywhere else.” Or, “Net equals host. Don’t give me a dynamic IP.”

They agreed that when you’re taught Kubernetes (or any tool, for that matter), you probably aren’t taught these kinds of nuances. It might be tempting to try to do everything you can with the tool just because it’s new and shiny. Or maybe you’ve been given a top-down mandate from bosses or other departments to do something with it, and the decision is out of your hands.

But just because you can do something with a tool doesn’t mean you should.

Kelsey broke it down like so: “I wish people knew that a container is just a tarball with a specific layout.” Could you put any application into a tarball? Kelsey said the technical answer is yes. So you’d have a tarball with some metadata, you could push it to a system, open it up and run what’s inside. That might sound good, he said, but that’s what your kernel already does.
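The "tarball with a specific layout" point is easy to try out. The sketch below packs some files plus a small metadata file into a tar archive in memory, then opens it back up. The layout (a manifest.json next to a rootfs/ directory) and the entrypoint field are invented for illustration; real container image formats define their own, more detailed layout.

```python
import io
import json
import tarfile


def build_bundle(files, metadata):
    """Pack files and a manifest.json into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        entries = dict(files)
        # Hypothetical metadata file describing what to run.
        entries["manifest.json"] = json.dumps(metadata).encode("utf-8")
        for name, data in entries.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_bundle(blob):
    """Open the archive and return its file names and parsed manifest."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        names = tar.getnames()
        manifest = json.load(tar.extractfile("manifest.json"))
    return names, manifest
```

A call such as build_bundle({"rootfs/bin/app": b"#!/bin/sh\necho hi\n"}, {"entrypoint": "rootfs/bin/app"}) produces bytes you could push to another system, where read_bundle recovers the file list and the entrypoint to run.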

If you think about it from this perspective, containerizing all the things misses the point.

Kelsey advised taking a step back and getting a command of what Kubernetes can do, coupled with what you’re trying to achieve, and more importantly, why you’re trying to achieve it.

He said that if you see an obvious path, you shouldn’t feel compelled to take it just because it’s easy and obvious. The better path might be to opt out of doing something to make room for work that’s higher priority and more impactful.

Next up: evolving as an operator (and as a business)

In my next post, I’ll continue to recap their fireside chat, revisiting the themes of impact and value within the context of business (and pricing) decisions as well as from the standpoint of individual contributors.

Regardless of your role in an organization, you face important decisions when you encounter emerging technologies. Should you be intimidated by new tools, particularly if they disrupt existing ways of working? How can you bring greater value to your organization?

Stay tuned.