Tales of scale, multi-gen infrastructure, & monitoring Kubernetes at Sensu Summit 2018

Sensu Summit 2018 was a two-day extravaganza, with keynotes, workshops, donuts, and all the monitoring love.

A great group of faces at Sensu Summit 2018.

We got together at the Portland Art Museum August 22-23 to hear about Kubernetes, customer stories of scale and legacy infrastructure, monitoring how-tos, and workflow automation for monitoring. Sensu CEO Caleb Hailey kicked off the event, welcoming everyone to the conference and hinting toward what’s in store. CTO Sean Porter emphasized that ephemeral infrastructure is the new normal — in turn, we have to rethink our approach to monitoring to include both ephemeral and multi-generational infrastructures. As Sean noted, “We believe in supporting established platforms.”

How Sensu wins, according to Sean, is with the community: as domain knowledge experts, the community knows how to operate these systems. Together, we can embrace the idea of composing these systems and share the community’s insights.

Next up was a fireside chat with Caleb, Sean, and Kelsey Hightower, Staff Developer Advocate at Google Cloud Platform. They discussed the hybrid cloud reality we live in today, and how monitoring fits into new approaches like ephemeral and serverless infrastructures. Kelsey championed the role of systems administrators, especially in terms of how we approach new technology.

(For a deeper dive, Caleb and Kelsey continued the conversation with The New Stack at lunch — listen to the podcast here.)

We also heard stories of scale — like Trent Baker of Box.com on how they migrated 350,000 Nagios objects to Sensu, allowing them to scale horizontally, and Workday’s David Beaurpere on how, with their old solution, they had “way too many alerts” that no one listened to. According to David, “Sensu is so crucial and relevant” because it allows Workday to “herd cats and catch fire,” AKA, Sensu enables monitoring for heterogeneous systems. Continuing in that vein, we also heard from Christopher J. Caillouet of Industrial Light & Magic who — in addition to showing us some amazing behind-the-scenes looks at how movies are made — emphasized the value of legacy systems.

For ILM, “legacy is legend”: so many technologies (both legacy and cloud-based) go into making a movie that the team needed a monitoring solution that was open source, scalable, offered a flexible toolkit, and wasn’t a single point of failure.

After lunch, we heard from a few more Sensu folks, including Customer Support Engineer Aaron Sachs on how to build and monitor your own Kegerator with Sensu + Raspberry Pi (plus some of the challenges he encountered). SVP of Engineering Greg Poirier walked us through assets in 2.0 with some excellent hand-drawn slides, highlighting how Sensu enables workflow automation that’s unique to individual infrastructure and business process. He also stressed that with 2.0, there will be less configuration management, and more Sensu, helping reduce operational burden and complexity.

We continued into the afternoon with T-Mobile’s Chris Chandler, who showed us how to monitor without having to deploy clients (plus some use cases for why you’d want to: for example, clientless monitoring is 100% technology agnostic, so all you need to do is ship a check result to Sensu’s API). Apptio’s Lee Briggs taught us how to monitor Kubernetes with Sensu 1.x, calling out Sensu’s flexibility when it comes to an organization transitioning to new technology.
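To make the clientless idea concrete, here is a minimal sketch of shipping a check result to a Sensu 1.x API. It is not taken from Chris's talk. It assumes the API is reachable at `localhost:4567` and accepts results via `POST /results`; the host name `legacy-switch-01`, the check name `uplink_status`, and both helper functions are hypothetical names chosen for illustration.

```python
import json
import urllib.request

# Assumption: a Sensu 1.x API is reachable at this address.
SENSU_API_URL = "http://localhost:4567/results"


def build_check_result(source, name, output, status):
    """Build a check result for a host that runs no Sensu client.

    This sketch accepts only the four conventional exit statuses:
    0 (OK), 1 (warning), 2 (critical), 3 (unknown).
    """
    if status not in (0, 1, 2, 3):
        raise ValueError("status must be 0, 1, 2, or 3")
    return {"source": source, "name": name, "output": output, "status": status}


def send_check_result(result, url=SENSU_API_URL, timeout=5):
    """POST the result as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(result).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical device and check names.
    result = build_check_result(
        "legacy-switch-01", "uplink_status", "uplink is up", 0
    )
    print(json.dumps(result))
    # Uncomment once a Sensu API is actually listening at SENSU_API_URL:
    # send_check_result(result)
```

Because the payload is plain JSON over HTTP, anything that can make an HTTP request (a cron job, a network appliance, a build script) could report in this way, which is what makes the approach technology agnostic.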

Julian Dunn and Fletcher Nichol of Chef used some excellent real-world architecture examples to illustrate microservices architecture, and demoed Sensu running under Habitat.

Caleb Hailey closed out day 1 with a tale of the evolution of Sensu’s messaging — and how workflow automation for monitoring helps empower our users. (For more on this approach, check out Caleb’s article in The New Stack.)

We kicked off day 2 with made-to-order Pip’s Donuts, followed by talks, workshops, and unconference sessions. Paul Reed of Release Engineering Approaches examined what we can learn from failure — as illustrated by less-than-successful plane landings — and warned against the dangers of blameless postmortems. Doximity’s Ben Abrams shared his advice for getting rid of alert fatigue (don’t alert on something that’s not actionable, realize that criticality isn’t the same as urgency, and empower those building the product, to name a few) with some tips for fine-tuning Sensu. Garrett Honeycutt of GH Solutions walked us through the Sensu + Puppet module, taking a look at the difference between Sensu 1.0 and 2.0. Bonus: there’s a Puppet module for 2.0 (with a Vagrant box) that you can grab right now. Next, we heard from David Schroeder, who discussed Viasat’s setup, highlighting how they use Ansible to configure and deploy Sensu for multiple teams.

The remainder of day 2 was devoted to community unconference sessions, with workshops on Sensu 2.0, contributing to Sensu docs, and tutorials from our friends at InfluxData and Grafana. We closed out the day with a high-spirited lightning talk from Pivotal’s Paul Czarkowski, who took us through the evolution of DevOps (or: devops, or: Devops).

Since this was in fact a community unconference, VP of Community Matt Broberg invited folks to capture and share their experiences.

We’ve included the videos and links to slides below. Thanks to everyone who attended, and see you next year!

7 Years of Sensu: Then, Now, and Soon | Sean Porter, Sensu

Slides

Fireside chat with Kelsey Hightower, Sean Porter, and Caleb Hailey

The Box.com success story: migrating 350K Nagios objects to Sensu | Trent Baker, Box.com

Slides

Herding Cats & Catching Fire | David Beaurpere, Workday

Slides

Project 3M: Meaningful Monitoring and Messaging | Christopher J. Caillouet, Industrial Light & Magic

Slides

Where’s My Beer: Building a Better Kegerator with a Raspberry Pi & Sensu | Aaron Sachs, Sensu

Slides

Clientless Monitoring with Sensu | Chris Chandler, T-Mobile

Slides

Assets in Sensu 2.0 | Greg Poirier, Sensu

Slides

Sensu and Kubernetes 1.x | Lee Briggs, Apptio

Slides

Pull, don’t push: Architectures for monitoring and configuration in a microservices era | Julian Dunn & Fletcher Nichol, Chef

Slides

Reimagining Sensu | Caleb Hailey, Sensu

Slides

Failure as Success: the mindset, the methods, and the landmines | Paul Reed, Release Engineering Approaches

Alert Fatigue: Avoidance and Course Correction | Ben Abrams, Doximity

Slides

Sensu and Puppet | Garrett Honeycutt, GH Solutions

Slides

Sharing Sensu with multiple teams using Ansible | David Schroeder, Viasat

Slides

Lightning talk: DevOps is dead and servers are dying | Paul Czarkowski, Pivotal