Sensu: monitoring as a service (part 1): Monitoring external services with Sensu Core 1.4 proxy clients

It’s a small world after all

Microservices are everywhere. Many of us are building applications that rely on a mix of them spread throughout the network. Instead of one monolithic server, we’ve broken down the business logic of our application into small, well-defined, and maintainable services communicating with each other over the network.


It’s a world of laughter when everything is working well, but when something goes awry and external service endpoints you rely on become inaccessible, it’s a world of tears…so many tears. Hopefully we’ve built our applications to be robust against service disruptions and to alert us when there’s a problem through application-level monitoring mechanisms (like Sensu!). But Sensu’s mission as a full stack monitoring solution goes further than just your own production application code, giving you a platform to easily monitor external services as well.

I joined Sensu as a Developer Advocate in May of this year, and while coming up to speed on all things Sensu, I’ve learned Sensu provides a very simple way to poll external service availability in the form of a proxy check request and proxy clients. This capability makes it really easy to field external service checks without having to instrument an application codebase directly or make changes to a running server (assuming there is a server) hosting an external service that your application relies on.

In this post, I’ll offer a detailed walkthrough of a working example using proxy clients. If you just want the technical bits to review later, you can find them at https://github.com/jspaleta/sensu-blog-MAAS-pt1.

Polling an external service via proxy check

To save time, I’m assuming you have a working Sensu Core 1.4 server and client available and can install the sensu-plugins-http plugin to the client. If you don’t, it might be worth taking a quick detour and following the Learn Sensu in 15 Minutes Guide, then jump back here.

Here’s a simple HTTP proxy check configuration to poll the sensu.io website:

{
  "checks": {
    "sensu-website": {
      "command": "check-http.rb -u https://sensu.io",
      "subscribers": [
        "http-proxy"
      ],
      "interval": 60,
      "source": "sensu.io",
      "handler": "http-proxy-slack"
    }
  }
}

With this configuration in place, any Sensu client subscribed to http-proxy will run the check, but will send the check result as if it came from a client named sensu.io, the value of the source attribute. When the Sensu server processes that result, it registers a proxy client named sensu.io in a just-in-time manner.

Being able to set the source for a check is a really awesome feature. It lets you organize your view of registered clients in your dashboard into logical buckets of your own devising, instead of having all the event alerts come in from the same client. That is very useful for keeping external service monitoring (and alert handling) organized.

For completeness, here’s my client config for the running client:

{
 "client": {
   "name": "test-client",
   "address": "127.0.0.1",
   "environment": "development",
   "subscriptions": [
     "http-proxy",
     "roundrobin:proxy-request",
     "dev"
   ],
   "socket": {
     "bind": "127.0.0.1",
     "port": 3030
   }
 }
}

This test-client is subscribed to http-proxy and will get check requests for the sensu-website check. After test-client runs the sensu-website check, here’s the list of clients via the Sensu API clients endpoint:

curl http://127.0.0.1:4567/clients | jq
...
[
  {
    "name": "sensu.io",
    "address": "unknown",
    "subscriptions": [
      "client:sensu.io"
    ],
    "keepalives": false,
    "version": "1.4.1",
    "timestamp": 1526678105,
    "type": "proxy"
  },
  {
    "name": "test-client",
    "address": "127.0.0.1",
    "environment": "development",
    "subscriptions": [
      "http-proxy",
      "roundrobin:proxy-request",
      "dev",
      "client:test-client"
    ],
    "socket": {
      "bind": "127.0.0.1",
      "port": 3030
    },
    "version": "1.4.1",
    "timestamp": 1526678107
  }
]

The Sensu server has automatically registered a proxy client named sensu.io with an unknown address, as part of processing the sensu-website check result.
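If you want to pick the proxy clients out of that list from a script instead of reading the jq output, here is a minimal Python sketch. It assumes the API is reachable at the same http://127.0.0.1:4567 address used above and relies on the "type": "proxy" field shown in the output; the function names are my own, not part of Sensu.

```python
import json
import urllib.request

SENSU_API = "http://127.0.0.1:4567"  # same address as the curl example above


def fetch_clients(api_url=SENSU_API):
    """Return the parsed JSON list served by the /clients endpoint."""
    with urllib.request.urlopen(f"{api_url}/clients") as response:
        return json.load(response)


def proxy_client_names(clients):
    """Names of the clients marked with "type": "proxy"."""
    return [c["name"] for c in clients if c.get("type") == "proxy"]


# Trimmed copy of the API output above, so the sketch runs without a server.
sample_clients = [
    {"name": "sensu.io", "address": "unknown", "keepalives": False, "type": "proxy"},
    {"name": "test-client", "address": "127.0.0.1"},
]
print(proxy_client_names(sample_clients))  # ['sensu.io']
```

Against a live server, proxy_client_names(fetch_clients()) would apply the same filter to the real client list.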

The upside to this approach is you have full access to the check configuration so you can tailor your check interval and check command with service details such as headers or data to send as part of a POST operation. Just drop in a check definition for each external process you want to monitor, and set the source attribute to the client name you want to use to organize the results. The same configuration pattern can be used beyond HTTP services to check on other network services using alternative check commands, such as those provided by sensu-plugins-network-checks.
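If you end up with more than a handful of these, you could generate the check definitions instead of writing each one by hand. The Python sketch below is one way to do that under a few assumptions: the helper name is hypothetical, example.com is a made-up second service, and every check reuses the subscribers, interval, and handler from the sensu-website example.

```python
import json


def proxy_check(name, url, source, interval=60, handler="http-proxy-slack"):
    """Build one check definition in the same shape as sensu-website."""
    return {
        name: {
            "command": f"check-http.rb -u {url}",
            "subscribers": ["http-proxy"],
            "interval": interval,
            "source": source,
            "handler": handler,
        }
    }


# Hypothetical service list; only sensu.io comes from the example above.
services = [
    ("sensu-website", "https://sensu.io", "sensu.io"),
    ("example-website", "https://example.com", "example.com"),
]

checks = {}
for name, url, source in services:
    checks.update(proxy_check(name, url, source))

config = {"checks": checks}
print(json.dumps(config, indent=2))
```

The printed JSON has the same layout as the hand-written check configuration, with one entry per service.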

The drawback to this approach is that you will need to implement a new configuration management change for each new external service you’d like to monitor. Sensu Core 1.x doesn’t allow you to create checks programmatically via an API call; each new check must be added into the configuration files on the server (though if this capability interests you, you should take Sensu Core 2.0 Beta for a spin).

But you can still do some very cool things with statically defined proxy checks. For example, if you want to manage simple service availability polling on the fly in Sensu Core 1.x, you can flip the script: register proxy clients dynamically via the Sensu API and reuse a single proxy-request check configuration for all of them.

Introducing proxy-requests

Here is a slightly more complicated check configuration that can be re-used by multiple proxy clients:

{
  "checks": {
    "check-http-proxy-request": {
      "command": "check-http.rb -u :::url:::",
      "subscribers": [
        "roundrobin:proxy-request"
      ],
      "interval": 60,
      "handler": "http-proxy-slack",
      "proxy_requests": {
        "client_attributes": {
          "url": "eval: defined?(value)",
          "subscriptions": "eval: value.include?('http-proxy-request')"
        }
      }
    }
  }
}

This check configuration introduces two new concepts:

  • First, it uses check token substitution for the URL used in the check command attribute value, substituting the client url attribute value.
  • This check configuration also introduces the proxy_requests attribute, a hash which defines client_attributes, conditions that must be met for the check to run on the client. In this example, the proxy client url attribute must be defined, and the proxy client must be subscribed to http-proxy-request.

The proxy_requests attribute is a lovely bit of magic that gets parsed when the Sensu server schedules check requests. When it is defined in a check, the server issues one check request to the check’s subscribers for each registered client that matches the client_attributes conditions. Essentially, a real Sensu client ends up running the check on behalf of other registered clients.
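To make that fan-out concrete, here is a simplified Python model of the behavior as I’ve described it. It is not Sensu’s implementation: the two Ruby eval conditions are replaced by plain Python stand-ins that mirror the description above, the client names are hypothetical, and only the :::url::: token is handled.

```python
def matches(client):
    """Python stand-ins for the two client_attributes conditions."""
    has_url = client.get("url") is not None
    subscribed = "http-proxy-request" in client.get("subscriptions", [])
    return has_url and subscribed


def check_requests_for(check_name, command_template, clients):
    """Build one check request per matching proxy client."""
    requests = []
    for client in clients:
        if not matches(client):
            continue
        requests.append({
            "name": check_name,
            # Token substitution: fill in the proxy client's url attribute.
            "command": command_template.replace(":::url:::", client["url"]),
            # The result will be attributed to the proxy client.
            "source": client["name"],
        })
    return requests


# Hypothetical registered clients.
clients = [
    {"name": "proxy_example.com", "url": "https://example.com",
     "subscriptions": ["http-proxy-request"]},
    {"name": "no-url-client", "subscriptions": ["http-proxy-request"]},
    {"name": "test-client", "subscriptions": ["roundrobin:proxy-request"]},
]

requests = check_requests_for(
    "check-http-proxy-request", "check-http.rb -u :::url:::", clients
)
for request in requests:
    print(request["source"], "->", request["command"])
# proxy_example.com -> check-http.rb -u https://example.com
```

Only the first client satisfies both conditions, so only one check request is produced, and its source is the proxy client rather than the client that will actually execute the command.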

Registering proxy clients via the Sensu API

Here’s how it looks in practice — the configuration snippets above have everything we need:

  • The test-client is subscribed to roundrobin:proxy-request.
  • check-http-proxy-request is configured so that test-client will run this check on behalf of other registered clients.

A small note: the roundrobin: prefix in the subscription tells the Sensu scheduler to run the check on a single client from the pool of subscribers. Once you have many of these proxy clients defined, the roundrobin: scheduling feature will let you easily spread the check request load across a pool of Sensu clients.

Now all you need to do is register new proxy clients with the Sensu server and make sure they have the url and subscriptions attributes set to match what check-http-proxy-request expects. For that, we use the Sensu API directly:

curl -s -i -X POST -H \
'Content-Type: application/json' \
-d '{"name":"proxy_sensu.io", "address":"unknown", "subscriptions":["http-proxy-request"], "environment":"development", "handler":"http-proxy-slack", "url":"https://sensu.io"}' \
http://127.0.0.1:4567/clients

The above cURL command will register a new client named proxy_sensu.io, subscribe the client to http-proxy-request, and set the additional url attribute that we need to match the proxy_requests conditions we set for the check. You’ll notice I set a few more attributes; I’ll cover some of those shortly.
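If you would rather script the registration than type the cURL command, a rough Python equivalent using only the standard library might look like this. The function names are hypothetical, and the payload simply copies the attributes from the cURL example; building the request is kept separate from sending it so you can inspect it without a running server.

```python
import json
import urllib.request

SENSU_API = "http://127.0.0.1:4567"  # same address as the cURL example


def build_register_request(name, url, api_url=SENSU_API):
    """Build (but do not send) the POST request for one proxy client."""
    client = {
        "name": name,
        "address": "unknown",
        "subscriptions": ["http-proxy-request"],
        "environment": "development",
        "handler": "http-proxy-slack",
        "url": url,
    }
    return urllib.request.Request(
        f"{api_url}/clients",
        data=json.dumps(client).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_proxy_client(name, url, api_url=SENSU_API):
    """Send the request and return the HTTP status code."""
    request = build_register_request(name, url, api_url)
    with urllib.request.urlopen(request) as response:
        return response.status


request = build_register_request("proxy_sensu.io", "https://sensu.io")
print(request.get_method(), request.full_url)
# POST http://127.0.0.1:4567/clients
```

Calling register_proxy_client("proxy_sensu.io", "https://sensu.io") would then send the same request as the cURL command.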

Here’s my new client list as reported via the Sensu API:

[
  {
    "name": "sensu.io",
    "address": "unknown",
    "subscriptions": [
      "client:sensu.io"
    ],
    "keepalives": false,
    "version": "1.4.1",
    "timestamp": 1526678105,
    "type": "proxy"
  },
  {
    "name": "proxy_sensu.io",
    "address": "unknown",
    "subscriptions": [
      "http-proxy-request",
      "client:proxy_sensu.io"
    ],
    "environment": "development",
    "handler": "http-proxy-slack",
    "url": "https://sensu.io",
    "keepalives": false,
    "version": "1.4.1",
    "timestamp": 1526678278
  },
  {
    "name": "test-client",
    "address": "127.0.0.1",
    "environment": "development",
    "subscriptions": [
      "http-proxy",
      "roundrobin:proxy-request",
      "dev",
      "client:test-client"
    ],
    "socket": {
      "bind": "127.0.0.1",
      "port": 3030
    },
    "version": "1.4.1",
    "timestamp": 1526678287
  }
]

There are three clients registered in total now, and the only running client is defined via configuration management as test-client. The other two clients are just definitions that live inside the Sensu data store. sensu.io is a proxy client registered in a just-in-time manner as the result of the sensu-website check. The proxy_sensu.io client is the one we just created via the Sensu API, with attributes set to match the proxy client conditions in the check-http-proxy-request check configuration.

That’s it: everything is in place to have test-client run check-http-proxy-request on behalf of the proxy_sensu.io client. Here’s what the check request looks like in the sensu-server.log:

{
  "timestamp": "2018-05-18T21:22:30.153425+0000",
  "level": "info",
  "message": "publishing check request",
  "payload": {
    "command": "check-http.rb -u https://sensu.io",
    "handler": "http-proxy-slack",
    "proxy_requests": {
      "client_attributes": {
        "url": "eval: defined?(value)",
        "subscriptions": "eval: value.include?('http-proxy-request')"
      }
    },
    "name": "check-http-proxy-request",
    "source": "proxy_sensu.io",
    "issued": 1526678550
  },
  "subscribers": [
    "roundrobin:proxy-request"
  ]
}

The important thing to notice from the check request is that the source attribute and command attribute are set from the proxy_sensu.io client attributes. Here’s the corresponding log entry when test-client receives the check request:

{
  "timestamp": "2018-05-18T21:22:30.153745+0000",
  "level": "info",
  "message": "received check request",
  "check": {
    "command": "check-http.rb -u https://sensu.io",
    "handler": "http-proxy-slack",
    "proxy_requests": {
      "client_attributes": {
        "url": "eval: defined?(value)",
        "subscriptions": "eval: value.include?('http-proxy-request')"
      }
    },
    "name": "check-http-proxy-request",
    "source": "proxy_sensu.io",
    "issued": 1526678550
  }
}

Monitoring-as-a-service is born!!!

To add any additional HTTP services you want to monitor, all you need to do is register a new proxy client with different name and url attribute values via the Sensu API. The example above just scratches the surface of what you can do via check token substitution. Check out the companion repository for a more expressive version of the check that can populate the optional arguments for check-http.rb.


Diving deeper

Now that I’ve walked you through the simple version, here’s a more advanced version of the same check that makes aggressive use of optional proxy client attributes and Sensu Enterprise features to build a more expressive monitoring service.

{
  "checks": {
    "check-http-proxy-request": {
      "interval": 20,
      "command": "check-http.rb -u :::url:::  :::command_arguments| :::",
      "subscribers": [
        "roundrobin:proxy-request"
      ],
      "occurrences": ":::occurrences|3:::",
      "refresh": ":::refresh|1800:::",
      "handler": ":::handler|debug:::",
      "contact": ":::contact|unknown:::",
      "proxy_requests": {
        "client_attributes": {
          "url": "eval: defined?(value)",
          "subscriptions": "eval: value.include?('http-proxy-request')"
        }
      }
    }
  }
}

I’ll summarize the changes quickly:

  1. The check interval is now sub-minute, set to 20 seconds.
  2. An optional command_arguments client attribute lets you take full advantage of the check-http.rb command.
  3. Optional occurrences and refresh client attributes are used by the sensu-extensions-occurrences extension to mitigate alert fatigue. The defaults are 3 occurrences before a notice is issued and a refresh of 1800 seconds.
  4. An optional handler client attribute sets the handler name, defaulting to debug.
  5. An optional contact client attribute is useful for Sensu Enterprise users, who can take advantage of the contact routing feature.
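The :::token|default::: syntax used throughout that check is easier to reason about with a small model. The Python sketch below imitates the substitution as described here: use the client attribute when it exists, fall back to the default after the pipe, and fail when neither is available. It is an illustration rather than Sensu’s own code, the function name is mine, and it only looks at top-level client attributes.

```python
import re

# :::name::: or :::name|default:::
TOKEN = re.compile(r":::([^:|]+?)(?:\|(.*?))?:::")


def substitute_tokens(template, client):
    """Replace tokens in template using the client's attributes."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in client:
            return str(client[name])
        if default is not None:
            return default
        raise KeyError(f"unmatched token: {name}")
    return TOKEN.sub(replace, template)


# Proxy client that sets occurrences but not handler or refresh.
client = {"name": "proxy_sensu.io", "url": "https://sensu.io", "occurrences": 5}

print(substitute_tokens(":::handler|debug:::", client))  # debug
print(substitute_tokens(":::occurrences|3:::", client))  # 5
print(substitute_tokens(":::refresh|1800:::", client))   # 1800
```

A proxy client therefore only needs to carry the attributes it wants to override; everything else falls back to the defaults written into the check.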

Can we be even bigger slackers?

We have checks — let’s route the corresponding check events into a Slack channel being monitored by your army of Turing complete Slackbots (or if that’s not available, real live humans). The check configurations I’ve used so far have a Slack handler set for this very purpose. We can now define that Slack handler in the scope of the Sensu Core configuration management.

Grab the sensu-plugins-slack gem and make sure it’s available for the Sensu server to use. You’ll need to add a Slack configuration file with the Slack webhook_url attribute defined:

{
  "http-proxy-slack": {
    "webhook_url": "https://hooks.slack.com/services/blah/blah"
  }
}

If you aren’t sure what the webhook_url value should be, jump over to the Slack API documentation and read up on the incoming webhooks. You’ll want to generate an appropriate application webhook_url scoped to the channel you’d like to route event alerts to.

Now we just have to define the http-proxy-slack handler, as used in the proxy_sensu.io client definition. Here is a basic Slack handler config:

{
  "handlers": {
    "http-proxy-slack": {
      "type": "pipe",
      "command": "handler-slack.rb -j http-proxy-slack"
    }
  }
}

Note: I’m using the optional -j argument to explicitly set which configuration scope to read the webhook_url from. If you want to route different proxy client checks to different Slack channels in Sensu Core using handler-slack.rb, you’ll need to create a separately named config/handler pair for each channel and use the client “handler” attribute to indicate which route to use.
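Since each channel needs its own config/handler pair, generating the pairs from a small table keeps the names in sync. Here is a hedged Python sketch of that idea: the helper and the team-a-slack route are hypothetical, and the webhook URLs are the same placeholders used above.

```python
import json


def slack_route(name, webhook_url):
    """Return (config scope, handler definition) for one Slack channel."""
    config = {name: {"webhook_url": webhook_url}}
    handler = {
        "handlers": {
            name: {
                "type": "pipe",
                # -j points handler-slack.rb at the matching config scope.
                "command": f"handler-slack.rb -j {name}",
            }
        }
    }
    return config, handler


# Hypothetical routes; replace the placeholder webhook URLs with real ones.
routes = {
    "http-proxy-slack": "https://hooks.slack.com/services/blah/blah",
    "team-a-slack": "https://hooks.slack.com/services/blah/blah",
}

for name, webhook_url in routes.items():
    config, handler = slack_route(name, webhook_url)
    print(json.dumps(config, indent=2))
    print(json.dumps(handler, indent=2))
```

Each printed pair matches the layout of the two JSON files shown above, and a proxy client would select a route by setting its handler attribute to the pair’s name.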

Sensu Enterprise users have it a little easier with the integrated Slack support and integrated contact routing. For Sensu Enterprise users, the contact definition will hold the necessary Slack webhook_url that overrides the default Slack handler configuration. No need to build multiple config/handler pairings, just populate the contact information and choose the appropriate integrated handler that comes with Sensu Enterprise, easy-peasy.

Let’s recap what we have

  1. A reusable check definition that will publish check requests for matching proxy clients to subscribed Sensu clients in a round-robin fashion.
  2. A REST endpoint where we can add/remove proxy client definitions on demand.
  3. A way to route check event notifications to specific contacts for each proxy check for a variety of communication channels via handler configuration. (For Sensu Enterprise users, this is done through the integrated contact routing feature.)
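The "remove" half of item 2 has not appeared in this post yet, so here is a minimal Python sketch of it. It assumes the clients endpoint accepts a DELETE on /clients/<name>, which is my understanding of the Sensu Core 1.x API; please confirm against the API reference for your version. The function names are hypothetical.

```python
import urllib.parse
import urllib.request

SENSU_API = "http://127.0.0.1:4567"  # same address used throughout this post


def build_remove_request(name, api_url=SENSU_API):
    """Build (but do not send) a DELETE request for one client."""
    quoted = urllib.parse.quote(name, safe="")
    return urllib.request.Request(f"{api_url}/clients/{quoted}", method="DELETE")


def remove_proxy_client(name, api_url=SENSU_API):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_remove_request(name, api_url)) as response:
        return response.status


request = build_remove_request("proxy_sensu.io")
print(request.get_method(), request.full_url)
# DELETE http://127.0.0.1:4567/clients/proxy_sensu.io
```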

It’s starting to feel like the beginning of an intra-organizational service that lets different teams set up service endpoint monitoring without fear of disrupting other critical monitoring tasks, and without having to access Sensu configuration. Wrap a thin web app around the Sensu API clients endpoint to give teams an easy-to-use web form for setting proxy client attributes, and you are 90% of the way to self-service monitoring inside your org.
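The core of such a web app is little more than turning validated form fields into a proxy client definition. Here is a rough Python sketch of that step, with no web framework involved. The form field names, the proxy_<host> naming rule, and the validation rules are all my own assumptions; the optional attributes are the ones the advanced check reads.

```python
from urllib.parse import urlparse

# Optional client attributes read by the advanced check definition.
OPTIONAL = ("command_arguments", "occurrences", "refresh", "handler", "contact")


def proxy_client_from_form(form):
    """Validate form fields and return a proxy client definition."""
    parsed = urlparse(form.get("url", ""))
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError("url must be an http(s) URL with a host")
    client = {
        "name": form.get("name") or f"proxy_{parsed.hostname}",
        "address": "unknown",
        "subscriptions": ["http-proxy-request"],
        "url": form["url"],
    }
    # Copy only the optional attributes the team actually filled in,
    # so the check's token defaults apply to everything else.
    for key in OPTIONAL:
        if form.get(key):
            client[key] = form[key]
    return client


client = proxy_client_from_form({"url": "https://sensu.io", "contact": "ops"})
print(client["name"], client["contact"])  # proxy_sensu.io ops
```

The resulting dictionary is what the app would POST to the clients endpoint on a team’s behalf.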

Next time: building an even better service with Sensu Core 2.x

Next time I’ll show you how to use the new role-based access control (RBAC) of Sensu Core 2.x (in Beta now!) to build an even better self-servicing monitoring solution inside your org.