Box (NYSE: BOX) offers a hosted storage service and is headquartered in Redwood City, California. Box has been using Sensu successfully for several years across multiple datacenters to monitor over 30k hosts. In this case study, Trent Baker, Senior SRE at Box, describes the company's initial journey of replacing its legacy Nagios deployment with Sensu across the enterprise.
The IT environment included authentication, multiple DNS solutions (BIND, Infoblox), configuration management (Puppet and Chef), and storage, supporting over 80,000 paid enterprise users and 11 million individual users.
The IT team was struggling to scale their existing Nagios deployment. They had deployed multiple Nagios slaves and were using Puppet as the service discovery tool, so any change, such as adding or decommissioning a server, took hours and multiple Puppet runs to propagate through their infrastructure. Any error along the way caused alert storms.
Having such a fragile monitoring platform was preventing the team from progressing towards a modern cloud environment.
Box replaced Nagios with two Sensu HA clusters in their datacenters. Sensu was designed to integrate seamlessly with configuration management tools such as Puppet, Chef & Ansible. All the native Nagios checks could be used as-is in the new Sensu deployment, and all the Nagios classes mapped directly to Sensu subscriptions, saving Box precious time and resources.
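Nagios checks can run unchanged under Sensu because both follow the same plugin convention: the check prints a one-line status message and exits with 0 (OK), 1 (warning), 2 (critical), or 3 (unknown). The Python sketch below illustrates that convention with a disk usage check. It is not one of Box's checks; the names `classify`, `check_disk`, and `main` and the 80%/90% thresholds are assumptions chosen for illustration.

```python
import shutil

# Exit codes shared by the Nagios plugin convention and Sensu checks.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a status code (thresholds are illustrative)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/"):
    """Return (status, message) for the filesystem that holds `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself could not run, which is reported as UNKNOWN.
        return UNKNOWN, f"CheckDisk UNKNOWN: {exc}"
    used_percent = 100.0 * usage.used / usage.total
    status = classify(used_percent)
    label = ("OK", "WARNING", "CRITICAL")[status]
    return status, f"CheckDisk {label}: {used_percent:.1f}% used on {path}"


def main():
    """Print the status line and return the exit code for the scheduler."""
    status, message = check_disk()
    print(message)
    return status
```

Run as a standalone plugin, the script would end with `sys.exit(main())` so that the monitoring agent, whether Nagios or Sensu, reads the result from the process exit code.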
Sensu was easy to deploy, and the team rolled it out to five datacenters. It scaled horizontally across their entire infrastructure, and they used Sensu's Wavefront integration to store metrics, with Sensu acting as the pipeline. Additionally, the Sensu APIs allowed the team to use Sensu as a trigger for auto-remediation of outages.
In the end, having a modern and flexible monitoring platform such as Sensu allowed Box to look ahead to modernizing its IT infrastructure with containers and hybrid cloud environments.