|Nagios 3: Enterprise Network Monitoring|
|author||Max Schubert, Derrick Bennett, Jonathan Gines, Andrew Hay, John Strand|
|summary||Making Nagios 3 work for you and your business.|
Chapters 2 and 3 deal with scaling Nagios to work efficiently within large deployments. First, designing a Nagios configuration for large organizations is shown. This is something that all Nagios administrators should make use of when designing configurations, not only administrators in large organizations, because a properly done configuration for a small organization will easily scale up as the organization grows. I was impressed to see that the authors stress the importance of the end user's input when designing configurations. Administrators who ignore this piece of advice risk the success of Nagios in their organization. Various diagrams help to explain the relationships between the various Nagios configuration objects. A good amount of detail is provided regarding allowing various groups within an organization to have semi-independent control over how Nagios interacts with their hosts and services, and how Nagios alerts their staff. The authors have included numerous configuration file snippets, which allows a Nagios administrator to very quickly create a configuration file and then tweak the configuration parameters to suit local requirements.
Scaling the Nagios graphical user interface (GUI) follows a very simple concept: use a "less is more" approach. Although the specific details here deal with Nagios, the general idea is equally applicable to anyone displaying information they expect their users to actually pay attention to. In general, users should be able to see as much as they want (limited by resources and permissions) but by default be shown only what they need to know. For example, the system administrator for marketing probably does not need to know when the development disk image server goes down, while the development system administrator would probably be very interested. User accounts let the administrator give each group access to Nagios, filtered through its fine-grained permissions system. Users from each group can also be shown only what they need by default, without having to select a particular area first. User accounts also keep users who only need to view Nagios from having full administrative control, and allow each user's actions to be recorded. A patch provided with the book's download package adds read-only accounts as well, which is great for organizations that would like to grant certain users (or groups) access to view Nagios but not make any changes. As an example, an organization's help desk could use Nagios to determine quickly whether users are unable to access services because of an outage, or whether further troubleshooting is necessary.
The authors go on to discuss clustering, failover, and the future of the Nagios GUI. I'm not convinced that these belong in a chapter devoted to scaling the Nagios GUI, since they mostly deal with scaling the entire Nagios deployment. Regardless, they are all very important topics, especially when Nagios is heavily relied upon. Clustering allows each remote site to have a local Nagios instance monitoring its hosts and devices, rather than requiring a central Nagios instance to monitor remote hosts and services. A central instance is at a disadvantage not only because checks take much longer over the WAN links between it and the remote locations, but also because of the security implications of allowing those checks to be made remotely. The authors don't discuss the security side of clustering, but it's still something that every Nagios administrator (and everyone else!) should keep in mind. The clustering section deals primarily with the rationale behind clustering and how to configure the local and remote instances of Nagios properly, but the authors include a good deal of information here that a less experienced Nagios administrator might overlook. Most notable is their discussion about the display of service status when a service is reachable from the master server but not from a remote instance. While Nagios can translate the remote instance's check result to be displayed from its own perspective, it may be more desirable to have the master Nagios GUI display the results from the perspective of the server which made the check. After implementing clustering, some sort of fallback mechanism is required. Failover and redundancy are the two main choices, and that's what the authors discuss next. They don't spend much time on redundancy, since this would require each redundant Nagios instance to perform its own set of checks, which can significantly raise the load on both the monitored hosts and the network in general.
Given the problems redundancy can introduce, even that brief coverage is more time than most administrators should spend considering it. Failover is a much better solution, and the authors do a great job of covering how to set it up properly. As usual, they make sure to remind readers of some things that are easily overlooked, especially when you're trying to get Nagios back up and running after the master server crashes.
Chapters 4 and 5 discuss Nagios plug-ins, add-ons, and enhancements. These chapters alone are worth the price of the book because of how much time they can save. It's much faster to copy a script and make minor tweaks than it is to try reinventing the wheel, and with the number of scenarios covered here combined with the Nagios user community there aren't very many things that haven't been done already. Whether you want to test command-line interfaces, CPU usage, memory utilization, bandwidth utilization, HTML pages, LDAP services, or even specialized hardware, there's probably already a plug-in written for it. Most common scenarios actually have a plug-in already included in this book. The available add-ons and plug-ins are equally varied, providing ways to monitor hosts across security zones, configure read-only displays that live in a security zone other than the one Nagios is in, interface with Cacti, and even read out alerts. Even more scenarios can be handled by other scripts provided by the Nagios community.
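To give a sense of how little code a basic plug-in needs, here is a minimal sketch in Python. It is not taken from the book. It relies only on the standard Nagios plug-in conventions: one line of status text on standard output, optional performance data after a pipe character, and an exit code of 0, 1, 2 or 3 for OK, WARNING, CRITICAL or UNKNOWN. The check itself (disk usage), the function names `evaluate` and `main`, and the 80/90 percent thresholds are assumptions chosen purely for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical disk-usage check following the Nagios plug-in conventions."""
import shutil

# Exit codes defined by the Nagios plug-in conventions.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a (state, message) pair.

    The thresholds are illustrative defaults, not values from the book.
    """
    if used_percent >= crit:
        state = CRITICAL
    elif used_percent >= warn:
        state = WARNING
    else:
        state = OK
    # Text after the pipe is performance data: label=value;warn;crit
    message = "DISK %s - %.1f%% used | used=%.1f%%;%.0f;%.0f" % (
        LABELS[state], used_percent, used_percent, warn, crit)
    return state, message


def main(path="/"):
    """Print the status line and return the exit code Nagios expects."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # Anything that prevents the check from running is UNKNOWN.
        print("DISK UNKNOWN - %s" % exc)
        return UNKNOWN
    state, message = evaluate(100.0 * usage.used / usage.total)
    print(message)
    return state

# To run this as a real plug-in, finish the script with:
#     import sys; sys.exit(main())
```

Keeping the threshold logic in `evaluate`, separate from the code that gathers the measurement, is the kind of structure that makes copying a script and tweaking it for a different resource straightforward.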
Chapter 6 goes into detail on how to integrate Nagios into an enterprise environment. This chapter goes into just enough detail to get Nagios configured to work with a large number of third-party services, such as LDAP authentication, Cacti, Puppet, and Splunk. Emphasis here is always placed on the human element: how to use Nagios to help the help desk and/or NOC staff do their jobs more efficiently and effectively, and how to gain maximum support for Nagios within the organization. The importance of the human element, in all its forms, simply cannot be overstated, and the authors have done a wonderful job of outlining a good way to make Nagios an integral part of an organization. A lot of the material towards the end of the chapter, especially the section on smaller Network Operation Centres, could be used by anyone looking for ways to help a small group work together effectively.
Chapter 7 is another chapter with a lot of content easily applicable outside of a Nagios environment. The chapter begins with the authors reminding you to know your network and to watch out for session hijack attacks, then shows you how to use Nagios to do both. Nagios can't replace a competent network administrator, but it can make their lives easier, and the authors show how the configuration you've already done in Nagios can reveal a potential session hijack attack and how it forces you to properly know your network. Nagios requires you to know not only how your network is built and what devices are in use, but also what constitutes normal conditions for all your devices and services.
Another area where Nagios can assist, and one which is very important to companies, especially those operating in the United States, is regulatory compliance. The authors outline how a company could use Nagios to assist with compliance with Sarbanes-Oxley (SOX) with COBIT or COSO, the Payment Card Industry (PCI) Data Security Standard (DSS), Director of Central Intelligence Directive (DCID) 6/3, and the Department of Defense (DoD) Information Assurance Certification and Accreditation Process (DIACAP). Nagios alone isn't enough to be compliant; at the very least, detailed documentation will also be required. Still, the authors give a good overview of how Nagios can assist with compliance in each of these cases.
The final chapter helps to bring the rest of the book together by walking through a full Nagios configuration for a fictional Fortune 500 corporation. The bulk of this chapter covers the pre-deployment stage of a Nagios deployment, but that doesn't mean that there isn't a lot to learn about deploying Nagios. A major hurdle towards deploying Nagios in an organization is the pre-deployment phase, and the authors outline here how to easily turn this major challenge into a series of simple steps to increase the chances of Nagios' success in your organization. From the very beginning, you can see how involving the customer early and starting small, along with everything else, becomes a part of a process. Although it's specific to Nagios, the process followed here could be easily adapted to integrating any sort of monitoring service. The remainder of the chapter is devoted to how you might integrate Nagios into a Fortune 500 company, finishing the book off with some good advice for integrating Nagios.
Despite all the book's strengths, there is some room for improvement. In chapter 2, it may have been more effective to outline the relationships between the Nagios configuration objects before discussing configuration planning. I found it much easier to think of a configuration for a large organization after knowing about how Nagios' configuration objects relate to each other.
Throughout the book, the authors have included configuration file snippets, scripts, and example script output in the main text. While all of these are quite useful and serve to enhance the book, I think it would have been better if these were all included in an appendix instead, perhaps keeping only the relevant parts of configuration snippets in the main text for clarification.
At the end of chapter 3, the sections on the future of Nagios and the CGI front end are informational and interesting, but they would be better placed in a separate chapter dealing with the potential future of Nagios in general. These and the other major areas of Nagios combined would provide more than enough material for a full chapter on their collective futures.
Overall, this is a great book for anyone using Nagios as more than a casual user, and is still very informative for the casual user. A few of these chapters alone would be worth the price of the whole book.
Disclaimer: I worked with one author when I was asked to review this book.
You can purchase Nagios 3: Enterprise Network Monitoring from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.