Nagios 3 Enterprise Network Monitoring

jgoguen writes "Nagios, originally known as Netsaint, has been a long-time favourite for network and device monitoring due to its flexibility, ease of use, and efficiency. Nagios provided, and still provides today, a low-cost, versatile alternative to commercial network monitoring applications. Nagios 3 takes a huge step forward compared to Nagios 2, providing improved flexibility, ease of use and extensibility, all while also making significant performance enhancements. Due to its extensibility and ease of use, no device or situation has yet been found that cannot be monitored using Nagios and a pre-made or custom script, plug-in or enhancement." Read on for the rest of jgoguen's review.
Nagios 3: Enterprise Network Monitoring
Author: Max Schubert, Derrick Bennett, Jonathan Gines, Andrew Hay, John Strand
Pages: 339
Publisher: Syngress
Rating: 8
Reviewer: jgoguen
ISBN: 978-1-59749-267-6
Summary: Making Nagios 3 work for you and your business.
The first chapter is devoted to new features in Nagios 3. The major changes implemented for Nagios 3, which include changes to data storage options and locations, checks, configuration objects, and macros, are discussed here. Operational, performance, and usability enhancements are also discussed here. Users upgrading from Nagios 2, or users who may already be familiar with Nagios 2, will gain the most from this chapter. New users will still gain value from this chapter, however, since a number of changes also involve some of the major features of Nagios. In addition, users who may be referring to configuration file samples created for Nagios 2 will save a great deal of time by referring to this chapter for changes. Using Nagios 2 configuration files directly prevents users from enjoying some new features of Nagios 3. Users who will only be writing plug-ins and scripts for their local Nagios deployment might not find Chapter 1 very useful.

Chapters 2 and 3 deal with scaling Nagios to work efficiently within large deployments. First, designing a Nagios configuration for large organizations is shown. This is something that all Nagios administrators should make use of when designing configurations, not only administrators in large organizations, because a properly done configuration for a small organization will easily scale up as the organization grows. I was impressed to see that the authors stress the importance of the end user's input when designing configurations. Administrators who ignore this piece of advice risk the success of Nagios in their organization. Various diagrams help to explain the relationships between the various Nagios configuration objects. A good amount of detail is provided regarding allowing various groups within an organization to have semi-independent control over how Nagios interacts with their hosts and services, and how Nagios alerts their staff. The authors have included numerous configuration file snippets, which allows a Nagios administrator to very quickly create a configuration file and then tweak the configuration parameters to suit local requirements.

Scaling the Nagios graphical user interface (GUI) follows a very simple concept: use a "less is more" approach. Although the specific details here deal with Nagios, the general idea is equally applicable to anyone displaying information they expect their users to actually pay attention to. In general, users should be able to see as much as they want (limited by resources and permissions) but only be shown what they need to know about by default. For example, the system administrator for marketing probably does not need to know when the development disk image server goes down, while the development system administrator would probably be very interested. User accounts let the administrator give various groups access to Nagios, filtered by its fine-grained permissions system. Users from various groups can also be shown only what they need to be shown by default, without the need to select a particular area first. User accounts also keep users who only need to view Nagios from having full administrative control, and allow records of each user's actions to be kept. Using a patch provided with the book's download package will enable Nagios to have read-only accounts as well, which is great for organizations that would like to grant certain users (or groups) access to view Nagios but not make any changes. As an example, an organization's help desk could use Nagios to determine quickly whether users are unable to access services because of an outage, or if further troubleshooting is necessary.

The authors continue on here to discuss clustering, failover, and the future of the Nagios GUI. I'm not convinced that these belong in a chapter devoted to scaling the Nagios GUI, since these seem to mostly deal with scaling the entire Nagios deployment. Regardless, they are all very important topics, especially when Nagios is heavily relied upon. Clustering allows remote sites to have a Nagios instance local to the site monitoring hosts and devices rather than requiring a central Nagios instance to monitor remote hosts and services. Monitoring remote hosts and services from a central instance would not only take much longer because of the WAN links between the central instance and remote locations; allowing those checks to be done also has security implications. The authors don't discuss the security side of clustering, but it's still something that every Nagios administrator (and everyone else!) should keep in mind. The clustering section deals primarily with the rationale behind clustering and how to configure the local and remote instances of Nagios properly, but the authors include a good deal of information here that a less experienced Nagios administrator might overlook. Most notable is their discussion about the display of service status when a service is reachable from the master server but not from a remote instance. While Nagios can translate the remote instance's check result to be displayed from its own perspective, it may be more desirable to have the master Nagios GUI display the results from the perspective of the server which made the check. After implementing clustering, some sort of fallback mechanism is required. Failover and redundancy are the two main choices, and that's what the authors discuss next. They don't spend much time on redundancy, since this would require each redundant Nagios instance to perform its own set of checks, which can significantly raise the load on both the monitored hosts and the network in general. Given the problems it can introduce, the authors have spent more time on redundancy than most administrators should spend considering it. Failover is a much better solution, and the authors do a great job of covering how to set up failover properly. As usual, they make sure to remind readers of some things that are easily overlooked, especially when you're trying to get Nagios back up and running after the master server crashes.

Chapters 4 and 5 discuss Nagios plug-ins, add-ons, and enhancements. These chapters alone are worth the price of the book because of how much time they can save. It's much faster to copy a script and make minor tweaks than it is to try reinventing the wheel, and with the number of scenarios covered here combined with the Nagios user community there aren't very many things that haven't been done already. Whether you want to test command-line interfaces, CPU usage, memory utilization, bandwidth utilization, HTML pages, LDAP services, or even specialized hardware, there's probably already a plug-in written for it. Most common scenarios actually have a plug-in already included in this book. The available add-ons and plug-ins are equally varied, providing ways to monitor hosts across security zones, configure read-only displays that live in a security zone other than the one Nagios is in, interface with Cacti, and even read out alerts. Even more scenarios can be handled by other scripts provided by the Nagios community.

Chapter 6 goes into detail on how to integrate Nagios into an enterprise environment. This chapter goes into just enough detail to get Nagios configured to work with a large number of third-party services, such as LDAP authentication, Cacti, Puppet, and Splunk. Emphasis here is always placed on the human element: how to use Nagios to help help-desk and/or NOC staff do their jobs more efficiently and effectively, and how to gain maximum support for Nagios within the organization. The importance of the human element, in all its forms, simply cannot be overstated, and the authors have done a wonderful job of outlining a good way to make Nagios an integral part of an organization. A lot of the material towards the end of the chapter, especially the section on smaller Network Operation Centres, could be used by anyone looking for ways to help a small group work together effectively.

Chapter 7 is another chapter with a lot of content easily applicable outside of a Nagios environment. The chapter begins with the authors reminding you to know your network and to watch out for session hijack attacks, then showing you how to use Nagios to do both. Nagios can't replace a competent network administrator, but it can make their lives easier, and the authors show you here how the configuration you've already done in Nagios can reveal a potential session hijack attack and how it forces you to properly know your network. Nagios forces you to know your network not only in terms of how it's built and what devices are in use; it also requires that you have a solid handle on what constitutes normal conditions for all your devices and services.

Another area that is very important to companies, especially those operating in the United States, and that Nagios can assist with is regulatory compliance. The authors outline how a company could use Nagios to assist with compliance with Sarbanes-Oxley (SOX) with COBIT or COSO, Payment Card Industry (PCI) Data Security Standard (DSS), Director of Central Intelligence Directive (DCID) 6/3 and Department of Defense (DoD) Information Assurance Certification and Accreditation Process (DIACAP). Nagios alone isn't enough to be compliant; at the very least, detailed documentation will also be required. Still, the authors give a good overview of how Nagios can assist with compliance in all of these regulations.

The final chapter helps to bring the rest of the book together by walking through a full Nagios configuration for a fictional Fortune 500 corporation. The bulk of this chapter covers the pre-deployment stage of a Nagios deployment, but that doesn't mean that there isn't a lot to learn about deploying Nagios. A major hurdle towards deploying Nagios in an organization is the pre-deployment phase, and the authors outline here how to easily turn this major challenge into a series of simple steps to increase the chances of Nagios' success in your organization. From the very beginning, you can see how involving the customer early and starting small, along with everything else, becomes a part of a process. Although it's specific to Nagios, the process followed here could be easily adapted to integrating any sort of monitoring service. The remainder of the chapter is devoted to how you might integrate Nagios into a Fortune 500 company, finishing the book off with some good advice for integrating Nagios.

Despite all the book's strengths, there is some room for improvement. In chapter 2, it may have been more effective to outline the relationships between the Nagios configuration objects before discussing configuration planning. I found it much easier to think of a configuration for a large organization after knowing about how Nagios' configuration objects relate to each other.

Throughout the book, the authors have included configuration file snippets, scripts, and example script output in the main text. While all of these are quite useful and serve to enhance the book, I think it would have been better if these were all included in an appendix instead, perhaps keeping only the relevant parts of configuration snippets in the main text for clarification.

At the end of chapter 3, the sections on the future of Nagios and the CGI front end are informational and interesting, but they would be better placed in a separate chapter dealing with the potential future of Nagios in general. These and the other major areas of Nagios combined would provide more than enough material for a full chapter on their collective futures.

Overall, this is a great book for anyone using Nagios as more than a casual user, and is still very informative for the casual user. A few of these chapters alone would be worth the price of the whole book.

Disclaimer: I worked with one author when I was asked to review this book.

You can purchase Nagios 3: Enterprise Network Monitoring from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

  • Nagios is great (Score:5, Insightful)

    by kimvette ( 919543 ) on Wednesday October 08, 2008 @02:24PM (#25303197) Homepage Journal

    Nagios is great but even version 3 is by no means easy to configure. Like all too many F/OSS projects, the documentation is lacking or even incorrect in spots, and supplied examples barely scratch the surface of what the application can do.

    I've been running it and it's great - I have it monitoring a bunch of servers (email, hosting, backup, file, etc.) with custom scripts and it works great -- once it's configured.

  • Re:Nagios is great (Score:3, Insightful)

    by kimvette ( 919543 ) on Wednesday October 08, 2008 @02:26PM (#25303243) Homepage Journal

    Oops, submitted too early.

    Nagios is especially helpful in a smaller environment where you have limited personnel; as long as nagios is up and running you can have it email, page, or text you so that you know there's an issue without having to have personnel monitoring it all manually - and it provides a decent log via the web interface.

    My main point is this: if this book is as good as the reviewer indicates, it should be very well worth buying if you need a F/OSS server monitoring solution.

  • not good. (Score:3, Insightful)

    by Lord Ender ( 156273 ) on Wednesday October 08, 2008 @02:27PM (#25303247) Homepage

    Is it extensible? Is it easy to use? I didn't get it the first time, better repeat it a few more times...

    My personal experience is that Nagios is probably the LEAST easy to use of any piece of software, period. I hope they changed it in a major way, because last time I tried to use it I was forced to dig through configuration files and learn syntax just to get the thing to see if some server was responding to pings.

  • by hax4bux ( 209237 ) on Wednesday October 08, 2008 @03:14PM (#25304089)

    This book is not a big leap over the supplied Nagios documentation. I bought it out of guilt, but I doubt I have gotten my money's worth. This is not so much a criticism of the book as praise for the supplied documentation (which is rather decent, given the topic).

    Getting Nagios (or OpenView or whatever management system you have) working is a big job which will not be solved w/a $40 book and an afternoon.

    For all of you who complain about Nagios being complicated, I hope you never see OpenView (et al).

    If you haven't seen Nagios, there is a daemon which performs the collection. The UI is browser based (Apache HTTPD CGI applications). Sometimes there are agents on remote machines to collect status like process tables, disk utilization, etc.

    Nagios is essentially a job scheduler/messaging system. Monitoring is performed by invoking little programs dedicated to collecting information, and these are easy enough to create. There are lots of hooks if you need to extend the system.
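
The "little programs" described above can be sketched roughly as follows. This is only an illustration in Python, not code from the book or from the Nagios distribution: it assumes the conventional plugin contract of one status line on standard output plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The disk check itself, the function names, and the 80/90 thresholds are made up for the example.

```python
import shutil
import sys

# Conventional plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a disk-usage percentage to (exit_code, status_line)."""
    if used_percent >= crit:
        code = CRITICAL
    elif used_percent >= warn:
        code = WARNING
    else:
        code = OK
    # Text after the "|" is performance data: label=value;warn;crit
    line = "DISK %s - %.1f%% used | used_pct=%.1f%%;%.0f;%.0f" % (
        LABELS[code], used_percent, used_percent, warn, crit)
    return code, line


def main(path="/"):
    """Check one filesystem, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
        used_percent = 100.0 * usage.used / usage.total
    except OSError as exc:
        print("DISK UNKNOWN - %s" % exc)
        return UNKNOWN
    code, line = evaluate(used_percent)
    print(line)
    return code

# When installed as a plugin, the script would finish with: sys.exit(main())
```

The scheduler only needs the exit code and the first line of output, which is why a check like this can be written in any language.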

    Since the UI is owned by HTTPD, so is access control. Who doesn't know how to set up LDAP or an auth file for Apache? Most of the CGI plugins are implemented in C and are not ugly to look at.

    The agent issue is a little clouded because there are many agents to choose from. I usually just use the Net-SNMP agent because I have a lengthy SNMP background, but that is just my personal choice.

    I will stop here since the article is about a book and not Nagios. I merely wanted to address some of the criticisms of Nagios.

  • Re:not good. (Score:4, Insightful)

    by walt-sjc ( 145127 ) on Wednesday October 08, 2008 @03:14PM (#25304109)

    Oh please. It's NOT THAT HARD!!!! For what it does, it's fairly simple actually. Compared to any other package of similar capability, it's quite average in terms of difficulty actually. No worse than something like Exim or Apache. Just think of each server as a vhost and each service as a location directive.

  • Re:Spam alert! (Score:3, Insightful)

    by IceCreamGuy ( 904648 ) on Wednesday October 08, 2008 @03:30PM (#25304371) Homepage

    ...of an obscure product...

    The only things I can think of that would make someone say something like this are:

    You're not a systems administrator

    You're new to systems administration

    You're a bad systems administrator

    You don't keep up with grade-A open source enterprise solutions

    You work for a company that has a budget big enough that you don't ever consider open source

  • Re:Spam alert! (Score:4, Insightful)

    by SgtAaron ( 181674 ) <aaron@coinet.com> on Wednesday October 08, 2008 @03:40PM (#25304493)

    You sir, have no idea how right you are about Nagios... It spams, a lot. And depending on how well you know what you are doing, it will spam you with anything from a couple of mails per hour to a literal e-mail bomb, so that you can't even open your e-mail client.

    I'm thinking that you may be one of those that need the book. :-) The amount and frequency of alert emails is easily configurable. And I think you need a new mail client! How about trying mutt? :)

    The "notification_interval" can be set to 0 so that nagios will only send one alert, period. Now, if you have a bunch of services/hosts down you will get a lot of messages unless you've taken steps to mitigate that. But isn't that better than *not* knowing your network has run home to momma?
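
A rough way to picture the effect of that setting is the simplified model below. It is an illustrative Python sketch, not Nagios code: it assumes the default time unit of minutes, treats the first alert as minute 0, and ignores notification periods, escalations, and flap detection. The function name is hypothetical.

```python
def notification_times(outage_minutes, notification_interval):
    """Minutes after the first alert at which alerts go out in this model.

    The first alert is always sent.  While the problem lasts, one more
    alert follows every notification_interval minutes; an interval of 0
    disables re-notification entirely.
    """
    if outage_minutes < 0 or notification_interval < 0:
        raise ValueError("durations must be non-negative")
    if notification_interval == 0:
        return [0]
    repeats = range(notification_interval, outage_minutes, notification_interval)
    return [0] + list(repeats)
```

Under these assumptions a two-hour outage with an interval of 30 yields four messages per failing service, while an interval of 0 yields one, so the volume of mail scales with the number of failing services rather than with the length of the outage.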

    We've been using Nagios now for months and it may be the least buggy code running on any of our machines. Rock-solid, I tell you.

    regards,

  • Re:Spam alert! (Score:1, Insightful)

    by Anonymous Coward on Wednesday October 08, 2008 @03:55PM (#25304705)

    Nagios may be many things but ease of use...??? I suspect the author is suffering from crack-induced delusion.

  • Nagios is a mess (Score:3, Insightful)

    by Kent Recal ( 714863 ) on Wednesday October 08, 2008 @05:20PM (#25305733)

    Blech, nagios is probably the most disgusting hack currently in wide use. It was overdue for a complete rewrite after Nagios2 - but nagios hackers don't seem to have any pain threshold. Nowadays it's not even funny anymore. Nagios has gone *way* over its expiration date. The closest analogy would be a pot of milk that has been sitting in direct sunlight for 6 months straight...

    I strongly suggest that anyone looking for a monitoring solution stays away from the dead horse and looks at the modern alternatives first. There are plenty: Munin, Cacti, Zenoss, Pandora, OpenNMS, just to name a few.

    Most importantly: Take your time before you decide and evaluate thoroughly. A monitoring solution will stick with you for a long time and migrating to a different software is usually a very painful process. Which, btw, is the main reason why so many sites still ride the dead horse...

  • WTF? LOL... (Score:3, Insightful)

    by Colin Smith ( 2679 ) on Wednesday October 08, 2008 @05:34PM (#25305903)

    I used nagios for years.. many many years. It has to be, as many have already pointed out.. the most difficult to configure OSS project ever made.


    R$+@$=W $@$1@$H user@thishost -> user@hub
    R$=W!$+ $@$2@$H thishost!user -> user@hub
    R@$=W:$+ $@@$H:$2 @thishost:something
    R$+%$=W $@$>3$1@$2 user%thishost

    Sendmail...

    Nagios is easy, but it only makes sense if you have dozens or hundreds of systems; for fewer, get something simpler. And it will only work if you understand how to group your hosts, services etc.
     

  • Hard to set up? (Score:4, Insightful)

    by isorox ( 205688 ) on Wednesday October 08, 2008 @06:54PM (#25306851) Homepage Journal

    So Nagios is hard to set up? Probably, you can't go from zero to running in 5 minutes. It's a steep learning curve, but if the initial investment of a book (I used building a monitoring environment with nagios) and a few hours is too much, you shouldn't be monitoring things. You won't do it correctly; you may as well throw some cron jobs together.

    The first step in monitoring is working out what you want to monitor. The second step is working out what you really want to monitor. The third step is working out how you want to display problems. When you have 60 people in support working on a 6 shift 24/7 pattern, you can't expect emails to be any use. "Service problems" in nagios is fine, but there's a lot of issues that 2nd line don't need to know about -- solaris security patches on an intranet for example, can wait until the 9-5 admins get in.

    Nagios is painfully easy to administer, if you set it up right. Once you know what you're doing (or even know enough to be dangerous, like myself), you can deploy a new nagios installation in about 20 minutes, add a new device that follows existing rules (new web server for example) in under 5 minutes, and a new device with new plugins in half an hour.

    Nagios then grows organically. When something strange and new breaks we cobble a plugin together.

    Configuration is in plain text files, one for each device on the network. I have these as a subversion working copy, which gives me the ability to track changes and easily roll back any configuration problems.

    We have dozens of weird bespoke plugins: one uses WWW::Mechanize and Perl to run through a workflow on a specific webpage, another looks at the rate of change of growth of a jboss logfile and the proportion of stack traces, and one logs into a remote machine and checks jumbo pings are working through the network.

    We find nagios essential to monitor the service we provide. I don't particularly care if the server an oracle database runs on is pingable, I care if I can log in and run "select 1 from dual" (or usually something more application specific).
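
The idea of checking the service rather than the box can be sketched like this. It is a hedged illustration only: the standard-library sqlite3 module stands in for a real Oracle driver, the query is plain "SELECT 1" because sqlite has no "dual" table, and check_database is a hypothetical helper name. It relies only on the generic Python database API (cursor, execute, fetchone).

```python
import sqlite3

OK, CRITICAL = 0, 2  # conventional plugin exit codes


def check_database(connect, query="SELECT 1"):
    """Log in via connect(), run a trivial query, return (exit_code, message)."""
    try:
        conn = connect()
        try:
            cursor = conn.cursor()
            cursor.execute(query)
            row = cursor.fetchone()
        finally:
            conn.close()
    except Exception as exc:
        # Any failure to connect or to run the query counts as an outage.
        return CRITICAL, "DB CRITICAL - %s" % exc
    if row is None:
        return CRITICAL, "DB CRITICAL - query returned no rows"
    return OK, "DB OK - query returned %r" % (row[0],)


# Example with an in-memory database standing in for the real server.
code, message = check_database(lambda: sqlite3.connect(":memory:"))
print(code, message)
```

Swapping the lambda for a real driver's connect call and the query for something application specific would turn the same shape into the kind of check described above.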

    The small system we monitor is made up of about 800 services over 190 devices.

  • Re:not good. (Score:4, Insightful)

    by isorox ( 205688 ) on Wednesday October 08, 2008 @07:00PM (#25306907) Homepage Journal

    Is it extensible?

    Yes, what can't you monitor with nagios?

    Is it easy to use?

    You should see our 2nd line people, if they can use it, anyone can.
    1) Big red problem appears on page
    2) They click the link to the logging system which does an asset-based search showing recent problems.
    3) They click the link to the wiki page for that host, which hints at how to fix it.
    4) Red thing goes away

    There's a difference between *use* and *configure*. Nagios is the easiest monitoring system we've ever used in our department. It's pretty easy to configure too when you know what you're doing (one config file per device host, one directory per logical division of devices, one perl script to splat out the devices, one subversion repository to version track everything)
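
The layout in that parenthesis (one file per host, one directory per division, one script to generate them) might look roughly like the sketch below. It is in Python rather than the poster's Perl, and everything specific in it is an assumption for illustration: the inventory structure, the function names, the example addresses, and the "generic-host" template name. Only the host object directives (use, host_name, alias, address) are standard.

```python
import os

HOST_TEMPLATE = """define host {{
    use        {template}
    host_name  {name}
    alias      {alias}
    address    {address}
}}
"""


def render_host(name, address, alias=None, template="generic-host"):
    """Render one host definition as configuration text."""
    return HOST_TEMPLATE.format(
        template=template, name=name, alias=alias or name, address=address)


def write_inventory(root, inventory):
    """Write <root>/<division>/<host>.cfg for every host in the inventory.

    inventory maps a division name to a list of (host_name, address) pairs.
    Returns the list of files written, in a stable order.
    """
    written = []
    for division, hosts in sorted(inventory.items()):
        directory = os.path.join(root, division)
        os.makedirs(directory, exist_ok=True)
        for name, address in hosts:
            path = os.path.join(directory, name + ".cfg")
            with open(path, "w") as handle:
                handle.write(render_host(name, address))
            written.append(path)
    return written
```

Pointing a cfg_dir entry in the main configuration file at the root directory and committing that tree to version control would then give the tracked, roll-back-able setup the poster describes.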

    I hope they changed it in a major way, because last time I tried to use it I was forced to dig through configuration files and learn syntax just to get the thing to see if some server was responding to pings.

    So? What use is a monitoring program that tells you that? If you want to do decent monitoring, you want to monitor the systems, not the devices those systems happen to run on.

    It's a steep learning curve, but have you ever configured apache from scratch? Let alone bind or sendmail.

  • Re:Nagios is great (Score:2, Insightful)

    by perldork ( 1381373 ) on Wednesday October 08, 2008 @10:11PM (#25308439)

    Agreed, time and money spent on integrating Nagios into an organization (or any other free OSS product) to me is much better time and money spent than spending money on licenses and paying support people for a commercial product who then not only get your money but also get the benefits of the knowledge learned from the experience instead of your company or group getting that information.

    Even that wouldn't be sooo bad except that many commercial companies don't even share that knowledge in a way that other customers can benefit from unless they pay for consulting time ... most commercial NNM producers have horrid public forums and KBs that really only cover issues related to upgrades and licensing as opposed to lessons learned by other customers.

    This of course only applies to organizations that have development/IT groups that are large enough to support custom integration efforts; I understand that there are many places that can't afford to invest in in-house development or that really do not want to learn how to do systems/application/network monitoring themselves.

  • by BitZtream ( 692029 ) on Thursday October 09, 2008 @01:34AM (#25309765)

    For all of you who complain about Nagios being complicated, I hope you never see OpenView (et al).

    I used to run an OpenView server ... my god, getting that thing to do useful stuff was like getting a cat to listen to your commands, it can be done, but why the hell bother.

    Since that job, I've come to love Nagios (which is still complicated) because it's about a billion times easier to deal with than OpenView. Nagios IS complicated, but its job IS complicated and Nagios does a hell of a job when compared to something like OpenView.

    I've found however the best way to monitor servers is to just put your cell phone number in a nice public place, you get practically instant notification of a problem, sometimes you get notification years before the problem exists! You even get notification of problems completely unrelated to your services/network, heh.
