
Nagios System and Network Monitoring

Posted by samzenpus
from the keep-an-eye-on-things dept.
David Martinjak writes "Nagios is an open source application for monitoring hosts, services, and conditions over a network. Availability of daemons and services can be tested, and specific statistics can be checked by Nagios to provide system and network administrators with vital information to help sustain uptime and prevent outages. Nagios: System and Network Monitoring is for everyone who has a network to run." Read on for the rest of the review.
Nagios: System and Network Monitoring
author Wolfgang Barth
pages 464
publisher No Starch Press
rating 9
reviewer David Martinjak
ISBN 1593270704
summary Covers installing, configuring, and deploying Nagios to monitor systems and services on a network.


The book is authored by Wolfgang Barth and published by No Starch Press. The publisher hosts a Web page which contains an online copy of the table of contents, portions of reviews, links to purchase the electronic and print versions of the book, and a sample chapter ("Chapter 7: Testing Local Resources") in PDF format.

An amusing note to begin: this is one of the few books I have read where the introduction was actually worth reading closely. Many books seem to talk about background or history of the subject without providing much pertinent information, if any at all. In Nagios: System and Network Monitoring, Wolfgang Barth begins with a hypothetical anecdote to illustrate the usefulness of Nagios. The most important section in the introduction, however, is the explanation of states in Nagios. While monitoring a resource, Nagios will return one of four states. OK indicates nominal status, WARNING shows a potentially problematic circumstance, CRITICAL signifies an emergency situation, and UNKNOWN usually means there is an operating error with Nagios or the corresponding plugin. The definitions for each of these states are determined by the person or team who administers Nagios so that relevant thresholds can be set for the WARNING and CRITICAL status levels.
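
As a rough illustration of the four states: a plugin reports its result to Nagios through its exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The Python sketch below shows how administrator-chosen thresholds could turn a measurement into one of those states. The function name and the 80/90 thresholds are assumptions made for this example, not something taken from the book.

```python
from enum import IntEnum
from typing import Optional


class State(IntEnum):
    """Plugin exit codes as Nagios interprets them."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def classify(value: Optional[float], warn: float, crit: float) -> State:
    """Map a measurement (higher is worse) to a state.

    `warn` and `crit` are thresholds chosen by the administrator;
    a missing measurement is reported as UNKNOWN.
    """
    if value is None:
        return State.UNKNOWN
    if value >= crit:
        return State.CRITICAL
    if value >= warn:
        return State.WARNING
    return State.OK


if __name__ == "__main__":
    # Hypothetical thresholds: warn at 80% usage, critical at 90%.
    for used in (42.0, 85.0, 97.0, None):
        print(used, classify(used, warn=80, crit=90).name)
```

A real plugin would also print a one-line status message and exit with the numeric value of the state.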

The first chapter walks the reader through installing Nagios to the filesystem. All steps are shown, which proves to be very helpful if you are unfamiliar with unpacking archives or compiling from source. Users who are either new to Linux, or cannot install Nagios through a package manager, will appreciate the verbosity offered here. Fortunately, the level of detail is consistent through the book.

Chapter 2 explains the configuration structure of Nagios to the reader. This chapter may contain the most important material in the book as understanding the layout of Nagios is essential to a successful deployment in any environment. The book moves right into enumerating the uses and purposes of the config files, objects, groupings, and templates. All of this information is valuable and presented in a descriptive manner to help the reader set up a properly configured installation of Nagios. My biggest stumbling block in using Nagios was wrapping my brain around the relationships of the config files and objects. This chapter clears up all of the ambiguities I remember having to work out for myself. If only this book had been around a few years ago!
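
To make the object-and-template relationship a little more concrete, here is a small Python sketch that renders a host definition and its services in Nagios's `define ... { }` object syntax. It assumes that templates named `generic-host` and `generic-service`, and the check commands passed in, are already defined elsewhere in the configuration; those names, the helper functions, and the example address are illustrative assumptions.

```python
def render_object(kind: str, directives: dict) -> str:
    """Render one Nagios object definition block."""
    width = max(len(key) for key in directives)
    lines = [f"define {kind} {{"]
    for key, value in directives.items():
        lines.append(f"    {key.ljust(width)}  {value}")
    lines.append("}")
    return "\n".join(lines)


def host_config(name: str, address: str, checks: dict) -> str:
    """Build a host definition plus one service per entry in `checks`.

    `checks` maps a service description to a check_command string.
    The `use` directive inherits settings from a template object
    (assumed here to be called generic-host / generic-service).
    """
    blocks = [render_object("host", {
        "use": "generic-host",
        "host_name": name,
        "address": address,
    })]
    for description, command in checks.items():
        blocks.append(render_object("service", {
            "use": "generic-service",
            "host_name": name,
            "service_description": description,
            "check_command": command,
        }))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical host; arguments after "!" are passed to the command.
    print(host_config("ldap1", "192.0.2.10", {
        "PING": "check_ping!100.0,20%!500.0,60%",
        "SSH": "check_ssh",
    }))
```

Keeping one generated file per host is only one possible layout; the point is that each object names its template with `use` and overrides only what differs.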

The sixth chapter dives into the details of plugins that are available for monitoring network services. This chapter explains using the check_icmp plugin to ping both a host and a specific service for verifying reachability. Additional examples include monitoring mail servers, LDAP, web servers, and DNS among others. There is even a section for testing TCP and UDP ports.
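
The idea behind a TCP port test can be sketched in a few lines of Python. This is a stand-in for what such a check does, not the code of any actual Nagios plugin: it tries to open a connection and reports OK or CRITICAL. The function name, message format, and default timeout are assumptions made for the example.

```python
import socket
from typing import Tuple

OK, CRITICAL = 0, 2  # plugin exit codes for the two outcomes used here


def check_tcp(host: str, port: int, timeout: float = 5.0) -> Tuple[int, str]:
    """Return (exit_code, message) depending on whether the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} accepts connections"
    except OSError as exc:
        # Covers refused connections, timeouts, and name resolution errors.
        return CRITICAL, f"TCP CRITICAL - {host}:{port}: {exc}"


if __name__ == "__main__":
    import sys
    code, message = check_tcp(sys.argv[1], int(sys.argv[2]))
    print(message)
    sys.exit(code)
```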

Next, the book covers checking the status of local resources on systems. At work, we have a system in production that could have been partitioned better. Unfortunately, /var is a bit smaller than it should be, and tends to fill up relatively frequently. Thankfully, Nagios can trigger a warning when there is a low amount of free space left on the partition. From there, we have Nagios execute a script that cleans out certain items in /var so we don't have to bother with it. We can also receive notification if the situation does not improve, and requires further attention. In addition to monitoring hard drive usage, the book includes examples for checking swap utilization, system load, number of logged-in users, and even Nagios itself.

Chapter 12 discusses the notification system in Nagios. You provide who, what, when, where, and how in the configs, and Nagios does the rest. The book does a fantastic job of explaining what exactly triggers a notification, and how to efficiently configure Nagios to ensure the proper parties are being informed of relevant issues at reasonable intervals. For example, the server team might be interested to know that /var is 90% full on one of the LDAP servers; however, they don't need to be notified of this every thirty seconds. This chapter also covers an important aspect of Nagios known as flapping. Flapping occurs when a monitored resource quickly alternates between states. Nagios can be configured for a certain tolerance against rapid alternating changes in states. This means Nagios won't sound the alarm if the problem will resolve itself in a short period of time. Usually flapping is caused by an external factor temporarily influencing the results of the test from Nagios, and therefore has no long-term impact.
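
The tolerance for flapping can be pictured as a percentage of state changes compared against a low and a high threshold. The Python sketch below is a simplified model under that assumption: it counts every change equally, whereas Nagios's own calculation also weights recent changes more heavily. The threshold values and function names are chosen for the example only.

```python
def percent_state_change(history):
    """Share of consecutive check results that differ, as a percentage."""
    if len(history) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return 100.0 * changes / (len(history) - 1)


def update_flapping(is_flapping, history, low=20.0, high=30.0):
    """Hysteresis: start flapping above `high`, stop only below `low`."""
    change = percent_state_change(history)
    if not is_flapping and change > high:
        return True
    if is_flapping and change < low:
        return False
    return is_flapping


if __name__ == "__main__":
    steady = ["OK"] * 21
    noisy = ["OK", "CRITICAL"] * 10 + ["OK"]
    print(percent_state_change(steady), update_flapping(False, steady))
    print(percent_state_change(noisy), update_flapping(False, noisy))
```

Using two thresholds rather than one keeps the flapping flag itself from toggling when the percentage hovers near a single cutoff.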

The last major chapter to mention here deals with essentially anything and everything about the Nagios Web interface. The main point of interaction between the administrator and Nagios is the fully featured Web interface. This chapter covers recognizing and working on problems, planning downtimes, making configuration changes, and more. I especially like that the book gives an overview of each of the individual CGI programs that the Web interface is composed of, as these files are important for UI customization.

The only aspect of this book that I did not care for was that the book reads like a reference manual at times. The first several chapters start out more conversational in tone with great explanations of the procedures and files, but later it sometimes feels like I am repeatedly reading an iterated piece-by-piece structure, filled in with the content for that chapter. That is not necessarily bad altogether, as it does provide consistency in the presentation of the information. Additionally, the level of detail is outstanding throughout the book. The explanations are never too short or too long. This is definitely a valuable book for administrators at all levels with fantastic breadth and depth of material. Administrators who are interested in proactive management of their systems and networks should be pleased with Nagios: System and Network Monitoring.

Nagios is licensed under the GNU General Public License Version 2, and can be downloaded from http://nagios.org.

David Martinjak is a programmer, GNU/Linux addict, and the director of 2600 in Cincinnati, Ohio. He can be reached at david.martinjak@gmail.com.


You can purchase Nagios: System and Network Monitoring from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
  • This chapter also covers an important aspect of Nagios known as flapping. Flapping occurs when a monitored resource quickly alternates between states. Nagios can be configured for a certain tolerance against rapid alternating changes in states.
    But I can't find out how to set the alarm when the boss flaps.
  • by Anonymous Coward on Wednesday April 11, 2007 @03:50PM (#18693893)
    Please forgive my anonymous coward use: my comments would reveal my name too well.

    I'm an *OLD* Netsaint and Nagios user, and have contributed to both. Guides are great, playing with it is great, and it does a lot of things very well. But what Nagios has never had is a way to publish the URL's of specific queries or reports in a way that can be bookmarked and sent to someone else for reference. It's a big, big, big flaw in the system, common to a lot of web-based projects.

    The other huge, huge flaw of Nagios is configuring it. It shouldn't take a reference book from O'Reilly to do this efficiently, but I'm afraid it does. There are easily a dozen different configuration tools at www.nagiosexchange.org and sourceforge.net, and *every single one of them* has major problems that could be solved with 10% of the time spent on Nagios itself. Most are abandonware, exciting but uncompleted projects that are never going to be completed. Others rely on hand-compiling Nagios itself with strange local modifications and local configurations that are very difficult to import a working Nagios to, or export from. Others have absolutely *no* security model, incapable of securing access to them or relying on locally stored plain-text password setups: others rely on non-privileged accounts to edit the Nagios configurations, including the password files for databases or proxy services, in semi-public repositories. Others rely on installing every file in a browseable web directory, permitting unauthorized local users to poke at their guts and exploit the security flaws. (Yes, you perl idiots who execute random file and directory creation without checking if it's empty first or protecting it from being written into by other people before you copy its contents, I mean you!)

    Other configuration tools have beautiful "artist conception" interfaces that will make your eyes bleed after 20 minutes working with them. Every last one of them listed at Sourceforge and NagiosExchange suffers from one or many more of the major open source GUI flaws Eric Raymond ranted about in his CUPS horror story, years ago.

    It's unfortunately so bad that I've had to throw away weeks of work and switch to Altiris on a major project, which is fairly painful to switch to but at *LEAST* has a usable interface.
    • by rprague (653431)
      Being a long time nagios and netsaint user and contributor to the community, I have to say your comments are 100% dead on.

      The configuration of nagios is confusing even for a seasoned user, the security models are nonexistent and adding even simple graphing and historical data to nagios requires another entire level of ridiculous configurations.

      Nagios was a fantastic tool, in 2001. However, it is basically the exact same tool today that it was in 2001 and there are far better tools available now that do the
    • Re: (Score:3, Informative)

      by walt-sjc (145127)
      I've been using nagios for nearly 2 years too, to monitor about 80 servers. Also running the NRPE plugins to monitor things like disk space, load, and a number of other aspects.

      I agree that the configuration is pretty bad, and your other points on the interface. Dependencies are a nightmare to configure.

      That said, it does work, and requires very little maintenance once it's set up. It helps to use one file per server too, since you can include entire directories that contain configuration files. What I did w
      • Re: (Score:3, Interesting)

        by schlick (73861)
        Have you looked at Hyperic? http://www.hyperic.com/ [hyperic.com] I'm using the open source version and I like it a lot.
      • I guess that's the main problem. I shouldn't need to write complex config files for it to work. I would even PAY to have someone write a web interface similar to Keynote's Red Alert. I simply want to specify the host to monitor, the services to monitor, who to page when something breaks, and be done with it.
        • Guys, it really doesn't need to be that hard. The primary nagios.cfg allows you to arrange the configuration any way you choose. Most people use large, object-specific config files, but there's no reason you couldn't break your configs down into host-based entities which contain a host, the services on that host, and the contacts/contactgroups for that host. A really lightweight shell wrapper could be used to cat together pre-configured "templates" into a hostX.cfg file. It could ask you questions like
    • by m0i (192134)
      It's unfortunately so bad that I've had to throw away weeks of work and switch to Altiris on a major project, which is fairly painful to switch to but at *LEAST* has a usable interface.
      Altiris, just bought by Symantec... expect the best, prepare for the worst.
    • by jimicus (737525)
      Forgiven.

      I use Nagios myself as I was looking for a quick and dirty replacement for Big Brother.

      While it's a fantastic tool, my biggest beef with it by a LONG way is that configuring a new server to monitor is always a case of "hand-edit this config file and that, figure out what's important to monitor and what isn't, realise 3 months later that you missed out something you really should be monitoring...." aarrgh. Templates help hugely but they're only part of the solution.

      If you're going to make monitori
      • While it's a fantastic tool, my biggest beef with it by a LONG way is that configuring a new server to monitor is always a case of "hand-edit this config file and that,

        What is the deal with people being afraid of config files?

        figure out what's important to monitor and what isn't

        What? You're complaining because Nagios makes you figure out what is important to monitor?

        realise 3 months later that you missed out something you really should be monitoring

        So...your lack of planning is somehow nagi

        • by jimicus (737525)
          Nagios is just not that hard to configure. If you had a web front end, then every time a change to the capabilities was made you'd have to go back through and update the front end to make it support it, which means more time between releases, longer turnarounds for new features, and likely less flexibility in the system in general.

          You'd better tell the developers who are sat next to me that one. They think they're using a toolkit which practically gives them a web-based interface for free when they develop
          • You'd better tell the developers who are sat next to me that one. They think they're using a toolkit which practically gives them a web-based interface for free when they develop the command-line interface.

            Er - that has absolutely nothing to do with what I just said. Command line interface != Config File.

            You know, rather than just tell me that I'm asking for the moon, you could try ZenOSS. You get a heck of a lot of flexibility and power with substantially less complexity.

            I've looked at ZenOSS. I'm sure that with your (apparently) limited experience and use of network monitoring that it seems like a lot of flexibility and power. That's not meant to be insulting, just an observation.

            There seems to be a certain idea in the Linux community that just because "it's a community-developed Unix" it has to be bloody awkward to get basic things to work.

            Well, again, I deny that it's "bloody awkward" to get basic things to work in Nagios. I can have nagios up and running for basic needs in minutes.

            I think the real problem here is this attit

    • Very good points. Looks like this article agrees with you - http://searchenterpriselinux.techtarget.com/originalContent/0,289142,sid39_gci1250897,00.html [techtarget.com]
    • The other huge, huge flaw of Nagios is configuring it. It shouldn't take a reference book from O'Reilly to do this efficiently, but I'm afraid it does. There are easily a dozen different configuration tools at www.nagiosexchange.org and sourceforge.net...
      I think the other huge flaw is that 50% of the users visiting www.nagiosexchange.org are probably looking for different "configuration tools" in the first place.
    • Re: (Score:2, Informative)

      by Anonymous Coward
      I've got to agree. We use it at an ISP level to monitor various functions, both leased line and server functions based on customised scripts, easily several thousand devices are being monitored primarily through Nagios. The theory being we can contact customers pro-actively when they experience connectivity issues, as a free function of business. As a natural side effect of having acquired other ISPs over the years our monitoring system is multi-faceted depending on each ISPs platform quirks. Great for
    • by calethix (537786)
      You might want to check out Groundwork OpenSource [groundworkopensource.com].

      It's built on Nagios and several other projects. So basically Nagios with a really nice gui front-end to get things setup. I've been messing around with the free version to evaluate it as a replacement to big brother.

      It took me a little while to get all the connections straight in my head but would probably be more intuitive to someone with more experience in the area.
    • Agreed; basically Nagios is a mess, but it's pretty much the standard unfortunately, as it kinda-sorta gets the job done.

      My main problem with the current crop of monitoring tools is that they are all either about availability (Nagios, et al) or performance (MRTG, Cacti). Currently I'm using Nagios+Cacti, which kinda-sorta works for me, but it would be nice to have a single coherent interface to my systems. Zenoss [zenoss.com] also looks interesting, although I haven't tried it yet, but I'd like to hear of any other poss
      • Re: (Score:2, Interesting)

        by thurgoodj187 (905656)
        Zenoss has a Virtual appliance out on the VMWare site, makes it real easy to test and evaluate! I've got it running (Whenever I've got my laptop up)
    • I am always amazed how people can complain about free stuff. Nagios is Open Source, so it's free on one side, and maybe clumsy on the other, but at least you still have the choice of: 1) Using a 3rd party to install it for you if you're not smart enough 2) Tuning the code if you don't like it and sharing that with others if you're smart enough 3) Giving a try (and tens of thousands of $) to CA-Unicenter, HPOV and the like. Having checked out 3) already, I would rather hire a 100% monitoring guy twinkling
    • I'm an *OLD* Netsaint and Nagios user, and have contributed to both. Guides are great, playing with it is great, and it does a lot of things very well. But what Nagios has never had is a way to publish the URL's of specific queries or reports in a way that can be bookmarked and sent to someone else for reference. It's a big, big, big flaw in the system, common to a lot of web-based projects.

      I'm also an "OLD" Nagios user, as well as the author of the Addison-Wesley Nagios book, so I might be biased, but I think you're kind of missing the point. Nagios is just a task efficient scheduling and notification engine. Its job is to schedule the execution of monitoring plugins, interpret their output, and take user-defined actions based on that output. Its flexibility derives from this core minimalist approach. The plugins and web front end are separate entities, and shouldn't really be consid

      • Dave has it right. Use management software packages for what they were designed to do. Nagios is designed to collect data on the availability and performance of servers, apps, and network devices, and to bug you when they have a problem. So far, the IT management tool industry doesn't seem mature enough to wrap all the possible functions we as sysadmins (or our management) need and want into a coherent package, open source or proprietary. One approach to getting around this is that of GroundWork Open Source
  • Others (Score:4, Informative)

    by Colin Smith (2679) on Wednesday April 11, 2007 @03:51PM (#18693903)
    zabbix
    jffnms
    opennms

    etc.

    I found nagios rather clunky compared to some of the others.

     
    • by bomonguny (564572)
      What others, Inquiring minds want to know
      • Re: (Score:3, Interesting)

        by phish (46788)
        Try Hyperic: http://www.hyperic.com/ [hyperic.com]

        GPL, 30-minute or less setup time, auto discovery and built in support for monitoring, controlling, and log tracking for anything you can think of. 9 OS's, 42 apps, network devices, extensible plugins....

        Nagios is great, but I agree with the parent that the time it takes to set up and maintain is unreasonable. Oh, and yes, I'm biased. I work for Hyperic.

        -javier
      • Re: (Score:3, Interesting)

        by halfloaded (932071)
        Zenoss
        Cacti
        BixData
        MRTG
        etc, etc, etc...


        This site [stanford.edu] has the biggest database of NMS's around.
        • by BagOBones (574735)
          I think Cacti is the only one listed that has a reasonable install learning curve. Everything else requires deep investment in setup.

          For a smaller operation or smaller feature set needs I really like Cacti.
    • The Nagios codebase is considerably older. It was written before mod_perl and PHP were in broad use, when binaries in a webpage meant using cgi-bin.

      There are plenty more monitoring tools. Bigbrother and Bigsister come to mind, although Bigbrother was ruined when it went commercial. And despite the claims of the anonymous coward above, there are some workable GUI's, although I admit that they do need work to make commercial or production grade.

      A good Nagios book would certainly be welcome on my bookshelf: it
      • I'm fairly geeky, though by no means a programmer, I can get most FOSS programs compiled and working, but holy crap Nagios is a PITA! BigBrother is still pretty good, though somewhat lacking in many places. I'm switching to Zenoss here, it looks to be quite good, though I'll miss the main status screen of Big Brother.
        • You got Zenoss working? Under which OS distribution? And did you publish notes, or use an installation guide?
          • by Bimo_Dude (178966)
            I just finished getting Zenoss working on a test box two days ago (CentOS 4.3 VM -- not the Zenoss VM image, though). I'm currently testing it to monitor some Windows servers, as that is what we mostly have.

            There were a few things missing from the manual installation docs. Here are the steps I used to get it up and running:

            1. rpm -Uvh perl-Socket6-0.19-1.2.el4.rf.i386.rpm
            2. rpm -Uvh perl-Crypt-DES-2.05-3.2.el4.rf.i386.rpm
            3. rpm -Uvh perl-Net-SNMP-5.2.0-1.2.el4.rf.noarch.rpm
            4. rpm -Uvh MySQL-client-standard-5.0.24a-0.r
              • Install on a Windows host, version 2.4 of Python
              • Install on same Windows host, pywin32
              • Install Zenwin

              If you control all the servers you might be able to get away with that. Try working in a company where there are so many servers responsibilities are distributed between groups, then convince them they need to install python/pywin32 so you can monitor their systems. Good luck with that.

              • by Bimo_Dude (178966)

                If you control all the servers you might be able to get away with that. Try working in a company where there are so many servers responsibilities are distributed between groups, then convince them they need to install python/pywin32 so you can monitor their systems. Good luck with that.

                Fortunately, I am now in a smaller environment (~ 400 servers) where I can do just that if I need to. I do understand what you mean, though; where I last worked (> 5000 servers, several admin groups), I never would have be

                • I was simply responding to somebody's question about if/how they got Zenoss up and working, from a technical perspective, not a political one.

                  I understand, and that's how I read you. However, for the benefit of the larger discussion going on it was a good opportunity to point out the political reality many (most?) admins have to deal with that so often gets neglected in the slick presentations by these companies.

            • OK, I'm afraid this is typical of a lot of unfinished open-source tools.

              * I see no PGP or GPG signatures on the Zenoss RPM's. This is always bad, especially for software doing core infrastructure tasks like system monitoring.

              * Install an RPM for MySQL that conflicts with the built-in version of every deployed OS known to Linuxkind. That's understandable, but it means you've left out a critical step: start with a clean box with no MySQL installed on it, because they can't be parallel installed and it *will* modi
              • by Bimo_Dude (178966)

                Install an RPM for MySQL that conflicts with the built-in version of every deployed OS known to Linuxkind. That's understandable, but it means you've left out a critical step: start with a clean box with no MySQL installed on it, because they can't be parallel installed and it *will* modify if not break your existing MySQL installation. And step away from ever being able to get security or bugfixes from the OS vendor by doing so.

                D'oh! I did forget to put that step in. I did have to start with a clean box w

      • I couldn't agree more - Nagios has stood the test of time. The fact that No Starch Press is publishing a Nagios book is a good indication of just how widely used it is. I agree the web interface is a bit dated, but the real power of Nagios isn't in the front end - it's the robust notification and complete extensibility that keep me using it. I monitor just over 500 servers representing about 5000 checks (around 10 per server). There has not been an issue yet that I couldn't get Nagios to somehow monitor. Th
      • Anyone who was running the (almost) open source Bigbrother would be better off moving to hobbit http://hobbitmon.sourceforge.net/ [sourceforge.net]
        It monitors everything I want in Linux and Windows systems and can support SNMP
      • The Nagios codebase is considerably older. It was written before mod_perl and PHP were in broad use, when binaries in a webpage meant using cgi-bin.

        And you think cgi-bin binaries are a thing of the past? I can assure you they are not. And new != better. The code in nagios is very good, and there's no reason to abandon it for a rewrite in PHP just because it's the latest fad. Nothing against PHP, but there's no compelling argument for moving to that just because. (Not that you were necessarily advocating that, but it is a common argument)

        • Oh, I'm not advocating the discarding of cgi-bin at all! It's just that compared to some of the modern, prettier ways of doing things, it certainly does *look* clunky.
    • Much easier to set up and get running - http://www.hyperic.com/ [hyperic.com] Not to mention supports more platforms than all of the others.
      • MyNewPlace.com just replaced a two year, painful investment in Nagios. Turns out SNMP isn't the best way to manage *everything*. Latency caused huge alert storms. Anyway, they looked at HP and after recovering from the sticker shock and realizing that it was going to take an army of consultants to build workarounds to functionality that wasn't there, they landed on Hyperic. Took 1.5 hours to convince them. Nagios network monitoring felled by SNMP false alarms [techtarget.com]
        • Not, I'm sure, that you're intentionally trying to be misleading, but from the linked article, and emphasis mine:

          "Nagios was not really the problem," Shin said. "It was the JVM stack not being able to respond to it correctly. It was recording events in SNMP that were then watched by Nagios and that made things crawl. There were a lot of man hours wasted, and it would trigger the 4 a.m. pages."

          In other words, tool, job, GIGO.

        • Hahaha. As the other fellow said, this is hardly a "painful investment in nagios". And just FYI, Nagios doesn't monitor with SNMP, it provides a framework for monitoring and this company built on that - poorly. They didn't get the results they wanted. Again, not Nagios' fault. What they really needed was to make a better interface for monitoring their App than SNMP. Instead, they switched to Hyperic and it worked for them.

          This article is a prime example of the absurd war against Nagios currently bei

    • Re: (Score:2, Interesting)

      by aclark4life (639571)
      There's also ZENOSS (http://www.zenoss.com/), I didn't see anyone else mention so I thought I would. Haven't tried it yet but I like that it's Zope based (because I am a Zope consultant).
  • I thoroughly enjoy the event handler capabilities built into Nagios. Just that single feature has made my day to day administrative tasks easier, and well worth the hours to write the scripts and get it all configured properly.
    For example, it's so nice to have the spooler service on a win32 box restart automatically if it has locked or died unexpectedly, and not have to wait for the calls to come in when users can't print.
    • by dmihalko (966391)
      oh so very true... through the use of openssh for win32 and public key authentication... one can accomplish all kinds of useful scripts to attempt to automate the recovery of downed services on any windows/*nix server. i love it.
  • At the risk of getting off-topic, I'm tired of stuff that doesn't quite work. (can't comment on the actual book because I haven't read it) However, I can't see how Nagios can even begin to satisfy the needs of most modern IT operations folks. These days, most people need to know a lot more than whether machine X is up. They need to know which part(s) of their web apps are not functioning correctly. They need a lot more intricate detail than is possible with Nagios or SNMP-based monitoring tools. Really, th
    • by osbjmg (663744)
      I love when people say "these days". So back in 1995 things were different? When did they change? Why the time reference?
      • by bartwol (117819)
        You know...these days, when people really understand computers and apply system/application monitoring paradigms that didn't exist in the pre-iPod era.

        You know...don't you?

        (Nagios does a great job for me doing the stuff the parent poster talks about; he's as transparently shallow as you suggest.)
        • by osbjmg (663744)
          I would argue that computer proliferation to the masses means that your average computer user knows LESS "these days" than the average computer user a few years ago. Laptops outsell desktops in the consumer market and the few make it easier for the many. The more people on the network, the easier it gets to find help and guides to do just about anything. More work is done for you and you don't have to be on your own. Think about maybe your grandmother using the emailz on the interweb, it's possible now
    • by Cylix (55374)
      Depending on the web app...

      Nagios functionality can be easily extended with a custom check script that would interact with all or some of an applications web app functions of that host.

      It would be a matter of parsing the return material and simply passing a check var.

      Yeah, not extremely involved, but I merely posed it as a 'well, yeah, it kinda can' idea. Some of the other features I noticed (i.e., monitoring, get/post, bytes) could be implemented as well with some minor reporting.

      With that said, all of th
    • Since Nagios can execute arbitrary scripts, couldn't you rig up a Perl script using Test::Harness and WWW::Mechanize to parse the web app and catch the return codes off that script?
    • I'm not sure what the deal is, but lately I've noticed there seems to be almost a hatred of Nagios coming out of the Hyperic people. I think it's probably fear...

      Anyway, you said... At the risk of getting off-topic, I'm tired of stuff that doesn't quite work. (can't comment on the actual book because I haven't read it) However, I can't see how Nagios can even begin to satisfy the needs of most modern IT operations folks.

      Well, maybe you need to spend some time as an actual modern IT Operations "folk".

      • Of course Nagios is flexible. It's the time to set up and maintain it that costs you.

        And as far as "hatred of Nagios" goes, I've witnessed that firsthand when I've run BoFs on Nagios, and I've run a few, at LISA and LinuxWorld.

        But I love your snarky comments. They r0x0r :)

        Oh, and I almost hate to ask, but can you install RPMs on Windows? (har har)

        -John Mark
        • Of course Nagios is flexible. It's the time to set up and maintain it that costs you.

          Ah yes. The old "if it's complex, then it's a waste of time" canard. Interestingly, the last company I heard push that line hard was Microsoft against Linux. SFDD: same FUD, different day.

          And as far as "hatred of Nagios" goes, I've witnessed that firsthand when I've run BoFs on Nagios, and I've run a few, at LISA and LinuxWorld.

          Yup. I've witnessed it too. Much the same as I see it here. Doesn't make it rational. Like I said, there's a general trend that says if I can't push a button and have it be done it's too hard. Again, I reject that as absurd and flawed on its face.

          When I go looking for a *nix systems admin these days, I go throug

        • by Emrys (7536)
          If you're going to respond in this thread, I for one would really like to see you address the questions about the technical depth of Hyperic. What we are tired of is the "use Hyperic because Nagios is hard" ad hominems. There's no meat behind them. Even if we grant Nagios is hard (which IMO is baseless, but whatever), if Hyperic can't do what we need, and Nagios can, who cares? There is such a thing as necessary complexity.

          So look at http://slashdot.org/comments.pl?sid=230333&cid=18707053 [slashdot.org] and http://sl [slashdot.org]
    • by vidarh (309115)
      These days, most people need to know a lot more than whether machine X is up. They need to know which part(s) of their web apps are not functioning correctly. They need a lot more intricate detail than is possible with Nagios or SNMP-based monitoring tools.

      Seriously, if you think that, you have no clue what Nagios does/can do. The monitoring in Nagios is 100% based around probes that are not built into Nagios, though a typical Nagios install comes with a huge number of standard probes. The only hard req

      • Many people here have contributed to various Nagios plugins. Pretending that the plugins are separate from your sandwich is like pretending that the bread is not part of a hamburger, or like kernels without modules. A few people use them that way, but they're quite rare.
  • Groundwork is a great unification of Nagios and other tools that provides the configuration interface Nagios lacks.

    http://www.groundworkopensource.com/products/os-overview.html [groundworkopensource.com]

    There's a VMware appliance available if you want to take it for a quick spin around the block.

    http://sourceforge.net/project/showfiles.php?group_id=160654&package_id=222764 [sourceforge.net]
  • I'm surprised people still use these 'svn co && ./configure && make install && edit config files' systems. You can download Hyperic HQ, install it, and be monitoring your software and hardware in 30 minutes -- no joke. Want alerts when your disks are full? Cake. Want to autodiscover your Apache server? Cake. Want an alert when a process goes haywire? Cake.

    And since it has a pluggable framework, you can monitor anything that you want -- network devices, software, hardware,

    • Re: (Score:1, Troll)

      by Mark Bainter (2222)
      You have to be kidding. Objectively better? Perhaps you'd care to quantify that?

      Maybe you can start with the fact that it runs in Java. Including the agent. Nagios is light years ahead of Hyperic, but this one fact alone is enough to disqualify Hyperic from ever showing up in my production environment. In fact, I might make this a new interview question for disqualifying candidates. "Would you run Hyperic as a monitoring system?"

      Anything other than "Hell no!" and the interview is over.
      • Re: (Score:2, Insightful)

        by Jick (29139)
        Why is Java bad? This isn't 1996 anymore. Have you ever run HQ? It would be a shame to throw someone out of your interview over that! ;-) If you want to argue about objective features, then point them out.

        Look at the installation procedure: Nagios documentation starts out with telling you that you'll need root access, a compiler, libGD, etc. Hyperic HQ comes with an installer that does all the work for you.

        Where do the 'light years' come into play? Feature for Feature, Nagios and HQ have a lot of the s
        • Why is Java bad? This isn't 1996 anymore. Have you ever run HQ? It would be a shame to throw someone out of your interview over that! ;-) If you want to argue about objective features, then point them out.

          Java is bad because it's a huge runtime environment for something as simple as an agent. Linux could probably handle it, but *why*? On Windows I would never dream of installing Java + anything else and still expect it to perform, any more than I would any other two apps on the same server. You're just

          • by vidarh (309115)
            If I'm going to install an agent, I want it to be small, non-intrusive, have few or no dependencies and be reliable. I don't ascribe any of those things to a Java-based agent.

            I'd like to second that... We have Nagios probes written in C, Perl and Ruby so far. Nagios is ugly, but it works, and the fact that the only real requirement for a probe is that it does something and spits out a string that starts with OK/WARNING/CRITICAL to standard out is one of the important features. Setting up monitoring
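
To make that probe contract concrete, here is a minimal sketch under stated assumptions: a disk-usage probe whose output line starts with the state name, as described above. The function names `classify` and `disk_probe` and the 80%/90% thresholds are hypothetical choices for illustration. Beyond the leading status word, Nagios plugins conventionally also report state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), which is what the returned integer stands for here.

```python
"""Hypothetical disk-usage probe following the Nagios plugin convention."""
import shutil

# Conventional Nagios plugin exit codes, usable as indexes into LABELS.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = ("OK", "WARNING", "CRITICAL", "UNKNOWN")


def classify(value, warn, crit):
    """Return the state for a metric where a higher value is worse."""
    if warn > crit:
        # Thresholds that contradict each other cannot be evaluated.
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def disk_probe(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return UNKNOWN, f"UNKNOWN - {path} reports zero capacity"
    percent = 100.0 * usage.used / usage.total
    state = classify(percent, warn, crit)
    return state, f"{LABELS[state]} - {path} is {percent:.1f}% full"
```

A wrapper script would print the status line to standard out and pass the integer to `sys.exit`. The threshold logic lives in `classify` so the same function could back probes for other "higher is worse" metrics.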

      • by Guider (643837)

        We've been running Hyperic (both free and enterprise versions) for quite a few months now, both in-house and at client sites all across the US. We monitor everything from a single, stand-alone Apache server on Linux, to a multi-site network running custom apps/Tomcat/Apache/Oracle/MySQL on Linux/HP-UX/Windows, multiple firewalls, routers and switches.

        We've used Nagios. We've used Zabbix. We've used OpenView. We've used Cacti (different class, I know). We've tried countless other monitoring tools/solutions.

        • I'm glad you like it, and that it works for you. You presented actual positive feedback about a product that did what you needed, which I think is valid. What isn't valid is going onto a web forum and trying to make your product look better by denigrating a product you don't really understand (as the gentleman from Hyperic did). (And yes, I feel perfectly safe saying that. I've been using Nagios since the NetSaint days, and it has never failed me. It works like a charm.) I've tried most other monitori
        • by vidarh (309115)
          Features, to me, are meaningless if it takes a PhD to build/configure/maintain them.

          Seriously, if you think it's that hard to build something like Nagios you should not be allowed anywhere near any production servers.

          • by Guider (643837)

            Let's not take things to extremes, and don't take my comment out of context, as you both have.

            Nagios is complicated compared to many other products. The simple fact that some rather large books are available points to that fact. But as others have pointed out, and as Hyperic shows, it doesn't have to be that way. If you have two tools that have the same features, but one takes a month to install and the other a week, which do you choose? I don't shy away from a process simply because of complexity, but ne

            • First, I don't think I took your comments out of context or strayed from the topic, but if I'm mistaken feel free to demonstrate where specifically I did that.

              Nagios is complicated compared to many other products. The simple fact that some rather large books are available points to that fact.

              That doesn't necessarily follow. Are you really going to argue that the size of the books available indicates the complexity of the software in question?

              But as others have pointed out, and as Hyperic shows, it doesn't have to be that way. If you have two tools that have the same features, but one takes a month to install and the other a week, which do you choose?

              What Hyperic shows is that just like most of the commercial tools, if you make it easy, give it a slick presentation, and badmouth the competition you can get some people to buy/use your product. I

      • by LizardKing (5245)

        If your attitude towards Java is anything to go by then I doubt you are in an important decision making position anyway, but if you are, then I definitely wouldn't want to rely on you to look into possible solutions for systems that I develop. Let me guess: you're a PHP guy.

        • If your attitude towards Java is anything to go by then I doubt you are in an important decision making position anyway, but if you are, then I definitely wouldn't want to rely on you to look into possible solutions for systems that I develop. Let me guess: you're a PHP guy.

          Heh. Thankfully, being an adoring fan of Java isn't a requirement for "important decision making positions". I'm not sure where you got the idea that it was.

          That being said, my "attitude" wasn't towards Java, it was towards using Java for the wrong things. I run Java apps. In fact, one of my favorite apps is Zoe [zoe.nu] which is a Java app. My phone is Java based, and I even learned Java so I could write some stuff for it. My primary issue is with using it as an agent. Secondarily I would have a hard time

    • Re: (Score:3, Interesting)

      by Emrys (7536)
      You know, I was reasonably interested in Hyperic and ZenOSS when they were first announced. Competition is good, and though I'm quite happy with what I've been doing with Netsaint and then Nagios (yes, "in the Enterprise"), I was glad to look at them and see what new things they brought to the table.

      So far I've been utterly disgusted by the FUD and BS you guys are spewing, and I've lost about all interest in caring what you think you're bringing to the table. I've yet to hear any of you actually do a mean
  • I had to learn everything from code already working and tweaked the hell out of it. Was actually a fun project for my internship. Sure wish I had a book back then.
  • Looks like they've come out with another fine book. I've known those guys for a long time... now if they could just publish a book on Hyperic... ;)

