SNMP status

I began using SNMP to monitor nearly all the machines in the Department.

Now from my desk I can keep an eye on dozens of those electronic troublemakers and spot trends and erratic behaviour. So far I have identified a lot of idle CPU time that could be used for simulations.

Here are the uptimes for two machines in the graduate room; one that people like unplugging, and another that nobody uses.

This is all disabled. It proved its point but now I'm bored.

As an aside, here is how much idle CPU time has been recorded for the processing servers in the Department. This could have been used to run R! (This number gets reset when the machine gets restarted, so it doesn't tell you how much time has been wasted since Adam was young.)

(Add gigantic value here!)

PHP code

The code for doing this from PHP is fairly easy. First we grab the state of the host:

        $state = snmprealwalk($host, "public", ".1.3.6.1.2.1.25.1", 50, 1); 

The long dotted number is an object identifier (OID) for the hrSystem group of HOST-RESOURCES-MIB, which tells interesting things about the host. It has so many numbers because OIDs come from Abstract Syntax Notation One (ASN.1), the notation shared with X.500, and they are used all over the place: in LDAP, Active Directory (which uses LDAP), SSL certificates, and SNMP.
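To make the dotted number less mysterious, here is a small sketch that labels each step of the path down the OID tree. The arc names are the standard registered ones; the describeOid() helper and the $arcNames table are made up for this illustration and are not part of PHP's SNMP extension.

```php
<?php
// Names of each node on the way down to hrSystem, keyed by the OID prefix.
$arcNames = [
    '1'                => 'iso',
    '1.3'              => 'org',
    '1.3.6'            => 'dod',
    '1.3.6.1'          => 'internet',
    '1.3.6.1.2'        => 'mgmt',
    '1.3.6.1.2.1'      => 'mib-2',
    '1.3.6.1.2.1.25'   => 'host',
    '1.3.6.1.2.1.25.1' => 'hrSystem',
];

// Hypothetical helper: turn ".1.3.6..." into "iso(1).org(3).dod(6)...".
// Prefixes missing from the table are labelled "?".
function describeOid(string $oid, array $arcNames): string
{
    $prefix = '';
    $labels = [];
    foreach (explode('.', ltrim($oid, '.')) as $part) {
        $prefix = ($prefix === '') ? $part : "$prefix.$part";
        $name = $arcNames[$prefix] ?? '?';
        $labels[] = "$name($part)";
    }
    return implode('.', $labels);
}

echo describeOid('.1.3.6.1.2.1.25.1', $arcNames), "\n";
// prints: iso(1).org(3).dod(6).internet(1).mgmt(2).mib-2(1).host(25).hrSystem(1)
```

Each number is just one branch taken at one level of the tree, which is why the full path gets so long.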

Now we need to parse this into something more useful. I use a regular expression to turn the pretty description from SNMP into the form I desire:

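// ereg_replace() was removed in PHP 7, so this uses preg_replace() instead.
// Keep the "N days, HH:MM" part of the pretty uptime string.
$uptime = preg_replace('/^.*\) ([0-9]+ .*):[0-9][0-9]\.[0-9]{2}.*$/', '$1',
        $state['host.hrSystem.hrSystemUptime.0']);
// Strip the type prefix so only the user count is left.
$users  = (int)preg_replace('/Gauge32: /', '',
        $state['host.hrSystem.hrSystemNumUsers.0']);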

It's long and it's ugly. But it works.

(I also put this code sample in the PHP documentation.)
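If you would rather have numbers than tidied-up strings, a different approach is to pull out the raw tick count in the parentheses. TimeTicks are hundredths of a second, so dividing by 100 gives seconds. This is only a sketch: parseUptime() and parseGauge() are hypothetical helper names, and the sample strings are invented, assuming values formatted like "Timeticks: (12345678) 1 day, 10:17:36.78" and "Gauge32: 3".

```php
<?php
// Hypothetical helper: return uptime in whole seconds, or null if the
// string does not contain a "(ticks)" group.
function parseUptime(string $raw): ?int
{
    if (preg_match('/\((\d+)\)/', $raw, $m) !== 1) {
        return null;
    }
    // TimeTicks count hundredths of a second.
    return intdiv((int)$m[1], 100);
}

// Hypothetical helper: return the integer from a "Gauge32: N" string,
// or null if the string has some other shape.
function parseGauge(string $raw): ?int
{
    if (preg_match('/^Gauge32: (\d+)$/', trim($raw), $m) !== 1) {
        return null;
    }
    return (int)$m[1];
}

// Invented sample values, not real readings from any machine.
$seconds = parseUptime('Timeticks: (12345678) 1 day, 10:17:36.78');
$users   = parseGauge('Gauge32: 3');

printf("up %d day(s), %d user(s)\n", intdiv($seconds, 86400), $users);
// prints: up 1 day(s), 3 user(s)
```

Working from the tick count avoids depending on how the daemon words the "1 day, 10:17" part, and returning null makes it obvious when a host replied with something unexpected.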

Other

Other uses for SNMP include making pretty graphs with either MRTG or the fancier, more interactive, supertastic Cacti.

It makes my job a lot easier when I can look at an automatically updating webpage and see that the systems are running properly.

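SNMP is peculiar in that it has two uptimes. sysUpTime is how long the SNMP daemon has been running, so it will not accurately reflect how long the computer has been turned on - but it is a pretty good approximation since the daemon is started automatically on boot. hrSystemUptime, the one used above, is defined in HOST-RESOURCES-MIB as the time since the host itself was last initialised.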

Now to get SNMP running on the fridge ...


Stephen Cope 2005-01-12/2007-09-06
http://www.stat.auckland.ac.nz/~kimihia/snmp