ifdesc=alias problems with MRTG cfgmaker

Today I had a problem with MRTG’s cfgmaker. When I used the parameter ifdesc=alias it didn’t pick up the correct SNMP OIDs; instead it fell back to ifdesc=desc. Very strange. I started to investigate and found out that cfgmaker is easy to debug because it is a Perl script.

To debug cfgmaker you only have to edit one line at the top of the program:
@main::DEBUG=qw(base snpo snpd);

After doing this I ran cfgmaker again and saw that it never did an SNMP walk on the alias (ifAlias). It turns out that cfgmaker has to recognize the vendor of the hardware it is polling before it will use “alias”. In my case I was polling an “Ericsson SSR 8020”, which cfgmaker didn’t recognize.

I had to add “Ericsson” to the vendor identification list. If your hardware is not in that list, you can’t use the alias option.

# vendor identification
my %vendorIDs = (
    # Add your vendor here
    # key: the vendor's sysObjectID prefix (not shown here), value: vendor name
    '' => '3com',
    '' => 'hp',
    '' => 'cisco',
    '' => 'dellLan',
    '' => 'extremenetworks',
    '' => 'foundry',
    '' => 'force10',
    '' => 'juniper',
    '' => 'nokiaipsofw',
    '' => 'portmaster',
    '' => 'ericsson',    # the new entry
);
foreach (keys %vendorIDs) {
    $DevInfo{Vendor} = $vendorIDs{$_} if ($DevInfo{sysObjectID} =~ /\Q$_\E/);
}
debug('base',"Vendor Id: $DevInfo{Vendor}");
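
If you are not sure what to put in the list for your own box, you can ask the device for its sysObjectID and keep the part up to and including the enterprise number. Here is a minimal sketch of that trimming step. The helper name enterprise_prefix and the sample OID are made up for illustration; only the 193 arc is real (Ericsson's number in the IANA enterprise registry). I am also assuming the list wants a prefix of the form 1.3.6.1.4.1.N. with a trailing dot.

```sh
#!/bin/sh
# enterprise_prefix: reduce a numeric sysObjectID to its vendor prefix,
# i.e. the first seven arcs "1.3.6.1.4.1.<enterprise>." (hypothetical helper).
enterprise_prefix() {
    oid=${1#.}    # drop the leading dot that numeric output usually has
    printf '%s.\n' "$(printf '%s' "$oid" | cut -d. -f1-7)"
}

# On a real device the OID is the value printed after "OID:" by
#   snmpget -c community -v 2c -On ip-address sysObjectID.0
# Made-up sample value, only the 193 arc is real:
enterprise_prefix ".1.3.6.1.4.1.193.999.1"
```

The printed prefix is what the regular expression in the foreach loop has to find inside the sysObjectID of the device.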

Then ericsson had to be added to the InterfaceInfo subroutine:

if ($routers->{$router}{deviceinfo}{Vendor} eq 'cisco' &&
    $routers->{$router}{deviceinfo}{sysDescr} =~ m/Version\s+(\d+\.\d+)/) {
    push @Variables, ($1 > 11.0 or $1 < 10.0 ) ? "ifAlias" : "CiscolocIfDescr";
    if ($1 > 11.2) {push @Variables, "vmVlan";};
    if ($1 > 11.3) {push @Variables, "vlanTrunkPortDynamicStatus";};
} elsif ( $routers->{$router}{deviceinfo}{Vendor} =~ /(?:hp|juniper|foundry|dellLan|force10|3com|extremenetworks|ericsson)/) {
    push @Variables, "ifAlias";
}

Then it worked! cfgmaker finally walked ifAlias for the interfaces.

MRTG and a Cisco problem with unrouted VLANs

We had a problem with MRTG (version 2.14.5) and Cisco routers (Cisco IOS Software, c7600s72033). The problem was that cfgmaker did not find the 64-bit traffic counters for “unrouted VLANs” and used the 32-bit counters instead. When the traffic on such a port went above roughly 100 Mbit/s, the 32-bit counter “rolled over” to zero between polls and the graph looked like it had been cut off.
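
The roll-over point is easy to work out: a 32-bit octet counter holds 2^32 bytes, and it wraps once per poll when that many bytes pass within one polling interval. A small sketch of the arithmetic follows, assuming MRTG's usual 5-minute (300 second) interval and a shell with 64-bit arithmetic. The function name wrap_threshold_bps is my own.

```sh
#!/bin/sh
# wrap_threshold_bps: sustained rate in bits per second at which a 32-bit
# octet counter wraps once per polling interval (hypothetical helper).
wrap_threshold_bps() {
    interval=$1                               # polling interval in seconds
    echo $(( 4294967296 * 8 / interval ))     # 2^32 octets * 8 bits / interval
}

wrap_threshold_bps 300    # prints 114532461, about 114.5 Mbit/s
```

That lines up with what we saw: once a port carried a bit more than 100 Mbit/s, the 32-bit counter could no longer be trusted.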

If you need to troubleshoot cfgmaker, use snmpwalk to look at the counters on the monitored device by hand. That way you see the traffic numbers straight from the source. These are some of the objects that cfgmaker uses.

Get basic system information
snmpwalk -c community -v 2c ip-address system

List the 32-bit traffic counters
snmpwalk -c community -v 2c ip-address ifInOctets

List the 64-bit traffic counters
snmpwalk -c community -v 2c ip-address ifHCInOctets

Check the interfaces speed
snmpwalk -c community -v 2c ip-address ifSpeed
snmpwalk -c community -v 2c ip-address ifHighSpeed

Check status of the interfaces
snmpwalk -c community -v 2c ip-address ifOperStatus
snmpwalk -c community -v 2c ip-address ifAdminStatus

List everything about the interfaces
snmpwalk -c community -v 2c ip-address interfaces

When I tested my router I could clearly see that it had 64-bit counters. However, cfgmaker couldn’t find them. Why?

IF-MIB::ifHCInOctets.1 = Counter64: 0
IF-MIB::ifHCInOctets.11 = Counter64: 1436496000
IF-MIB::ifHCInOctets.21 = Counter64: 1657770660
IF-MIB::ifHCInOctets.31 = Counter64: 8220
IF-MIB::ifHCInOctets.41 = Counter64: 219538030

A bug is responsible for this behavior. The fault may actually be Cisco’s, because they report these interfaces as having zero speed. Very strange!

The workaround is (as stated in the bug report) to change line 907 in cfgmaker to:
if((!defined $speed) or $counter eq "" or $counter !~ /\d/ or $SNMP_Session::errmsg or $Net_SNMP_util::ErrorMessage){

If you want to debug cfgmaker, change the DEBUG line near the beginning of the cfgmaker program file to:

@main::DEBUG=qw(base snpo coca);

How to monitor hard drive temperature with MRTG and FreeBSD

I really love to monitor my servers, so why shouldn’t I also monitor the temperatures of my hard drives? In my mind a cool hard drive lives longer than a hot one. However, the Google paper on failure trends in hard drives says that drive temperature doesn’t have an impact on a drive’s lifespan. Still, I want things to run cool in my server, so here is how I monitor the hard drives with MRTG and FreeBSD.

To read the hard drive temperatures we have to install smartmontools.

The smartmontools package contains two utility programs (smartctl and smartd) to control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology System (SMART) built into most modern ATA and SCSI harddisks. In many cases, these utilities will provide advanced warning of disk degradation and failure.

Install SmartMonTools
# cd /usr/ports/sysutils/smartmontools
# make install clean

Check the temperature of the drives that you want to monitor
# smartctl -a /dev/ad4 | grep Temperature_Celsius
194 Temperature_Celsius 0x0022 113 103 000 Old_age Always - 34
# smartctl -a /dev/ad6 | grep Temperature_Celsius
194 Temperature_Celsius 0x0022 148 091 000 Old_age Always - 30
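
The number we want is the last column, the raw value. Some drives report more than one temperature attribute, so here is a small sketch that picks the line by its attribute ID (194) instead of by name. The function name temp_from_smart is my own, and it assumes the ten-column attribute layout shown above.

```sh
#!/bin/sh
# temp_from_smart: read `smartctl -a` output on stdin and print the raw value
# (column 10) of SMART attribute 194, Temperature_Celsius (hypothetical helper).
temp_from_smart() {
    awk '$1 == 194 { print $10; exit }'
}

# Sample line taken from the ad4 output above:
echo "194 Temperature_Celsius 0x0022 113 103 000 Old_age Always - 34" | temp_from_smart
```

On a real system you would feed it with /usr/local/sbin/smartctl -a /dev/ad4 | temp_from_smart.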

Create a script called disktemp.sh that outputs the temperatures in an MRTG-friendly way. MRTG reads four lines from such a script: two values, then an uptime string and a target name.

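#!/bin/sh
# Line 1 (LegendI): /dev/ad4, line 2 (LegendO): /dev/ad6
/usr/local/sbin/smartctl -a /dev/ad4 | grep Temperature | awk '{print $10}' | tail -n 1
/usr/local/sbin/smartctl -a /dev/ad6 | grep Temperature | awk '{print $10}' | tail -n 1
# Line 3: uptime, line 4: target name (neither is really used here)
echo "A long time"
echo ""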
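
Before pointing MRTG at a script like this, it can be worth checking that the output really has the expected shape. This is a rough sketch of such a check. The function name check_mrtg_output is made up, and it only tests the format (two whole numbers followed by two more lines), not whether the values make sense.

```sh
#!/bin/sh
# check_mrtg_output: read a script's output on stdin and succeed only if it has
# the shape MRTG expects from an external target: two numeric lines followed by
# an uptime line and a name line (both may be empty). Hypothetical helper.
check_mrtg_output() {
    awk '
        NR <= 2 && $0 !~ /^[0-9]+$/ { bad = 1 }   # lines 1-2 must be whole numbers
        END { exit (bad || NR != 4) ? 1 : 0 }     # and there must be exactly 4 lines
    '
}

# Example with the two temperatures from the smartctl output above:
printf '34\n30\nA long time\n\n' | check_mrtg_output && echo "looks fine"
```

With the real script this would be /usr/local/etc/mrtg/disktemp.sh | check_mrtg_output && echo "looks fine".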

Then create an MRTG config file called disktemp.cfg:

### Global Config Options
WorkDir: /home/www/mrtg
Options[_]: growright, bits
EnableIPv6: no
Language: swedish

# Disk temperature
Target[disktemp]: `/usr/local/etc/mrtg/disktemp.sh`
MaxBytes[disktemp]: 100
Options[disktemp]: integer,gauge,nopercent
YLegend[disktemp]: Temp (C)
ShortLegend[disktemp]: (C)
Legend1[disktemp]: /dev/ad4
Legend2[disktemp]: /dev/ad6
LegendI[disktemp]: /dev/ad4
LegendO[disktemp]: /dev/ad6
Title[disktemp]: Hard drive Temperature

Hard drive temperature

Run MRTG a couple of times (the first runs complain about missing log files, which is normal)
/usr/local/bin/mrtg /usr/local/etc/mrtg/disktemp.cfg

Check that everything seems to be working. If it does, add the mrtg command to your crontab.
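
A crontab entry for this could look something like the line below. The five-minute schedule is my assumption (it is the interval MRTG normally expects), and the entry should go into the crontab of a user that can write to the WorkDir.

```
# Run MRTG for the disk temperatures every five minutes
*/5 * * * * /usr/local/bin/mrtg /usr/local/etc/mrtg/disktemp.cfg
```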

The result should look like this (in your language):

How to monitor CPU temperature with MRTG on FreeBSD

Have you ever wanted to monitor the CPU temperature on your server with MRTG and FreeBSD? Of course you have, if your server runs in a closet or some other special place. By following my steps you can have a nice graph like the one below. It is really relaxing to have all the vital stats from your server available online.

This small tutorial works if you have the same processor as I have (Athlon 64). The only thing you need besides MRTG is the k8temp program, which is described as “k8temp is a utility to read the temperature sensors provided by AMD K8 and K10 processors, including most Athlon 64’s and Opterons.” If you don’t have an Athlon 64 CPU, there are other programs that should work for getting the temperature as well.
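
One such alternative, as far as I know, is FreeBSD's own sensor drivers (coretemp or amdtemp), which publish readings like “45.0C” through sysctl. I have not used that for this setup, so take the following as a sketch under that assumption. The helper to_whole_degrees is my own name; it just turns such a reading into a plain whole number for the MRTG script.

```sh
#!/bin/sh
# to_whole_degrees: turn a reading such as "45.0C" into the integer "45"
# (hypothetical helper, assumes the "<number>C" format).
to_whole_degrees() {
    reading=${1%C}           # strip the trailing unit
    echo "${reading%%.*}"    # drop the fractional part
}

# On a machine with a suitable driver loaded, the reading might come from:
#   sysctl -n dev.cpu.0.temperature
to_whole_degrees "45.0C"
```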

You should have MRTG installed and be somewhat familiar with the concepts to get this working…

Check your CPU (if you don’t know which one you have)
> dmesg | grep Processor
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5000+ (2599.72-MHz K8-class CPU)

Install k8temp
> cd /usr/ports/sysutils/k8temp/
> make install clean

Then set up MRTG to use k8temp. This is my MRTG config file, cputemp.cfg:

### Global Config Options
WorkDir: /home/www/mrtg
Options[_]: growright, bits
EnableIPv6: no
Language: swedish

# CPU temperature
Target[cputemp]: `/usr/local/etc/mrtg/cputemp.sh`
MaxBytes[cputemp]: 100
Options[cputemp]: integer,gauge,nopercent
YLegend[cputemp]: Temp (C)
ShortLegend[cputemp]: (C)
Legend1[cputemp]: CPU 0:1
Legend2[cputemp]: CPU 0:2
LegendI[cputemp]: CPU 0:1
LegendO[cputemp]: CPU 0:2
Title[cputemp]: CPU Temperature

CPU temperature

Running the “k8temp” program gives you a list of the available sensors. Find the temperatures that you are interested in and edit the script below. This is the script “cputemp.sh” that the MRTG config file uses to get the CPU temperature:

#!/bin/sh
# Lines 1 and 2: the two sensor values; lines 3 and 4: uptime and name (left empty)
/usr/local/sbin/k8temp -n 0:0:1
/usr/local/sbin/k8temp -n 0:1:1
echo ""
echo ""

Run mrtg (a couple of times)
/usr/local/bin/mrtg /usr/local/etc/mrtg/cputemp.cfg

Check that everything seems to be working. If it does, add the mrtg command to your crontab.