NagiosInventory

Hey! I am working with Nagios again at work. This time I am doing a migration from Nagios to OP5 for a very large Swedish company. A lot has changed since I last worked with monitoring software in 2009.

Before we could do any real work in the project, we needed a picture of what the current Nagios system monitors. I could not find a good inventory tool, so I wrote one myself. It is now available on GitHub:

NagiosInventory

It is very useful because it can export the entire Nagios configuration to Excel in no time.

Nagios plugin for checking ping in bash

Everyone who uses Nagios knows the basic check_ping plugin. However, today I needed a light version in bash with the same functionality that is easy to modify…

The result is shown below. My Nagios ping plugin checks the average round-trip time (rtt avg) against the warning and critical threshold values. I did not bother to check for packet loss because that was not the purpose of this script. It only checks the average ping time, but it could easily be customized to check other things as well.

nagios-check-ping.sh

#!/bin/bash

# This script pings a host and compares warning and critical thresholds against avg rtt (ms)
# Syntax: nagios-check-ping.sh HOST WARNING CRITICAL
# Example: nagios-check-ping.sh www.sunet.se 10 20

if [ ! -n "$1" ]
then
	echo "UNKNOWN: Missing argument HOSTNAME..."
	exit 3
fi   

if [ ! -n "$2" ]
then
	echo "UNKNOWN: Missing argument WARNING..."
	exit 3
fi   


if [ ! -n "$3" ]
then
	echo "UNKNOWN: Missing argument CRITICAL..."
	exit 3
fi   

# Take the integer part of the average rtt from the ping summary line
AVG=$(ping -n -c 5 "$1" | awk -F/ '/^rtt/ { print $5 }' | awk -F '.' '{ print $1; }')

if [ ! -n "$AVG" ]
then
	echo "CRITICAL: Error pinging"
	exit 2
elif [ "$AVG" -le "$2" ]
then
	SC="OK"
	EX=0
elif [ "$AVG" -le "$3" ]
then
	SC="WARNING"
	EX=1
else
	SC="CRITICAL"
	EX=2
fi

echo "$SC: Average response time is $AVG ms"
exit $EX

Run it like this:
> nagios-check-ping.sh www.sunet.se 10 20

Example output:
WARNING: Average response time is 20 ms
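
If you want to customize the script, it can help to pull the parsing and the threshold comparison into functions, so they can be tried out without pinging anything. Here is a minimal sketch of that idea. The function names parse_avg and classify are my own for this example, and the parser assumes the summary line that Linux ping (iputils) prints, starting with "rtt min/avg/max/mdev".

```shell
#!/bin/bash

# Hypothetical helper: read ping output on stdin and print the integer
# part of the average rtt. Assumes a Linux iputils summary line such as
# "rtt min/avg/max/mdev = 10.123/20.456/30.789/5.000 ms".
parse_avg() {
	awk -F/ '/^rtt/ { sub(/\..*/, "", $5); print $5 }'
}

# Hypothetical helper: map an average rtt to a Nagios state and exit code.
# Usage: classify AVG WARNING CRITICAL
classify() {
	local avg="$1" warn="$2" crit="$3"
	if [ -z "$avg" ]; then
		echo "CRITICAL 2"    # no rtt line found, ping failed
	elif [ "$avg" -le "$warn" ]; then
		echo "OK 0"
	elif [ "$avg" -le "$crit" ]; then
		echo "WARNING 1"
	else
		echo "CRITICAL 2"
	fi
}
```

With these in place, something like classify "$(ping -n -c 5 www.sunet.se | parse_avg)" 10 20 prints the state name and exit code on one line. Adding a packet loss check would then be a matter of writing one more small parser next to parse_avg.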

Nagios check_nt over ssh-tunnel

I am a big fan of Nagios for host and service monitoring, and this is my first post on the subject. My task was to check services on a Windows server with Nagios and NSClient++. The firewall only allowed SSH, so I could not connect from my Nagios server to port 12489, which NSClient listens on. The only way to solve the problem was to use an SSH tunnel that I can open and close whenever I need to.

Workflow for my solution
1. Nagios decides that a service should be checked.
2. Nagios executes the check_nt_by_ssh_tunnel check command with additional parameters.
3. The script creates an SSH tunnel.
4. The script checks the Windows server with the built-in Nagios command check_nt.
5. The script returns the check_nt output.
6. The script closes the SSH tunnel.
7. Nagios does its processing.

Prerequisites
* A fully working Nagios installation
* NSClient installed and working on the Windows Server that should be monitored
* SSH autologin is configured between the two machines
* Read my other article for details about the SSH-tunnel used in this solution.
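
Before going further, it is worth confirming that the SSH autologin really works without a prompt, because a password prompt would make the check hang when Nagios runs it. A minimal sketch follows. The function name check_autologin is made up for this example, and I am assuming the SSH server on the Windows machine accepts a plain "exit" as the remote command.

```shell
#!/bin/bash

# Hypothetical helper: return 0 if USER@HOST accepts a key-based login
# without asking for anything.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# ConnectTimeout=5 gives up after 5 seconds if the host is unreachable.
check_autologin() {
	local user="$1" host="$2"
	ssh -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" exit
}
```

Run it as the same user that Nagios runs as, for example: check_autologin someusername 83.121.233.2 && echo "autologin works"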

First you have to create the check-script on the Nagios server (in the Nagios libexec-folder). See below:

check_nt_by_ssh_tunnel.sh

#!/bin/bash

# $1 = HOSTNAME, ex: 83.121.233.2
# $2 = LOCAL PORT, ex: 14880
# $3 = USERNAME, ex: someusername
# $4 = CHECK PARAMETERS, ex: -w 80 -c 90 -v MEMUSE

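# Exit with 3 (UNKNOWN) on missing arguments so Nagios does not report OK
if [ -z "$1" ]
then
        echo "UNKNOWN: Missing HOSTNAME"
        exit 3
fi

if [ -z "$2" ]
then
        echo "UNKNOWN: Missing LOCAL PORT"
        exit 3
fi

if [ -z "$3" ]
then
        echo "UNKNOWN: Missing USERNAME"
        exit 3
fi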


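# Open ssh-tunnel and wait for it to open
# (-f already puts ssh in the background, so no trailing & is needed)
ssh -f -N -L "$2:localhost:12489" "$3@$1"
sleep 30

# Run check_nt command and remember its exit code for Nagios
CHECK="/usr/local/nagios/libexec/check_nt -H localhost -p $2 $4"
#echo $CHECK
eval $CHECK
RC=$?


# Close ssh-tunnel
sleep 5
# Take the whole pid column instead of a fixed character width
PID=$(ps -eo pid,args | grep "ssh -f -N -L $2:localhost" | grep -v 'grep' | awk '{ print $1 }')
#echo $PID
kill -9 $PID

# Exit with the status of check_nt, not the status of kill
exit $RC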
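
The fixed "sleep 30" is the simplest way to wait for the tunnel, but it makes every check take at least half a minute. One alternative is to poll the local port until it accepts a connection. Below is a minimal sketch of that, not something I have in the script above. The function name wait_for_port is made up, and it assumes a bash that supports the /dev/tcp pseudo-device.

```shell
#!/bin/bash

# Hypothetical helper: wait until HOST:PORT accepts a TCP connection.
# Usage: wait_for_port HOST PORT TIMEOUT_SECONDS
# Returns 0 as soon as a connection succeeds, 1 if the timeout passes.
wait_for_port() {
	local host="$1" port="$2" timeout="$3" waited=0
	while [ "$waited" -lt "$timeout" ]; do
		# Open and immediately close a connection inside a subshell.
		# Assumes bash was built with /dev/tcp support.
		if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
			return 0
		fi
		sleep 1
		waited=$((waited + 1))
	done
	return 1
}
```

In the script, the "sleep 30" line could then be replaced with something like: wait_for_port localhost "$2" 30 || { echo "UNKNOWN: ssh-tunnel did not open"; exit 3; }. Note that each successful probe opens and closes a short connection through the tunnel to NSClient.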

Make it executable:
chmod 755 check_nt_by_ssh_tunnel.sh

Then verify that it works by running the command:
./check_nt_by_ssh_tunnel.sh 83.121.233.2 14880 someusername "-w 80 -c 90 -v MEMUSE"

You should then see either valid Nagios plugin output or an error message. Make sure that it works! You may have to modify the script depending on your *nix OS.

Next, modify Nagios checkcommands.cfg or similar and add the following. This makes the command available for use in Nagios.

# USAGE: check_nt_by_ssh_tunnel!11101!username!"-l 5,80,90 -v CPULOAD"
define command{
	command_name		check_nt_by_ssh_tunnel
	command_line		PATH_TO_COMMAND/check_nt_by_ssh_tunnel.sh $HOSTADDRESS$ $ARG1$ $ARG2$ $ARG3$
}

Next add a service definition that uses the check-command.

define service{
	use			generic-service
	host_name		MY_HOSTNAME
	service_description	Memory usage
	check_command		check_nt_by_ssh_tunnel!11100!username!"-w 80 -c 90 -v MEMUSE"
}

The last step is to restart Nagios and check that it works.

* I recommend using a different local port number in each service definition. If two checks run at the same time with the same port, the second tunnel cannot bind it, so make the port unique to each service check.