FreeRADIUS Monitoring with Nagios

After doing a few setups using my buddy Jedda’s excellent article on configuring basic RADIUS on OS X 10.8 Server, I decided I wanted a way to monitor the customer’s FreeRADIUS server to ensure it’s up, running, and processing requests. Given that we use GroundWork for monitoring, I decided to write a bash script that verifies the process is running and that it’s processing authentication requests. Cue code!

Take a look at the code on GitHub

ps aux -o tty | grep "/usr/sbin/radius"

Initially, we do a quick process check to make sure FreeRADIUS is even running. If it’s not running, we exit with a critical status (there’s no point proceeding any further in the script). This script was written for OS X 10.8 Mountain Lion Server, so adjust the path to radius to suit your system.
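As a rough sketch of that process check (not the exact code from the script), something like the following would do the job. The function names are made up for illustration, the exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL), and the [/] bracket in the pattern is an addition here so that grep doesn't match its own command line.

```shell
#!/bin/bash
# Nagios plugin exit codes (standard convention).
STATE_OK=0
STATE_CRITICAL=2

# Succeeds if any line of the process listing on stdin mentions the daemon path.
# "[/]usr" still matches "/usr", but the grep process itself shows up in the
# listing as "[/]usr/sbin/radius", which the pattern does not match.
radius_in_listing() {
    grep -q "[/]usr/sbin/radius"
}

# Hypothetical helper: prints a status line and returns a Nagios exit code.
check_radius_process() {
    if ps aux | radius_in_listing; then
        return "$STATE_OK"
    fi
    echo "CRITICAL - FreeRADIUS process not found"
    return "$STATE_CRITICAL"
}

# Typical use near the top of a check script:
#   check_radius_process || exit $?
```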

OK, so we’ve verified the radiusd process is running. Now it’s time to try to authenticate against the FreeRADIUS server using valid login credentials (in my case, users in the com.apple.access_radius group).

echo "User-Name=<username>,User-Password=<password>,Framed-Protocol=PPP " | radclient -x -r 1 -t 2 localhost:1812 auth testing123 2> /tmp/radius_error

Running the radclient command by itself doesn’t work; you have to pipe RADIUS authentication attribute/value pairs into it. We use PPP for the Framed-Protocol as we are providing a username and password and using the Point-to-Point Protocol. It’s worth taking a look at RFC 2865 for more information regarding RADIUS and user authentication.

After the authentication and server credentials are supplied, we redirect standard error (stderr) to a temporary log file (stored at /tmp/radius_error) so it doesn’t cause the script to produce any unexpected output. I also assign the contents of that temporary file to a variable so we can see what the error was. An incorrect shared secret writes an error to stderr, but wrong credentials just produce output on stdout.
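To make the stdout/stderr split concrete, here is a minimal sketch of how the two streams could be captured separately. The function and variable names (build_auth_attrs, run_radius_auth, RADIUS_OUT, RADIUS_ERR) are hypothetical. The radclient flags and the /tmp/radius_error path are the ones used in the command above.

```shell
#!/bin/bash
# Build the attribute/value list that gets piped into radclient.
# $1 = username, $2 = password
build_auth_attrs() {
    printf 'User-Name=%s,User-Password=%s,Framed-Protocol=PPP\n' "$1" "$2"
}

# Run one authentication attempt and keep stdout and stderr apart.
# $1 = username, $2 = password, $3 = host, $4 = port, $5 = shared secret
# Afterwards RADIUS_OUT holds radclient's stdout and RADIUS_ERR its stderr.
run_radius_auth() {
    RADIUS_OUT=$(build_auth_attrs "$1" "$2" |
        radclient -x -r 1 -t 2 "$3:$4" auth "$5" 2> /tmp/radius_error)
    RADIUS_ERR=$(cat /tmp/radius_error)
}

# Example:
#   run_radius_auth testuser test1234 localhost 1812 testing123
#   echo "stdout: $RADIUS_OUT"
#   echo "stderr: $RADIUS_ERR"
```

A fixed path in /tmp is fine for a single check on one box, but if several checks could run at once, a file created with mktemp would avoid them overwriting each other.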

The next part of our script is an if statement that checks what our authentication attempt has returned, then processes it accordingly. We check whether it was successful, used the wrong shared secret, or used the wrong authentication details. If there’s any other response, I throw a generic error.
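The decision logic might look roughly like the sketch below. It is not the script's actual code: the function name is hypothetical, the match strings ("Access-Accept", "Access-Reject", "hared secret") are assumptions about how radclient words its output and should be checked against your FreeRADIUS version, and treating both failures as CRITICAL with everything else as UNKNOWN is just one reasonable mapping.

```shell
#!/bin/bash
# Map a radclient result onto a Nagios status line and exit code.
# $1 = text radclient wrote to stdout, $2 = text it wrote to stderr.
classify_radius_result() {
    # 1. Successful authentication (assumed to appear on stdout).
    case "$1" in
        *Access-Accept*)
            echo "OK - authentication accepted"
            return 0 ;;
    esac
    # 2. Wrong shared secret (assumed to be reported on stderr).
    case "$2" in
        *"hared secret"*)
            echo "CRITICAL - shared secret problem: $2"
            return 2 ;;
    esac
    # 3. Wrong username or password (assumed to appear on stdout).
    case "$1" in
        *Access-Reject*)
            echo "CRITICAL - username or password rejected"
            return 2 ;;
    esac
    # 4. Anything else is reported as a generic error.
    echo "UNKNOWN - unexpected radclient response: $2"
    return 3
}

# Example, with the two streams already captured into variables:
#   classify_radius_result "$stdout_text" "$stderr_text"
#   exit $?
```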

./check_radius.sh -u username -p password -h localhost -a 1812 -s sharedsecret

To run the script, enter the command above (this assumes you’re running with elevated privileges, e.g. sudo -s; otherwise prepend the command with sudo). This should be run on the same server as your FreeRADIUS server, or on an authorised Network Access Server (NAS) client.
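For anyone adapting the script, option handling along these lines would accept the flags used in the GroundWork command definition further down (-u, -p, -h, -a for the auth port, -s). This is a sketch using getopts: the function name, variable names, and the localhost/1812 defaults are illustrative assumptions, not taken from the script itself.

```shell
#!/bin/bash
# Parse the check's command-line flags into variables.
# Returns 3 (Nagios UNKNOWN) on bad or missing arguments.
parse_radius_args() {
    RADIUS_USER=""
    RADIUS_PASS=""
    RADIUS_SECRET=""
    RADIUS_HOST="localhost"   # assumed default
    RADIUS_PORT="1812"        # assumed default (standard RADIUS auth port)
    OPTIND=1                  # reset so the function can be called more than once
    while getopts "u:p:h:a:s:" opt; do
        case "$opt" in
            u) RADIUS_USER=$OPTARG ;;
            p) RADIUS_PASS=$OPTARG ;;
            h) RADIUS_HOST=$OPTARG ;;
            a) RADIUS_PORT=$OPTARG ;;
            s) RADIUS_SECRET=$OPTARG ;;
            *) echo "UNKNOWN - usage: -u user -p password -h host -a port -s secret"
               return 3 ;;
        esac
    done
    if [ -z "$RADIUS_USER" ] || [ -z "$RADIUS_PASS" ] || [ -z "$RADIUS_SECRET" ]; then
        echo "UNKNOWN - username, password and shared secret are required"
        return 3
    fi
}

# Example at the top of a script:
#   parse_radius_args "$@" || exit $?
```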

GroundWork Configuration

First off, we need a command that does the actual check. See below for the command details:

  • Name: check_by_ssh_radius
  • Type: check
  • Command line: $USER1$/check_by_ssh -H $HOSTADDRESS$ -t 60 -l "$USER17$" -C "sudo $USER22$/check_radius.sh -u $ARG1$ -p $ARG2$ -h $ARG3$ -a $ARG4$ -s $ARG5$"
  • Usage: check_by_ssh_radius!ARG1!ARG2!ARG3!ARG4!ARG5

Next, we need a service that uses the command for the check.

  • Name: ssh_radius
  • Service template: use whichever you prefer
  • Check command: check_by_ssh_radius
  • Command line: check_by_ssh_radius!username!password!radius.server!radius.auth.port!shared.secret

Now add this service to a host, using the following command line as an example (as an entry for the host):

check_by_ssh_radius!testuser!test1234!localhost!1812!testing123

Important Note For Testing

One mistake I made when testing on the FreeRADIUS server was trying to add a client for 127.0.0.1. When I added this client, the damn server wouldn’t start because there was a duplicate entry! FreeRADIUS automatically sets up the local server as a client (NAS), with the default shared secret of testing123. Once you’ve finished testing, you can edit clients.conf to disable localhost access to the FreeRADIUS server. Please note that if you intend to run this script on the same server as FreeRADIUS, don’t disable localhost access!
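If you want a quick sanity check before restarting the server, a small helper like this can count the active client entries for 127.0.0.1. It is only a sketch: the function name is hypothetical, the clients.conf location varies between installs so it is passed in as an argument, and it assumes clients are declared with either an "ipaddr = 127.0.0.1" line or a "client 127.0.0.1" block header.

```shell
#!/bin/bash
# Count uncommented client entries that point at 127.0.0.1.
# $1 = path to clients.conf
# A result greater than 1 suggests a duplicate of the built-in localhost client.
count_localhost_clients() {
    grep -E -c '^[[:space:]]*(ipaddr[[:space:]]*=[[:space:]]*127\.0\.0\.1|client[[:space:]]+127\.0\.0\.1)' "$1"
}

# Example (substitute the path for your install):
#   count_localhost_clients /path/to/clients.conf
```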

Performance Data - A Maybe

I had planned to add performance data, but I felt the status function through radclient was too unreliable to query. I’ll possibly look into performance data again in a few months.