Watchdog script to supervise /proc/user_beancounters

I’m in the process of setting up my new server. The services are running inside OpenVZ containers. The problem is that you don’t know in advance how to set the resource parameters. You have to go by trial and error, because you don’t know how many resources the services consume under production load.

After migrating some domains, my postfix process died last night because of a lack of privvmpages. So I wrote a script that monitors changes in /proc/user_beancounters of every container. The script runs from crontab every 5 minutes and compares the UBC failcounter values to the last known values. If there is a difference, a mail is sent to root.

Feel free to re-write the script in your favorite language 😉

#!/bin/sh
# get running VEs
VES=`vzlist -H -o veid`
MAILFILE=/var/run/ubcWatchdog_mail.txt
 
# check every running VE
for VE in $VES; do
  # create file if it does not exist
  NEWFCFILE=/var/run/ubcWatchdog_$VE.new.txt
  OLDFCFILE=/var/run/ubcWatchdog_$VE.txt
  touch $NEWFCFILE
  touch $OLDFCFILE
 
  # save current failcounter
  vzctl exec $VE 'cat /proc/user_beancounters' | cut -b 13-129 | sed 's/ \+/ /g' | awk '{print $6 "\t" $1}' > $NEWFCFILE
 
  # compare to reference failcounter
  if ! diff -U 0 -d $OLDFCFILE $NEWFCFILE > /dev/null; then
    # yepp, something failed
    echo "****************************************" >> $MAILFILE
    echo "UBC fail in VE $VE!" >> $MAILFILE
    diff -U 0 -d $OLDFCFILE $NEWFCFILE | grep "^[+-][a-zA-Z0-9]" >> $MAILFILE
    echo "****************************************" >> $MAILFILE
    echo "" >> $MAILFILE
    echo "UBC now:" >> $MAILFILE
    vzctl exec $VE 'cat /proc/user_beancounters' >> $MAILFILE
  fi
 
  # save new failcounter as reference
  mv $NEWFCFILE $OLDFCFILE
done;
 
# send mail to root
if [ -f $MAILFILE ]; then
  mail -s "UBC Fail!" root < $MAILFILE
  rm $MAILFILE
fi
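
The script above extracts the failcounters using fixed byte offsets (cut -b 13-129). As a rough sketch of an alternative, the same "failcnt, resource" pairs could be picked out by counting fields from the end of each line. The function name ubc_failcnt is made up for illustration, and the sketch assumes the usual layout of /proc/user_beancounters: a "Version:" line, a header line starting with "uid", and resource lines whose last column is failcnt.

```sh
#!/bin/sh
# Hypothetical helper (not part of the original script): reads
# /proc/user_beancounters content on stdin and prints "failcnt<TAB>resource".
# Assumes failcnt is the last column and the resource name is the
# sixth column counted from the end.
ubc_failcnt() {
  awk '
    $1 == "Version:" || $1 == "uid" { next }   # skip version and header lines
    NF >= 6 { print $NF "\t" $(NF-5) }         # works with or without the "uid:" prefix
  '
}

# Possible usage inside the loop of the script above:
#   vzctl exec $VE 'cat /proc/user_beancounters' | ubc_failcnt > $NEWFCFILE
```

Counting from the end of the line avoids depending on the exact column widths, which may differ between kernel versions.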

The mail looks like this:

****************************************
UBC fail in VE 20!
-205 kmemsize
+59936 kmemsize
****************************************

This one is the result of a fork bomb that exploded inside VE 20 🙂
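
In the mail above, the old and the new counter value show up as two separate diff lines. If you would rather see the increase on a single line, something like the following sketch might do. The function name ubc_delta is hypothetical, and it assumes both snapshot files contain "failcnt<TAB>resource" lines as written by the script.

```sh
#!/bin/sh
# Hypothetical helper (not part of the original script).
# $1: old snapshot file, $2: new snapshot file, each with lines of
# "failcnt<TAB>resource". Prints one line per resource whose counter grew.
# Note: if the old snapshot is empty (first run), nothing is reported.
ubc_delta() {
  awk '
    NR == FNR { old[$2] = $1; next }           # first file: remember old counters
    ($2 in old) && ($1 + 0) > (old[$2] + 0) {  # second file: counter increased
      printf "%s: %d -> %d (+%d)\n", $2, old[$2], $1, $1 - old[$2]
    }
  ' "$1" "$2"
}

# Possible usage inside the loop, before the mv:
#   ubc_delta $OLDFCFILE $NEWFCFILE >> $MAILFILE
```

Non-numeric lines such as the column header evaluate to 0 on both sides and are therefore never reported.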