I’m in the process of setting up my new server. The services run inside OpenVZ containers. The problem is that you don’t know in advance how to set the resource parameters. You have to proceed by trial and error, because you don’t know how many resources the services consume under production load.
After migrating some domains, my postfix process died last night because of a lack of privvmpages. So I wrote a script that monitors changes in /proc/user_beancounters of every container. The script runs from crontab every 5 minutes and compares the UBC failcounter values to the last known values. If there is a difference, a mail is sent to root.
Feel free to re-write the script in your favorite language 😉
#!/bin/sh

# get running VEs
VES=`vzlist -H -o veid`
MAILFILE=/var/run/ubcWatchdog_mail.txt

# check every running VE
for VE in $VES; do
    # create files if they do not exist
    NEWFCFILE=/var/run/ubcWatchdog_$VE.new.txt
    OLDFCFILE=/var/run/ubcWatchdog_$VE.txt
    touch $NEWFCFILE
    touch $OLDFCFILE

    # save current failcounter as "failcnt<TAB>resource"
    vzctl exec $VE 'cat /proc/user_beancounters' | cut -b 13-129 | sed 's/ \+/ /g' | awk '{print $6 "\t" $1}' > $NEWFCFILE

    # compare to reference failcounter
    diff -U 0 -d $OLDFCFILE $NEWFCFILE > /dev/null
    if [ $? != 0 ]; then
        # yepp, something failed
        echo "****************************************" >> $MAILFILE
        echo "UBC fail in VE $VE!" >> $MAILFILE
        diff -U 0 -d $OLDFCFILE $NEWFCFILE | grep "^[+,-][a-z,A-Z,0-9]\+" >> $MAILFILE
        echo "****************************************" >> $MAILFILE
        echo "" >> $MAILFILE
        echo "UBC now:" >> $MAILFILE
        vzctl exec $VE 'cat /proc/user_beancounters' >> $MAILFILE
    fi

    # save new failcounter as reference
    mv $NEWFCFILE $OLDFCFILE
done;

# send mail to root
if [ -f $MAILFILE ]; then
    mail -s "UBC Fail!" root < $MAILFILE
    rm $MAILFILE
fi
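The cut/sed/awk pipeline in the script depends on fixed byte offsets in /proc/user_beancounters. As a rough alternative sketch, the same "failcnt, resource" pairs can be pulled out by counting fields from the right instead. The function name ubc_failcounters and the sample numbers are made up for illustration, and unlike the original pipeline this variant skips the header line:

```shell
#!/bin/sh
# Sketch (hypothetical helper): print "failcnt<TAB>resource" for every
# beancounter row read from stdin. failcnt is the last column and the
# resource name sits five columns before it, with or without a "uid:" prefix.
ubc_failcounters() {
    awk 'NF >= 6 && $NF ~ /^[0-9]+$/ { print $NF "\t" $(NF-5) }'
}

# Invented sample in the layout of /proc/user_beancounters
sample='Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
       20:  kmemsize        2345678    3456789   14372700   14790164        205
            lockedpages           0          0        256        256          0
            privvmpages       40000      65000      65536      69632          3'

printf '%s\n' "$sample" | ubc_failcounters
```

In the watchdog this would stand in for the `cut | sed | awk` part, e.g. `vzctl exec $VE 'cat /proc/user_beancounters' | ubc_failcounters > $NEWFCFILE`.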
The mail looks like this:
****************************************
UBC fail in VE 20!
-205 kmemsize
+59936 kmemsize
****************************************
It is the result of a fork bomb that exploded inside VE 20 🙂
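In the mail above, the "-" line is the old failcounter and the "+" line the new one, so kmemsize failed 59936 - 205 = 59731 more times since the previous run. If you would rather have the mail state that directly, something along these lines might work on the two snapshot files. This is only a sketch: ubc_delta is a hypothetical helper, not part of the script above, and it assumes the "failcnt<TAB>resource" format the script writes.

```shell
#!/bin/sh
# Sketch (hypothetical helper): report resources whose failcounter changed.
# $1 = old snapshot, $2 = new snapshot, both with "failcnt<TAB>resource" lines.
ubc_delta() {
    awk 'FILENAME == ARGV[1] { old[$2] = $1; next }   # remember old counters
         ($2 in old) && $1 != old[$2] {
             printf "%s: %d -> %d (%+d)\n", $2, old[$2], $1, $1 - old[$2]
         }' "$1" "$2"
}

# Example with the values from the mail above
old=$(mktemp)
new=$(mktemp)
printf '205\tkmemsize\n0\tprivvmpages\n' > "$old"
printf '59936\tkmemsize\n0\tprivvmpages\n' > "$new"
ubc_delta "$old" "$new"
# prints: kmemsize: 205 -> 59936 (+59731)
rm -f "$old" "$new"
```

Resources that are missing from the old snapshot are not reported, so an empty reference file on the very first run produces no output.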