Check for high memory usage or hung status in Dell Boomi

Oct 19, 2021 | Linux, Server administration, Technology

I needed to add a check today for Dell Boomi (now known simply as “Boomi”) because it has failed twice in the past few months. Each time it failed without actually stopping the service. Because Boomi runs within a Java virtual machine, it doesn’t necessarily expose its problems to the host operating system, so monitoring systems such as Nagios don’t always pick up the accurate status.

The best way to determine whether Boomi is misbehaving is to check the Boomi container logs. If the log contains memory errors, the check exits with code 2, which tells Nagios the service is in a critical state. If no log lines have been written in the past two minutes, the check also exits with code 2, because the Boomi Atom has most likely hung.
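
Nagios reads a plugin's state from its exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. Here is a minimal sketch of that convention. The function name nagios_state is made up for illustration and is not part of Nagios.

```shell
#!/bin/bash
# Map a plugin exit code to the state Nagios displays for it.
# nagios_state is a hypothetical helper, not a Nagios command.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "not a standard plugin exit code" ;;
    esac
}

nagios_state 2   # prints CRITICAL
```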

Create the scripts on the Nagios Host and the Boomi Atom

Add these to the libexec directory, probably /usr/local/nagios/libexec/


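#!/bin/bash
# check_boomi_memory.sh
# CRITICAL if 'Low memory' was logged in roughly the last 10 minutes.
LogFile="/opt/Boomi_AtomSphere/Atom//logs/$(date +%Y_%m_%d).container.log"
# Build one timestamp prefix per minute, so the check does not depend on
# a log line existing at one exact second.
Pattern=""
for i in $(seq 0 10)
do
Pattern="${Pattern}${Pattern:+|}$(date --date="$i minutes ago" '+%d %b %Y %H:%M')"
done
BoomiMemoryErrors="$(grep -E "^($Pattern)" "$LogFile" 2>/dev/null | grep 'Low memory')"
if [ -z "$BoomiMemoryErrors" ]
then
echo "Boomi has no memory errors."
exit 0
else
echo "Boomi has encountered memory errors"
exit 2
fi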


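#!/bin/bash
# check_boomi_hung.sh
# CRITICAL if today's container log has not been written to in the last two minutes.
LogFile="/opt/Boomi_AtomSphere/Atom//logs/$(date +%Y_%m_%d).container.log"
# Use the file's modification time instead of searching for one exact timestamp,
# which only matches if a line was logged at that precise second.
AnyBoomiLogsWritten="$(find "$LogFile" -mmin -2 2>/dev/null)"
if [ -z "$AnyBoomiLogsWritten" ]
then
echo "Boomi has stopped"
exit 2
else
echo "Boomi is Running correctly"
exit 0
fi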
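
Before wiring anything into Nagios, it can help to try the memory check logic against a throwaway file rather than a live Boomi log. This is only a sketch: check_low_memory is a hypothetical helper, and the sample log line is invented for the demo, not copied from a real container log.

```shell
#!/bin/bash
# check_low_memory LOGFILE: print a status line and return 2 (CRITICAL)
# if the file contains 'Low memory', otherwise return 0 (OK).
# Hypothetical helper for experimenting; it is a function so it can be sourced.
check_low_memory() {
    if grep -q 'Low memory' "$1" 2>/dev/null
    then
        echo "Boomi has encountered memory errors"
        return 2
    else
        echo "Boomi has no memory errors."
        return 0
    fi
}

# Build a temporary sample log (made-up line format) and run the check on it.
SampleLog="$(mktemp)"
echo "19 Oct 2021 10:15:30 WARNING Low memory detected" > "$SampleLog"
rc=0
check_low_memory "$SampleLog" || rc=$?
echo "exit code: $rc"
rm -f "$SampleLog"
```

With the sample line in place this prints the memory error message followed by "exit code: 2".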

On the Nagios server and the Boomi atom

First, find the commands.cfg file, using either the locate command or find -name, then add the following to the bottom of it.


define command{
command_name check_boomi_memory
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_boomi_memory
}


define command{
command_name check_boomi_hung
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_boomi_hung
}
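
If you are not sure where commands.cfg lives, a small wrapper around find can search the usual install prefixes. This is a sketch: find_nagios_file is a hypothetical helper name, and the directories in the usage line are only the common defaults, so adjust them to your layout.

```shell
#!/bin/bash
# find_nagios_file NAME DIR...: print every file called NAME under the given directories.
# Hypothetical helper; errors such as missing directories are silenced.
find_nagios_file() {
    local name="$1"
    shift
    find "$@" -name "$name" 2>/dev/null
}

# Usage (directories are assumptions):
# find_nagios_file commands.cfg /usr/local/nagios /etc/nagios
```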

Add these checks to the Boomi Atom host file within your Nagios servers directory

I’m going to assume you know where that is. It’s usually either /usr/local/nagios/servers or /etc/nagios/servers/


define service{
use generic-service
host_name #add your hostname
service_description check_boomi_hung
contacts #Add your contacts here.
check_command check_nrpe!check_boomi_hung
}


define service{
use generic-service
host_name #add your hostname
service_description check_boomi_memory
contacts #Add your contacts here.
check_command check_nrpe!check_boomi_memory
}

Update the NRPE config with these new commands

This file is likely in /usr/local/nagios/etc/nrpe.cfg


command[check_boomi_hung]=/usr/local/nagios/libexec/check_boomi_hung.sh
command[check_boomi_memory]=/usr/local/nagios/libexec/check_boomi_memory.sh
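
NRPE only answers for command names it finds in its config, so it is worth confirming the entries are really there before testing from the Nagios server. A rough sketch, where nrpe_has_command is a hypothetical helper and the path in the usage line is the likely location mentioned above:

```shell
#!/bin/bash
# nrpe_has_command CFG NAME: succeed if CFG contains a line starting with command[NAME]=
# Hypothetical helper, not part of NRPE.
nrpe_has_command() {
    grep -q "^command\[$2\]=" "$1"
}

# Usage (path is an assumption):
# nrpe_has_command /usr/local/nagios/etc/nrpe.cfg check_boomi_hung && echo "defined"
```

The NRPE daemon has to be restarted or reloaded after the edit before it will serve the new commands. The service name varies by distribution (often nrpe or nagios-nrpe-server).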

Verify that your new checks work

You will do this from the Nagios server. Make sure you reload the config first.

Quick tip:
Use the following command to check the validity of your config.

/usr/sbin/nagios -v /etc/nagios/nagios.cfg

To reload Nagios, use the usual systemctl reload nagios.service, then run the checks by hand:


/usr/local/nagios/libexec/check_nrpe -H <boomi_atom_host> -c check_boomi_hung
/usr/local/nagios/libexec/check_nrpe -H <boomi_atom_host> -c check_boomi_memory
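
When running a check by hand, the printed message is only half the story, because Nagios acts on the exit code. A small sketch for showing both, where run_and_report is a hypothetical wrapper name:

```shell
#!/bin/bash
# run_and_report CMD [ARGS...]: run the command, let its output through,
# then print the exit code that Nagios would act on.
run_and_report() {
    local rc=0
    "$@" || rc=$?
    echo "exit code: $rc"
}

# Usage: prefix the check command with the wrapper, for example
# run_and_report /usr/local/nagios/libexec/check_nrpe -H your_host -c check_boomi_hung
```

An exit code of 0 means the check passed and 2 means it reported a critical state.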
