Custom shell command monitoring with Anturis
An overview
Anturis maintains a large and growing library of monitor types available out-of-the-box. It includes typical system resources, such as CPU and memory, as well as more complex application-specific metrics, such as those for Apache Web server and MySQL RDBMS.
However, no monitoring system can cover every possible case: each IT infrastructure is a zoo of its own, with a unique blend of software modules and services. That’s why good monitoring solutions provide ways to extend their monitoring capabilities.
The Anturis monitoring service is no exception, offering simple yet powerful ways to support your unique monitoring needs. If you know shell commands or have some programming skill, you can monitor, graph and get alerted about virtually anything, from CPU temperature to the QoS of a video stream.
How it works
Anturis has a special monitor type called Custom Shell Command Monitor. It is configured with an arbitrary Linux or Windows shell command, which is executed by the Anturis Private Agent deployed on a selected server. The command output is collected and uploaded to the Anturis backend for interpretation. In place of a shell command you can use a shell script, a PowerShell command or script, or an executable. You specify the criteria for “success” and “failure”, and because the command is launched repeatedly, you are alerted promptly if something goes wrong.
There are two ways to check the output of the command. The first is to treat it as a number (“a measurement”); in this case Anturis graphs the value and checks that it stays within user-defined bounds. The second is to treat the output as text and check whether a certain word, such as “ok” or “error”, is present (or missing).
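To try out a command before configuring a monitor, the two check modes can be approximated locally. The following is a minimal sketch, not Anturis code: the function names check_number and check_text are hypothetical, and the actual evaluation happens on the Anturis side.

```shell
#!/bin/sh
# Hypothetical local stand-ins for the two check modes; not part of Anturis.

# Numeric mode: succeed when VALUE lies within [MIN, MAX].
# Usage: check_number VALUE MIN MAX
check_number() {
    # awk copes with decimal values such as "0.42"; exit status 0 means "inside bounds".
    awk -v v="$1" -v lo="$2" -v hi="$3" \
        'BEGIN { exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0) }'
}

# Text mode: succeed when WORD occurs in TEXT (case-sensitive, fixed string).
# Usage: check_text TEXT WORD
check_text() {
    printf '%s\n' "$1" | grep -qF -- "$2"
}
```

For example, check_number "$(your command)" 0 5 succeeds only if the command printed a number between 0 and 5, and check_text "$(your command)" "ok" succeeds only if the output contains “ok”.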
Custom diagnostics
When a failure is detected, each Anturis monitor gathers additional information to provide problem context for faster troubleshooting. With the Custom Shell Command Monitor, you can define your own diagnostic actions.
This is done by collecting everything the command writes to the standard error stream (stderr) and attaching it to failed checks. So while standard output (stdout) determines success or failure (treated either as text or as a number), stderr serves as a container for extra diagnostic data. Several examples are given in the next section.
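The stdout/stderr split can be illustrated with a small sketch. The helper name count_entries is hypothetical and the directory is just an example; the point is only that the bare number goes to stdout while the supporting detail goes to stderr.

```shell
#!/bin/sh
# Sketch of the measurement/diagnostics split; count_entries is a hypothetical helper.
# Usage: count_entries DIRECTORY
count_entries() {
    listing=$(ls -A "$1")
    # stderr: the full listing, kept as context for a failed check.
    printf '%s\n' "$listing" >&2
    # stdout: the measurement itself, a bare number of entries.
    printf '%s\n' "$listing" | grep -c .
}
```

Running count_entries /var/spool/mail 2>/dev/null prints only the count, which is what a numeric check would evaluate; without the redirection the listing appears on stderr as well.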
Examples
Let’s use well-known Linux commands to build some examples of how Anturis’ Custom Shell Command Monitor can be used.
Are you renting a virtual machine? Then you probably wonder whether you get the processing power you pay for and whether your VM is occasionally migrated to a weaker machine. For example, you can measure the time it takes to calculate pi to high precision using the following command:
/usr/bin/time -f "%U" 2>&1 bash -c 'echo "scale=2000; a(1)*4" | bc -l > /dev/null'
Running this several times a day will keep you informed about significant performance degradations.
Worried about users running too many processes? Try the following command:
ps -U user -u user uf | tee /dev/stderr | wc -l
This not only counts the processes under the “user” account, but also collects and stores the list of those processes, so that it is available if the total exceeds the threshold. Note that the count includes the header line printed by ps.
To spot and troubleshoot I/O problems, you may want to watch the percentage of CPU time spent waiting on I/O (iowait), using the iostat tool:
(iostat -c 10 2 | tail -2 | head -n 1 | gawk '{print $4}' ; ps auxf | grep "D[^[]" >/dev/stderr; exit 0)
Custom monitoring configured with this command reports the %iowait value from iostat. When it rises above the threshold, the list of processes in uninterruptible sleep (“D” state in ps output) is collected and stored.
You may also plug in your favorite Nagios plugin, as in the following example, which triggers alerts if there are fewer than 10 days left until domain expiration:
./check_domain -d somedomainname -c 10
For this command you will need to configure the monitor to treat the command output as a string and to look for the word “OK”.
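Nagios plugins also report their result through the exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). If a plugin's text output is not convenient to match, a small wrapper can turn the exit code into a single status word on stdout and pass the plugin's own output to stderr as diagnostics. This is only a sketch: nagios_status is a hypothetical name, and whether the extra layer is worthwhile depends on the plugin.

```shell
#!/bin/sh
# Hypothetical wrapper: runs the command given as arguments and
# translates its Nagios-style exit code into a status word.
# Usage: nagios_status COMMAND [ARGS...]
nagios_status() {
    # Capture the plugin's combined output and its exit code.
    if output=$("$@" 2>&1); then code=0; else code=$?; fi
    case $code in
        0) word=OK ;;
        1) word=WARNING ;;
        2) word=CRITICAL ;;
        *) word=UNKNOWN ;;
    esac
    # stdout: a single status word for the text check.
    printf '%s\n' "$word"
    # stderr: the plugin's own output, kept as diagnostic context.
    printf '%s\n' "$output" >&2
}
```

With this in place, a monitor command such as nagios_status ./check_domain -d somedomainname -c 10 can still be checked for the word “OK”, since none of the other status words contains it.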