Monitoring notes

Don't say "monitoring", say "observability". It sounds better.

Tools

Examples:

  • Zabbix
  • Nagios
  • Check mk / Shinken
  • Nagstamon
  • CachetHQ
  • Riemann.io (monitoring & alerting)

Probes

Apache2
apachectl status || lynx localhost/server-status
SSL/TLS certificate expiry

Script:

openssl s_client -connect gnunet.org:443 </dev/null 2>/dev/null| openssl x509 -enddate -noout

Source: http://www.bortzmeyer.org/tester-expiration-certifs.html
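The one-liner only prints the expiry date. As a minimal sketch of how it could be turned into a Nagios-style probe: the function names days_left, tls_status and check_cert, and the 30/7-day default thresholds, are assumptions made for this example and do not come from the source article.

```shell
# Sketch only: function names and thresholds are illustrative assumptions.

# days_left END_EPOCH NOW_EPOCH : whole days remaining (truncated toward zero)
days_left() {
    echo $(( ($1 - $2) / 86400 ))
}

# tls_status DAYS WARN CRIT : print a status line, return the Nagios exit code
tls_status() {
    local days="$1" warn="$2" crit="$3"
    if [ "$days" -le "$crit" ]; then
        echo "CRITICAL - certificate expires in $days day(s)"; return 2
    elif [ "$days" -le "$warn" ]; then
        echo "WARNING - certificate expires in $days day(s)"; return 1
    fi
    echo "OK - certificate expires in $days day(s)"; return 0
}

# check_cert HOST:PORT [WARN [CRIT]] : needs network access, openssl and GNU date
check_cert() {
    local enddate
    enddate=$(openssl s_client -connect "$1" </dev/null 2>/dev/null |
        openssl x509 -enddate -noout 2>/dev/null | cut -d= -f2-)
    if [ -z "$enddate" ]; then
        echo "UNKNOWN - could not read a certificate from $1"; return 3
    fi
    tls_status "$(days_left "$(date -d "$enddate" +%s)" "$(date +%s)")" "${2:-30}" "${3:-7}"
}

# Usage: check_cert gnunet.org:443 30 7
```

Keeping the date arithmetic and the threshold comparison in separate functions lets them be tried without any network access.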

check-tls.sh

#!/bin/bash
# Author: Kim Minh Kaplan, 2010.
 
set -e
 
# The $statedir/check-tls.status file should contain one line per
# server to be checked:
#
#    <server>:<port> [last-epoch [CAfile [openssl-extra-args]]]
#
# For example:
#
#    www.example.com:443
#    www.example.org:443 0 /etc/certs/my-own-ca-bundle.pem
#    www.example.net:25 0 /etc/certs/my-own-ca-bundle.pem -starttls smtp
#
# LIMITATIONS/BUGS:
#
# * Requires OpenSSL
#
# * Probably only works on a GNU system (bash, coreutils).
#
# * Only check the expiration date of the certificate. Not its purpose,
#   identity, revocation or any other validity parameters.
#
# * Only check the expiration date of the server certificate but *not*
#   the expiration date of intermediate or root certificate
#
# * Empty lines in $statedir/check-tls.status are *not* ignored and
#   induce an error message "no port defined".
 
OPENSSL="openssl"
 
# Alert when fewer than 90, 60, 30, 15, 7, 6, 5, 4, 3, 2 or 1 day(s) remain.
alert=(90 60 30 15 7 6 5 4 3 2 1)
 
statedir=/var/tmp/lib/monitor
test -d "$statedir" || install -d "$statedir"
mkdir "$statedir/check-tls.lock" || exit
trap "rmdir \"$statedir/check-tls.lock\"" 0
 
nowepoch=`date +%s`
>"$statedir/check-tls.$$"
while read host_desc prevepoch ca_file openssl_args
do
    if test -z "$prevepoch"
    then
	prevepoch=0
    fi
 
    # Find expiry epoch
    tmpf=/tmp/$host_desc-$$.log
    if $OPENSSL s_client -CAfile "${ca_file:-/etc/ssl/certs/ca-certificates.crt}" $openssl_args \
	-connect $host_desc </dev/null >"$tmpf" 2>&1
    then
	if grep -q '^ *Verify return code: 0 (ok)$' "$tmpf"
	then
	    true
	else
	    echo "======================================================================" >&2
	    echo "Error verifying $host_desc" >&2
	    cat "$tmpf" >&2
	    rm -f "$tmpf"
	    echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
	    continue
	fi
	enddate=`$OPENSSL x509 -in "$tmpf" -noout -enddate | cut -f 2- -d =`
	rm -f "$tmpf"
    else
	cat "$tmpf" >&2
	rm -f "$tmpf"
	echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
	continue
    fi
    endepoch=`date -d "$enddate" +%s`
 
    if test $endepoch -le $nowepoch
    then
	echo "Alert: expired $host_desc" >&2
	prevepoch=$nowepoch
    else
	# Find the largest not yet triggered alert: it is the maximum that is still below prevspan
	prevspan=`expr \( $endepoch - $prevepoch \) / 60 / 60 / 24`
	nextalert=none
	for j in ${alert[@]}
	do
	    if test $j -lt $prevspan
	    then
		if test $nextalert = none
		then
		    nextalert=$j
		elif test $j -gt $nextalert
		then
		    nextalert=$j
		fi
	    fi
	done
	if test $nextalert = none
	then
	    echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
	    continue
	fi
 
	# Alert if necessary
	spanepoch=`expr $nextalert \* 60 \* 60 \* 24`
	if test `expr $endepoch - $nowepoch` -lt $spanepoch
	then
	    expire=`date -I -d @$endepoch`
	    echo "Alert, $host_desc expires $expire (less than $nextalert days)" >&2
	    prevepoch=$nowepoch
	fi
    fi
    echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
done <"$statedir/check-tls.status"
mv "$statedir/check-tls.$$" "$statedir/check-tls.status"
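The least obvious part of the script is how it picks the next alert threshold. Below is a sketch of that selection on its own, assuming a helper named next_alert (a hypothetical name, the script does this inline). prevspan is the number of days between the last alert (or epoch 0 if there has been none) and the expiry date.

```shell
# next_alert PREVSPAN [THRESHOLD...]
# Prints the largest threshold (in days) strictly below PREVSPAN,
# that is the next alert not yet triggered, or "none".
next_alert() {
    local prevspan="$1" best=none j
    shift
    for j in "$@"; do
        if [ "$j" -lt "$prevspan" ]; then
            if [ "$best" = none ] || [ "$j" -gt "$best" ]; then
                best=$j
            fi
        fi
    done
    echo "$best"
}

# Last alert was sent 45 days before expiry: the next one to watch for is 30 days.
next_alert 45 90 60 30 15 7 6 5 4 3 2 1
# Last alert was sent 1 day before expiry: nothing left to trigger.
next_alert 1 90 60 30 15 7 6 5 4 3 2 1
```

Because prevepoch is reset to the current time whenever an alert fires, each threshold is reported only once per certificate.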
Generic checks still to be written

Sensitive files:

  • /etc/passwd
  • /etc/shadow

Read-write partition: touch /.check

Date / time: ntpdate?

Pending updates

Service down

Alert before domain names expire

/etc/passwd: accounts with UID 0

dmesg

LDAP accounts
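Two of these ideas (accounts with UID 0 and the read-write test) are simple enough to sketch. The function names uid0_accounts and check_rw are assumptions, and /tmp replaces / in the demo so that it runs without root.

```shell
# uid0_accounts [PASSWD_FILE] : list accounts other than root whose UID is 0
uid0_accounts() {
    awk -F: '$3 == 0 && $1 != "root" { print $1 }' "${1:-/etc/passwd}"
}

# check_rw DIR : succeed only if a file can be created and removed in DIR
check_rw() {
    local probe="$1/.check.$$"
    touch "$probe" 2>/dev/null && rm -f "$probe"
}

extra=$(uid0_accounts)
if [ -n "$extra" ]; then
    echo "CRITICAL - unexpected UID 0 account(s): $extra"
else
    echo "OK - no UID 0 account besides root"
fi

if check_rw /tmp; then
    echo "OK - /tmp is writable"
else
    echo "CRITICAL - /tmp is not writable"
fi
```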

2025/03/24 15:06

Change the root password from a script on RedHat / CentOS

echo 'root:P@ssw0rd' |chpasswd
#echo "user:P@ssw0rd" |chpasswd -c SHA512
echo "password" | passwd hacluster --stdin

Warning: this is not secure. The password sits in clear text in the script and may end up in the shell history.

Alternatives

read -s PASS
 
# Or
set +o history
export PASS=P@ssw0rd
set -o history
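A sketch combining read -s with chpasswd, so that the password is neither written in the script nor typed on a command line. The names chpasswd_line and set_password are hypothetical, and bash is assumed (read -s is not POSIX).

```shell
# chpasswd_line USER PASSWORD : format one "user:password" line for chpasswd
chpasswd_line() {
    printf '%s:%s\n' "$1" "$2"
}

# set_password USER : prompt without echo, then pipe the pair to chpasswd (run as root)
set_password() {
    local pass
    printf 'New password for %s: ' "$1" >&2
    read -r -s pass
    echo >&2
    chpasswd_line "$1" "$pass" | chpasswd
    unset pass
}

# Usage (interactive, as root):
#   set_password root
```

printf is a shell builtin in bash, so the password is not exposed as an argument of an external process.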
2025/03/24 15:06

Nagios monitoring notes

Administration

Clear the history of the data reported by the Nagios probes
/etc/init.d/nagios stop
rm /usr/local/nagios/var/retention.dat
rm /usr/local/nagios/var/objects.cache
/etc/init.d/nagios start

Instead of deleting these files every time before starting Nagios, you can change:

nagios.cfg

#retain_state_information=1
retain_state_information=0
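The same change can be scripted. This is a minimal sketch assuming GNU sed and a helper named set_option (hypothetical). It only touches an uncommented "key=value" line and appends one if none exists.

```shell
# set_option FILE KEY VALUE : set "KEY=VALUE" in a key=value configuration file
set_option() {
    local file="$1" key="$2" value="$3"
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >>"$file"
    fi
}

# Usage (the path is an assumption based on the /usr/local/nagios layout used here;
# restart Nagios afterwards):
#   set_option /usr/local/nagios/etc/nagios.cfg retain_state_information 0
```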

Configuration

Sample configuration

Example with check_snmp_mem_cpu.sh

/usr/local/nagios/etc/objects/servers.cfg

define service {
        service_description     Memory
        hostgroup_name          WEB_APP1
        check_command           check_snmp_mem_cpu!mem!80!90
        max_check_attempts      1
        normal_check_interval   1
        retry_check_interval    1
        check_period            24x7
        notification_interval   2000
        notification_period     24x7
        notification_options    w,c,r
        contact_groups          support
        #event_handler           trigger_memory
        }

/usr/local/nagios/etc/objects/commands.cfg

define command {
        command_name    check_snmp_mem_cpu
        command_line    $USER1$/check_snmp_mem.sh -H $HOSTADDRESS$ -t $ARG1$ -w $ARG2$ -c $ARG3$
        }
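In check_command, the fields separated by "!" are the command name followed by the values Nagios substitutes for $ARG1$, $ARG2$, and so on in command_line. The sketch below only illustrates that mapping; expand_args is a hypothetical helper, not something Nagios provides.

```shell
# expand_args 'name!arg1!arg2!...' : print the command name and each $ARGn$ value
expand_args() {
    local IFS='!' n=0 part
    set -f                      # no globbing while splitting on "!"
    for part in $1; do
        if [ "$n" -eq 0 ]; then
            echo "command_name=$part"
        else
            echo "\$ARG${n}\$=$part"
        fi
        n=$((n + 1))
    done
    set +f
}

expand_args 'check_snmp_mem_cpu!mem!80!90'
```

With the service above, this gives mem for $ARG1$ (the -t option), 80 for $ARG2$ (warning) and 90 for $ARG3$ (critical).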
Monitoring services with no real host attached

A service must always be attached to a host in order to be used.

In some cases you therefore have to create a phantom host to carry the service.

Dummy

commands.cfg

# 'check_dummy' command definition
# NOTE: This command always returns an 'OK' result no matter what.
define command {
        command_name    check_dummy
        command_line    $USER1$/check_dummy 0
}

remotes.cfg

define host {
        host_name           generic
        use                 generic-host
        check_command       check_dummy!0     ; Always returns OK
        max_check_attempts  1
        contact_groups      admins
}
 
define service {
        service_description plop
        use                 generic-service
        host_name           generic
        check_command       check_plop!80
}
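To show what such a dummy check does, here is a shell stand-in that returns whatever state it is given. It is a sketch under that assumption, not the code of the real check_dummy plugin, and dummy_check is a hypothetical name.

```shell
# dummy_check STATE [TEXT] : print a status line and return STATE as the exit code
dummy_check() {
    local state="${1:-0}" text="${2:-}"
    case $state in
        0) echo "OK${text:+: $text}" ;;
        1) echo "WARNING${text:+: $text}" ;;
        2) echo "CRITICAL${text:+: $text}" ;;
        *) echo "UNKNOWN${text:+: $text}"; state=3 ;;
    esac
    return "$state"
}

dummy_check 0 "phantom host, always up"
```

Nagios only looks at the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the first line of output, so a host checked this way is always considered up.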
Example host / hostgroup / service configuration
define host {
    use         physical-host
    host_name   busy-host.example.com
    alias       busy-host.example.com
    address     10.43.16.1
    hostgroups  linux,centos,ldap,http,busy
}
 
define host {
    use           physical-host
    host_name     normal-host.example.com
    alias         normal-host.example.com
    address       10.43.1.1
    hostgroups    linux,centos,dns,proxy,ldap,hp,http,puppetmaster
}
 
define service {
    use                   generic-service
    hostgroup_name        linux,!busy
    service_description   Load
    check_command         check_snmp_load
}
 
define service {
    use                   generic-service
    hostgroup_name        busy
    service_description   Load
    check_command         check_snmp_load_busy
}
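The "linux,!busy" syntax means "every host of the linux group except those that are also in busy". Below is a sketch of that rule, assuming hypothetical helpers in_list and service_applies that work on comma-separated lists; Nagios itself resolves this when it loads the configuration.

```shell
# in_list ITEM 'a,b,c' : succeed if ITEM is one of the comma-separated entries
in_list() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# service_applies SPEC HOST_GROUPS
# Succeeds when the host is in at least one included group and in no excluded one.
service_applies() {
    local spec="$1" host_groups="$2" g included=1 IFS=','
    set -f
    for g in $spec; do
        case $g in
            '!'*) if in_list "${g#?}" "$host_groups"; then set +f; return 1; fi ;;
            *)    if in_list "$g" "$host_groups"; then included=0; fi ;;
        esac
    done
    set +f
    return "$included"
}

if service_applies 'linux,!busy' 'linux,centos,ldap,http,busy'; then
    echo "busy-host: standard Load check"
else
    echo "busy-host: excluded from the standard Load check"
fi
```

With the two hosts above, busy-host only gets check_snmp_load_busy and normal-host only gets check_snmp_load.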
Host configuration
Service configuration

etc/objects/servers.cfg

define service {
    use generic-service
    hostgroup_name linux-remotes-servers
    service_description  Total Processes
    max_check_attempts 3         ; Re-check the service up to 3 times in order to determine its final (hard) state
    retry_check_interval 1       ; Re-check the service every minute until a hard state can be determined
    check_command check_snmp_host!procs!400!900
    flap_detection_enabled 0
}

Exclusion

define service {
        service_description     CPU Stats
        servicegroups   sysres
        use             generic
        hostgroup_name  linux
        host_name       !server1
        check_command   check_iostat
}
2025/03/24 15:06

Munin monitoring notes

Install Munin

Notes

Munin:

  • Connects to munin-node on TCP port 4949
  • Generates PNG graphs and HTML pages in /var/cache/munin/www/

Munin-node:

  • Monitoring agent
  • Listens on TCP port 4949

munin-node-c and munin-plugins-c:

  • C implementation of munin-node and of the plugins
  • Fewer features
  • Lighter and faster to run
  • Uses inetd (no daemon)
$ nc localhost 4949
# munin node at vcigne-1
help
# Unknown command. Try cap, list, nodes, config, fetch, version or quit
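The same text protocol can be scripted. In this sketch, munin_fetch and munin_value are hypothetical helper names. A fetch reply is a series of "field.value N" lines ending with a line containing a single dot.

```shell
# munin_value FIELD : read a munin-node "fetch" reply on stdin, print FIELD's value
# (exit status 1 if the field is absent)
munin_value() {
    awk -v key="$1.value" '$1 == key { print $2; found = 1 } END { exit !found }'
}

# munin_fetch PLUGIN [HOST [PORT]] : needs nc and a reachable munin-node
munin_fetch() {
    printf 'fetch %s\nquit\n' "$1" | nc "${2:-localhost}" "${3:-4949}"
}

# Usage: munin_fetch load | munin_value load
```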

FIXME

On the zzzzzzzz

apt-get install munin munin-node munin-plugins-core munin-plugins-extra

/etc/munin/munin.conf

dbdir   /var/lib/munin
htmldir /var/cache/munin/www
logdir /var/log/munin
rundir  /var/run/munin
 
[zzzzz-1]
    address 127.0.0.1
    use_node_name yes
 
[zzzzz-1-01]
    address 10.0.1.1
    use_node_name yes
 
[zzzzz-1-02]
    address 10.0.1.3
    use_node_name yes

munin-node (the monitoring agent) does not start because the HOSTNAME contains underscores
Solution

/etc/munin/munin-node.conf

#host_name localhost.localdomain
host_name vcigne-1

FIXME

On the zzzzzzzzzz

apt-get install munin-node-c munin-plugins-c
# /usr/lib/munin-c/plugins/munin-plugins-c listplugins
cpu
entropy
forks
fw_packets
interrupts
load
open_files
open_inodes
swap
threads
uptime
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/cpu
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/entropy
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/forks
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/fw_packets
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/interrupts
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/load
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/open_files
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/open_inodes
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/swap
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/threads
ln -s /usr/lib/munin-c/plugins/munin-plugins-c /etc/munin/plugins/uptime
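The eleven ln commands can be generated from the listplugins output. This is a sketch assuming a helper named link_plugins (hypothetical). It uses ln -sf, so unlike plain ln -s it replaces an existing link.

```shell
# link_plugins BINARY DEST_DIR : read plugin names on stdin (one per line)
# and create DEST_DIR/<name> -> BINARY for each of them
link_plugins() {
    local bin="$1" dest="$2" name
    while read -r name; do
        [ -n "$name" ] || continue      # skip empty lines
        ln -sf "$bin" "$dest/$name"
    done
}

# Usage (as root, same paths as above):
#   bin=/usr/lib/munin-c/plugins/munin-plugins-c
#   "$bin" listplugins | link_plugins "$bin" /etc/munin/plugins
```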

When installing on Debian, the following line is added automatically

/etc/inetd.conf

#:OTHER: Other services
4949 stream tcp nowait nobody /usr/sbin/munin-node-c /usr/sbin/munin-node-c
2025/03/24 15:06

CPU usage monitoring notes

What to monitor

  • Total number of processes
  • Total number of threads: ps -efL |wc -l
  • Load average
  • IOWAIT

IOWAIT

See: https://kb.vander.host/operating-systems/how-to-monitor-disk-performance-iowait-on-linux/

top
sar
iostat -d 2 %iowait
iostat -c 5 100
snmpget -Oqv -v3 localhost .1.3.6.1.4.1.2021.11.54.0
./centreon_plugins.pl --plugin=os::linux::snmp::plugin --hostname=localhost --snmp-version=3 --snmp-username "nagios" --authprotocol MD5 --authpassphrase "P@ssw0rd" --mode cpu-detailed --warning-wait=15 --critical-wait=25
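Without sysstat or SNMP, the iowait percentage can also be derived from two readings of the first line of /proc/stat on Linux. This is a minimal sketch assuming a helper named iowait_pct (hypothetical). The columns after "cpu" are user, nice, system, idle, iowait, irq, softirq, steal, so iowait is the sixth field of the line.

```shell
# iowait_pct 'cpu ...' 'cpu ...' : integer %iowait between two "cpu" lines of /proc/stat
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0; for (i = 2; i <= 9 && i <= NF; i++) total += $i }
        NR == 1 { t0 = total; w0 = $6 }
        NR == 2 {
            dt = total - t0
            if (dt > 0) pct = int(100 * ($6 - w0) / dt); else pct = 0
            print pct
        }'
}

# Live usage on Linux:
#   s1=$(head -n 1 /proc/stat); sleep 5; s2=$(head -n 1 /proc/stat)
#   echo "iowait=$(iowait_pct "$s1" "$s2")% load1=$(cut -d' ' -f1 /proc/loadavg)"
```

The guest columns are left out of the total because the kernel already counts them in user and nice.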

Script check_cpu_stats.sh

Source: https://github.com/Napsty/check_cpu_stats/blob/main/check_cpu_stats.sh

check_cpu_stats.sh

#!/bin/bash
# ==============================================================================
# CPU Utilization Statistics plugin for Nagios 
#
# Original author:  Steve Bosek
# Creation date:    8 September 2007
# Description:      Monitoring plugin (script) to check cpu utilization statistics.
#                   This script has been designed and written on Unix platforms
#                   requiring iostat as external program.
#                   The script is used to query 6 of the key cpu statistics
#                   (user,system,iowait,steal,nice,idle) at the same time.
# History/Changes:  HISTORY moved out of plugin into Git repository / README.md
# License:          GNU General Public License v3.0 (GPL3), see LICENSE in Git repository
#
# Copyright 2007-2009,2011 Steve Bosek
# Copyright 2008 Bas van der Doorn
# Copyright 2008 Philipp Lemke
# Copyright 2016 Philipp Dallig
# Copyright 2022-2023 Claudio Kuenzler
#
# Usage:   ./check_cpu_stats.sh [-w <user,system,iowait>] [-c <user,system,iowait>] ( [-i <report interval>] [-n <report number> ] [-b <N,processname>])
#
# Example: ./check_cpu_stats.sh
#          ./check_cpu_stats.sh -w 70,40,30 -c 90,60,40
#          ./check_cpu_stats.sh -w 70,40,30 -c 90,60,40 -i 3 -n 5 -b '1,apache2' -b '1,running process'
# ========================================================================================
# -----------------------------------------------------------------------------------------
# Plugin description
PROGNAME=$(basename $0)
RELEASE="Revision 3.1.5"
 
# Paths to commands used in this script.  These may have to be modified to match your system setup.
export PATH=$PATH:/usr/local/bin:/usr/bin:/bin # Set path
IOSTAT="iostat"
#Needed for HP-UX
SAR="/usr/bin/sar"
 
# Nagios return codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
 
# Plugin default parameters value if not defined
LIST_WARNING_THRESHOLD=${LIST_WARNING_THRESHOLD:="70,40,30"}
LIST_CRITICAL_THRESHOLD=${LIST_CRITICAL_THRESHOLD:="90,60,40"}
INTERVAL_SEC=${INTERVAL_SEC:="1"}
NUM_REPORT=${NUM_REPORT:="3"}
# -----------------------------------------------------------------------------------------
# Check required commands
if [ `uname` = "HP-UX" ];then
  if [ ! -x $SAR ]; then
    echo "UNKNOWN: sar not found or is not executable by the nagios user."
    exit $STATE_UNKNOWN
  fi
else
  for cmd in iostat; do
  if ! `command -v ${cmd} >/dev/null 2>&1`; then
    echo "UNKNOWN: ${cmd} does not exist, please check if command exists and PATH is correct"
    exit ${STATE_UNKNOWN}
  fi
done
fi
# -----------------------------------------------------------------------------------------
# Functions plugin usage
print_release() {
  echo "$RELEASE"
  exit ${STATE_UNKNOWN}
}
 
print_usage() {
  echo ""
  echo "$PROGNAME $RELEASE - Monitoring plugin to check CPU Utilization"
  echo ""
  echo "Usage: check_cpu_stats.sh [-w] [-c] [-i] [-n] [-b]+"
  echo ""
  echo "  -w  Warning threshold in % for warn_user,warn_system,warn_iowait CPU (default : 70,40,30)"
  echo "  -c  Critical threshold in % for crit_user,crit_system,crit_iowait CPU (default : 90,60,40)"
  echo "  -i  Interval in seconds for iostat (default : 1)"
  echo "  -n  Number of reports for iostat (default : 3)"
  echo "  -b  The plugin will exit OK when condition matches (number of CPUs and process running), expects an input of N,process (e.g. 4,apache2). Can be used multiple times: -b 1,puppet -b 4,apache2 -b 4,containerd. Works only under Linux."
  echo "  -v  Show version"
  echo "  -h  Show this page"
  echo ""
  echo "Usage: $PROGNAME"
  echo "Usage: $PROGNAME --help"
  echo ""
  exit 0
}
 
print_help() {
  print_usage
    echo ""
    echo "This plugin will check cpu utilization (user,system,iowait,idle in %)"
    echo ""
  exit 0
}
# -----------------------------------------------------------------------------------------
# Parse parameters
if [ "${1}" = "--help" ]; then print_help; exit $STATE_UNKNOWN; fi
 
while getopts "c:w:i:n:b:hv" Input
do
  case ${Input} in
  w)      LIST_WARNING_THRESHOLD=${OPTARG};;
  c)      LIST_CRITICAL_THRESHOLD=${OPTARG};;
  i)      INTERVAL_SEC=${OPTARG};;
  n)      NUM_REPORT=${OPTARG};;
  b)      BAIL+=("${OPTARG}");;
  h)      print_help;;
  v)      print_release;;
  *)      print_help;;
  esac
done
# -----------------------------------------------------------------------------------------
# List to Table for warning threshold
TAB_WARNING_THRESHOLD=( `echo $LIST_WARNING_THRESHOLD | sed 's/,/ /g'` )
if [ "${#TAB_WARNING_THRESHOLD[@]}" -ne "3" ]; then
  echo "ERROR : Bad count parameter in Warning Threshold"
  exit $STATE_WARNING
else  
USER_WARNING_THRESHOLD=`echo ${TAB_WARNING_THRESHOLD[0]}`
SYSTEM_WARNING_THRESHOLD=`echo ${TAB_WARNING_THRESHOLD[1]}`
IOWAIT_WARNING_THRESHOLD=`echo ${TAB_WARNING_THRESHOLD[2]}` 
fi
 
# List to Table for critical threshold
TAB_CRITICAL_THRESHOLD=( `echo $LIST_CRITICAL_THRESHOLD | sed 's/,/ /g'` )
if [ "${#TAB_CRITICAL_THRESHOLD[@]}" -ne "3" ]; then
  echo "ERROR : Bad count parameter in CRITICAL Threshold"
  exit $STATE_WARNING
else 
USER_CRITICAL_THRESHOLD=`echo ${TAB_CRITICAL_THRESHOLD[0]}`
SYSTEM_CRITICAL_THRESHOLD=`echo ${TAB_CRITICAL_THRESHOLD[1]}`
IOWAIT_CRITICAL_THRESHOLD=`echo ${TAB_CRITICAL_THRESHOLD[2]}`
fi
 
if [ ${TAB_WARNING_THRESHOLD[0]} -ge ${TAB_CRITICAL_THRESHOLD[0]} -o ${TAB_WARNING_THRESHOLD[1]} -ge ${TAB_CRITICAL_THRESHOLD[1]} -o ${TAB_WARNING_THRESHOLD[2]} -ge ${TAB_CRITICAL_THRESHOLD[2]} ]; then
  echo "ERROR : Critical CPU Threshold lower as Warning CPU Threshold "
  exit $STATE_WARNING
fi 
# -----------------------------------------------------------------------------------------
# CPU Utilization Statistics Unix Plateform ( Linux,AIX,Solaris are supported )
case `uname` in
  Linux )
      CPU_REPORT=`iostat -c $INTERVAL_SEC $NUM_REPORT | sed -e 's/,/./g' | tr -s ' ' ';' | sed '/^$/d' | tail -1`
      CPU_REPORT_SECTIONS=`echo ${CPU_REPORT} | grep ';' -o | wc -l`
      CPU_USER=`echo $CPU_REPORT | cut -d ";" -f 2`
      CPU_NICE=`echo $CPU_REPORT | cut -d ";" -f 3`
      CPU_SYSTEM=`echo $CPU_REPORT | cut -d ";" -f 4`
      CPU_IOWAIT=`echo $CPU_REPORT | cut -d ";" -f 5`
      if [ ${CPU_REPORT_SECTIONS} -ge 6 ]; then
      CPU_STEAL=`echo $CPU_REPORT | cut -d ";" -f 6`
      CPU_IDLE=`echo $CPU_REPORT | cut -d ";" -f 7`
      NAGIOS_DATA="user=${CPU_USER}% system=${CPU_SYSTEM}%, iowait=${CPU_IOWAIT}%, idle=${CPU_IDLE}%, nice=${CPU_NICE}%, steal=${CPU_STEAL}% | CpuUser=${CPU_USER}%;${TAB_WARNING_THRESHOLD[0]};${TAB_CRITICAL_THRESHOLD[0]};0; CpuSystem=${CPU_SYSTEM}%;${TAB_WARNING_THRESHOLD[1]};${TAB_CRITICAL_THRESHOLD[1]};0; CpuIowait=${CPU_IOWAIT}%;${TAB_WARNING_THRESHOLD[2]};${TAB_CRITICAL_THRESHOLD[2]};0; CpuIdle=${CPU_IDLE}%;0;0;0; CpuNice=${CPU_NICE}%;0;0;0; CpuSteal=${CPU_STEAL}%;0;0;0;"
      else
      CPU_IDLE=`echo $CPU_REPORT | cut -d ";" -f 6`
      NAGIOS_DATA="user=${CPU_USER}% system=${CPU_SYSTEM}%, iowait=${CPU_IOWAIT}%, idle=${CPU_IDLE}%, nice=${CPU_NICE}%, steal=0.00% | CpuUser=${CPU_USER}%;${TAB_WARNING_THRESHOLD[0]};${TAB_CRITICAL_THRESHOLD[0]};0; CpuSystem=${CPU_SYSTEM}%;${TAB_WARNING_THRESHOLD[1]};${TAB_CRITICAL_THRESHOLD[1]};0; CpuIowait=${CPU_IOWAIT}%;${TAB_WARNING_THRESHOLD[2]};${TAB_CRITICAL_THRESHOLD[2]};0; CpuIdle=${CPU_IDLE}%;0;0;0; CpuNice=${CPU_NICE}%;0;0;0; CpuSteal=0.0%;0;0;0;"
      fi
 
      # Bail out possible under certain situations
      if [[ ${#BAIL[*]} -gt 0 ]]; then
        BC_CPU=$(nproc)
        o=0
	while [ ${o} -lt ${#BAIL[*]} ]; do
          BAIL_CPU[${o}]=$(echo "${BAIL[${o}]}" | awk -F',' '{print $1}')
          BAIL_PROCESS[${o}]=$(echo "${BAIL[${o}]}" | awk -F',' '{print $2}')
          BC_PROCESS=$(ps aux | grep "${BAIL_PROCESS[${o}]}" | egrep -v "(grep|check_cpu_stats)" | awk '{print $2}')
          if [[ ${BAIL_CPU[${o}]} -eq ${BC_CPU} && ${BC_PROCESS} -gt 0 ]]; then
            echo "CPU STATISTICS OK - bailing out because of matched bailout patterns - ${NAGIOS_DATA}"
            exit $STATE_OK
          fi
          let o++
        done
      fi
 
      ;;
  AIX ) CPU_REPORT=`iostat -t $INTERVAL_SEC $NUM_REPORT | sed -e 's/,/./g'|tr -s ' ' ';' | tail -1`
      CPU_USER=`echo $CPU_REPORT | cut -d ";" -f 4`
      CPU_SYSTEM=`echo $CPU_REPORT | cut -d ";" -f 5`
      CPU_IOWAIT=`echo $CPU_REPORT | cut -d ";" -f 7`
      CPU_IDLE=`echo $CPU_REPORT | cut -d ";" -f 6`
      NAGIOS_DATA="user=${CPU_USER}% system=${CPU_SYSTEM}%, iowait=${CPU_IOWAIT}%, idle=${CPU_IDLE}%, nice=0.00%, steal=0.00% | CpuUser=${CPU_USER}%;${TAB_WARNING_THRESHOLD[0]};${TAB_CRITICAL_THRESHOLD[0]};0; CpuSystem=${CPU_SYSTEM}%;${TAB_WARNING_THRESHOLD[1]};${TAB_CRITICAL_THRESHOLD[1]};0; CpuIowait=${CPU_IOWAIT}%;${TAB_WARNING_THRESHOLD[2]};${TAB_CRITICAL_THRESHOLD[2]};0; CpuIdle=${CPU_IDLE}%;0;0;0; CpuNice=0.0%;0;0;0; CpuSteal=0.0%;0;0;0;"
            ;;
  SunOS ) CPU_REPORT=`iostat -c $INTERVAL_SEC $NUM_REPORT | tail -1`
          CPU_USER=`echo $CPU_REPORT | awk '{ print $1 }'`
          CPU_SYSTEM=`echo $CPU_REPORT | awk '{ print $2 }'`
          CPU_IOWAIT=`echo $CPU_REPORT | awk '{ print $3 }'`
          CPU_IDLE=`echo $CPU_REPORT | awk '{ print $4 }'`
          NAGIOS_DATA="user=${CPU_USER}% system=${CPU_SYSTEM}%, iowait=${CPU_IOWAIT}%, idle=${CPU_IDLE}%, nice=0.00%, steal=0.00% | CpuUser=${CPU_USER}%;${TAB_WARNING_THRESHOLD[0]};${TAB_CRITICAL_THRESHOLD[0]};0; CpuSystem=${CPU_SYSTEM}%;${TAB_WARNING_THRESHOLD[1]};${TAB_CRITICAL_THRESHOLD[1]};0; CpuIowait=${CPU_IOWAIT}%;${TAB_WARNING_THRESHOLD[2]};${TAB_CRITICAL_THRESHOLD[2]};0; CpuIdle=${CPU_IDLE}%;0;0;0; CpuNice=0.0%;0;0;0; CpuSteal=0.0%;0;0;0;"
          ;;
  HP-UX) CPU_REPORT=`$SAR $INTERVAL_SEC $NUM_REPORT | grep Average`
          CPU_USER=`echo $CPU_REPORT | awk '{ print $2 }'`
          CPU_SYSTEM=`echo $CPU_REPORT | awk '{ print $3 }'`
          CPU_IOWAIT=`echo $CPU_REPORT | awk '{ print $4 }'`
          CPU_IDLE=`echo $CPU_REPORT | awk '{ print $5 }'`
          NAGIOS_DATA="user=${CPU_USER}% system=${CPU_SYSTEM}% iowait=${CPU_IOWAIT}% idle=${CPU_IDLE}% nice=0.00% steal=0.00% | CpuUser=${CPU_USER}%;${TAB_WARNING_THRESHOLD[0]};${TAB_CRITICAL_THRESHOLD[0]};0; CpuSystem=${CPU_SYSTEM}%;${TAB_WARNING_THRESHOLD[1]};${TAB_CRITICAL_THRESHOLD[1]};0; CpuIowait=${CPU_IOWAIT};${TAB_WARNING_THRESHOLD[2]};${TAB_CRITICAL_THRESHOLD[2]};0; CpuIdle=${CPU_IDLE}%;0;0;0; CpuNice=0.0%;0;0;0; CpuSteal=0.0%;0;0;0;"
          ;;  
  #  MacOS X test       
  # Darwin ) CPU_REPORT=`iostat -w $INTERVAL_SEC -c $NUM_REPORT | tail -1`
    #   CPU_USER=`echo $CPU_REPORT | awk '{ print $4 }'`
    #   CPU_SYSTEM=`echo $CPU_REPORT | awk '{ print $5 }'`
    #   CPU_IDLE=`echo $CPU_REPORT | awk '{ print $6 }'`
    #   NAGIOS_DATA="user=${CPU_USER}% system=${CPU_SYSTEM}% iowait=0.00% idle=${CPU_IDLE}% nice=0.00% steal=0.00% | CpuUser=${CPU_USER}%;${TAB_WARNING_THRESHOLD[0]};${TAB_CRITICAL_THRESHOLD[0]};0; CpuSystem=${CPU_SYSTEM}%;${TAB_WARNING_THRESHOLD[1]};${TAB_CRITICAL_THRESHOLD[1]};0; CpuIowait=0.0%;0;0;0; CpuIdle=${CPU_IDLE}%;0;0;0; CpuNice=0.0%;0;0;0; CpuSteal=0.0%;0;0;0;"
    #   ;;
  *)  echo "UNKNOWN: `uname` not yet supported by this plugin. Coming soon !"
      exit $STATE_UNKNOWN 
      ;;
esac
# -----------------------------------------------------------------------------------------
# Add for integer shell issue
CPU_USER_MAJOR=`echo $CPU_USER| cut -d "." -f 1`
CPU_SYSTEM_MAJOR=`echo $CPU_SYSTEM | cut -d "." -f 1`
CPU_IOWAIT_MAJOR=`echo $CPU_IOWAIT | cut -d "." -f 1`
CPU_IDLE_MAJOR=`echo $CPU_IDLE | cut -d "." -f 1`
# -----------------------------------------------------------------------------------------
# Return
if [ ${CPU_USER_MAJOR} -ge $USER_CRITICAL_THRESHOLD ]; then
    echo "CPU STATISTICS CRITICAL : ${NAGIOS_DATA}"
    exit $STATE_CRITICAL
    elif [ ${CPU_SYSTEM_MAJOR} -ge $SYSTEM_CRITICAL_THRESHOLD ]; then
    echo "CPU STATISTICS CRITICAL : ${NAGIOS_DATA}"
    exit $STATE_CRITICAL
    elif [ ${CPU_IOWAIT_MAJOR} -ge $IOWAIT_CRITICAL_THRESHOLD ]; then
    echo "CPU STATISTICS CRITICAL : ${NAGIOS_DATA}"
    exit $STATE_CRITICAL
    elif [ ${CPU_USER_MAJOR} -ge $USER_WARNING_THRESHOLD ] && [ ${CPU_USER_MAJOR} -lt $USER_CRITICAL_THRESHOLD ]; then
    echo "CPU STATISTICS WARNING : ${NAGIOS_DATA}"
    exit $STATE_WARNING 
    elif [ ${CPU_SYSTEM_MAJOR} -ge $SYSTEM_WARNING_THRESHOLD ] && [ ${CPU_SYSTEM_MAJOR} -lt $SYSTEM_CRITICAL_THRESHOLD ]; then
    echo "CPU STATISTICS WARNING : ${NAGIOS_DATA}"
    exit $STATE_WARNING 
    elif  [ ${CPU_IOWAIT_MAJOR} -ge $IOWAIT_WARNING_THRESHOLD ] && [ ${CPU_IOWAIT_MAJOR} -lt $IOWAIT_CRITICAL_THRESHOLD ]; then
    echo "CPU STATISTICS WARNING : ${NAGIOS_DATA}"
    exit $STATE_WARNING   
else
    echo "CPU STATISTICS OK : ${NAGIOS_DATA}"
    exit $STATE_OK
fi
 
echo "CPU STATISTICS UNKNOWN: Should never reach this."
exit $STATE_UNKNOWN
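The plugin prints its performance data after the "|" as label=value;warn;crit;min; entries. Below is a sketch for pulling one value out of that output, for quick tests outside Nagios. perf_value is a hypothetical helper and the sample line uses made-up figures.

```shell
# perf_value 'PLUGIN OUTPUT' LABEL : print the value recorded for LABEL in the perfdata
perf_value() {
    echo "${1#*|}" | tr ' ' '\n' | awk -F'[=;]' -v label="$2" '$1 == label { print $2 }'
}

# Made-up sample in the format produced by the Linux branch of the script
sample='CPU STATISTICS OK : user=2.51% system=1.00%, iowait=0.25%, idle=96.24%, nice=0.00%, steal=0.00% | CpuUser=2.51%;70;90;0; CpuSystem=1.00%;40;60;0; CpuIowait=0.25%;30;40;0; CpuIdle=96.24%;0;0;0; CpuNice=0.00%;0;0;0; CpuSteal=0.00%;0;0;0;'
perf_value "$sample" CpuIowait

# Usage against the real plugin:
#   perf_value "$(./check_cpu_stats.sh -w 70,40,30 -c 90,60,40)" CpuIowait
```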
2025/03/24 15:06