{{tag>Supervision}}
= Monitoring Notes
Don't say **monitoring**, say **observability**. It sounds better.
== Tools
Examples:
* Zabbix
* Nagios
* Check_MK / Shinken
* Nagstamon
* [[https://sensuapp.org/|Sensu]]
* CachetHQ
* Riemann.io (monitoring & alerting)
== Probes
=== Apache2
apachectl status || lynx localhost/server-status
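A scripted variant of this check could read the machine-readable status page instead of the HTML one. The following is only a minimal sketch: it assumes mod_status is enabled and reachable at `http://localhost/server-status?auto`, and `busy_workers` is a hypothetical helper name, not part of any existing tool.

```shell
#!/bin/bash
# Sketch: extract the BusyWorkers value from mod_status "?auto" output.
# busy_workers is a hypothetical helper; it reads the status text on stdin.
busy_workers() {
    awk -F': ' '$1 == "BusyWorkers" { print $2 }'
}

# Example use against a live server (assumes curl is installed and
# mod_status is enabled on localhost):
#   curl -fsS 'http://localhost/server-status?auto' | busy_workers
```

The value can then be compared with a threshold of your choice to raise an alert.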
=== SSL/TLS certificate expiration
See:
* http://prefetch.net/articles/checkcertificate.html
Script:
* http://prefetch.net/code/ssl-cert-check
openssl s_client -connect gnunet.org:443 </dev/null 2>/dev/null | openssl x509 -enddate -noout
Source: http://www.bortzmeyer.org/tester-expiration-certifs.html
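The end date printed by `openssl x509 -enddate` can be turned into a number of remaining days. This is a minimal sketch, assuming GNU `date` (for `-d`); `days_left` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: whole days between a reference time and a certificate's notAfter date.
# days_left is a hypothetical helper; it requires GNU date for the -d option.
# Usage: days_left "<enddate string>" [now-epoch]
days_left() {
    local enddate=$1 now=${2:-$(date +%s)}
    local end
    end=$(date -d "$enddate" +%s) || return 1
    # Negative result means the certificate has already expired.
    echo $(( (end - now) / 86400 ))
}

# Example against a live server (needs network access):
#   enddate=$(openssl s_client -connect gnunet.org:443 </dev/null 2>/dev/null \
#       | openssl x509 -enddate -noout | cut -d= -f2-)
#   days_left "$enddate"
```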
''check-tls.sh''
#!/bin/bash
# Author: Kim Minh Kaplan, 2010.
set -e
# The $statedir/check-tls.status file should contain one line per
# server to be checked:
#
# <host>:<port> [last-epoch [CAfile [openssl-extra-args]]]
#
# For example:
#
# www.example.com:443
# www.example.org:443 0 /etc/certs/my-own-ca-bundle.pem
# www.example.net:25 0 /etc/certs/my-own-ca-bundle.pem -starttls smtp
#
# LIMITATIONS/BUGS:
#
# * Requires OpenSSL
#
# * Probably only works on a GNU system (bash, coreutils).
#
# * Only check the expiration date of the certificate. Not its purpose,
# identity, revocation or any other validity parameters.
#
# * Only check the expiration date of the server certificate but *not*
# the expiration date of intermediate or root certificate
#
# * Empty lines in $statedir/check-tls.status are *not* ignored and
# induce an error message "no port defined".
OPENSSL="openssl"
# Alerts at less than 90, 60, 30, 15, 7, 6, 5, 4, 3, 2, 1 day(s) before expiration.
alert=(90 60 30 15 7 6 5 4 3 2 1)
statedir=/var/tmp/lib/monitor
test -d "$statedir" || install -d "$statedir"
mkdir "$statedir/check-tls.lock" || exit
trap "rmdir \"$statedir/check-tls.lock\"" 0
nowepoch=`date +%s`
>"$statedir/check-tls.$$"
while read host_desc prevepoch ca_file openssl_args
do
if test -z "$prevepoch"
then
prevepoch=0
fi
# Find expiry epoch
tmpf=/tmp/$host_desc-$$.log
if $OPENSSL s_client -CAfile "${ca_file:-/etc/ssl/certs/ca-certificates.crt}" $openssl_args \
-connect $host_desc </dev/null >"$tmpf" 2>&1
then
if grep -q '^ *Verify return code: 0 (ok)$' "$tmpf"
then
true
else
echo "======================================================================" >&2
echo "Error verifying $host_desc" >&2
cat "$tmpf" >&2
rm -f "$tmpf"
echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
continue
fi
enddate=`$OPENSSL x509 -in "$tmpf" -noout -enddate | cut -f 2- -d =`
rm -f "$tmpf"
else
cat "$tmpf" >&2
rm -f "$tmpf"
echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
continue
fi
endepoch=`date -d "$enddate" +%s`
if test $endepoch -le $nowepoch
then
echo "Alert: expired $host_desc" >&2
prevepoch=$nowepoch
else
# Find the largest not yet triggered alert: it is the maximum that is still below prevspan
prevspan=`expr \( $endepoch - $prevepoch \) / 60 / 60 / 24`
nextalert=none
for j in ${alert[@]}
do
if test $j -lt $prevspan
then
if test $nextalert = none
then
nextalert=$j
elif test $j -gt $nextalert
then
nextalert=$j
fi
fi
done
if test $nextalert = none
then
echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
continue
fi
# Alert if necessary
spanepoch=`expr $nextalert \* 60 \* 60 \* 24`
if test `expr $endepoch - $nowepoch` -lt $spanepoch
then
expire=`date -I -d @$endepoch`
echo "Alert, $host_desc expires $expire (less than $nextalert days)" >&2
prevepoch=$nowepoch
fi
fi
echo "$host_desc $prevepoch $ca_file $openssl_args" >>"$statedir/check-tls.$$"
done <"$statedir/check-tls.status"
mv "$statedir/check-tls.$$" "$statedir/check-tls.status"
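The threshold selection in the script can be isolated to see how it behaves. The sketch below reuses the script's `alert` array; `next_alert` is a hypothetical helper name introduced only for illustration. Its argument corresponds to `prevspan`, the number of days between the last recorded alert (or epoch 0) and the expiry date.

```shell
#!/bin/bash
# Thresholds in days, same values as the alert array in check-tls.sh.
alert=(90 60 30 15 7 6 5 4 3 2 1)

# next_alert (hypothetical helper): print the largest threshold strictly
# below the given number of days, or "none" if no threshold qualifies.
next_alert() {
    local prevspan=$1 best=none j
    for j in "${alert[@]}"; do
        if [ "$j" -lt "$prevspan" ]; then
            if [ "$best" = none ] || [ "$j" -gt "$best" ]; then
                best=$j
            fi
        fi
    done
    echo "$best"
}

next_alert 45    # prints 30
next_alert 100   # prints 90
next_alert 1     # prints none
```

If the previous alert was issued with 45 days remaining, the next threshold is 30, so the script stays silent until fewer than 30 days remain. This is what keeps it from repeating the same alert on every run.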
=== Generic check probe (to do)
See:
* create_a_custom_service_unit [[how_to_distinguish_between_a_crash_and_a_graceful_reboot_in_rhel_7_or_rhel_8|Alert on unexpected shutdown]]
Sensitive files:
* /etc/passwd
* /etc/shadow
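One way to watch these files would be to compare checksums against a stored baseline. This is only a sketch under assumptions: it relies on GNU `sha256sum`, and the function names and baseline path are hypothetical.

```shell
#!/bin/bash
# Sketch: detect changes to sensitive files by comparing SHA-256 checksums
# against a stored baseline. Function names and paths are hypothetical.

# save_baseline BASELINE FILE...: record the current checksums.
save_baseline() {
    local baseline=$1
    shift
    sha256sum "$@" >"$baseline"
}

# files_unchanged BASELINE: exit status 0 only if every file still matches.
files_unchanged() {
    sha256sum --quiet -c "$1"
}

# Example (as root, since /etc/shadow is not world-readable):
#   save_baseline /var/tmp/lib/monitor/sensitive.sha256 /etc/passwd /etc/shadow
#   files_unchanged /var/tmp/lib/monitor/sensitive.sha256 \
#       || echo "Alert: sensitive file changed" >&2
```

The baseline has to be refreshed after every legitimate change, such as a password update, otherwise the check keeps alerting.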
Other checks:
* Read-write partition: ''touch /.check''
* Date / time: ntpdate?
* Updates
* Service down
* Alert before domain names expire
* /etc/passwd: accounts with UID 0
* dmesg
* LDAP accounts
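Two of these checks are simple enough to sketch right away. Both helper names below (`uid0_accounts`, `is_writable_fs`) are hypothetical, and the marker file name `.check` is taken from the note above.

```shell
#!/bin/bash
# uid0_accounts (hypothetical helper): list account names whose UID is 0
# in a passwd-format file (defaults to /etc/passwd).
uid0_accounts() {
    awk -F: '$3 == 0 { print $1 }' "${1:-/etc/passwd}"
}

# is_writable_fs (hypothetical helper): succeed if a marker file can be
# created in the given directory, in the spirit of "touch /.check".
is_writable_fs() {
    local marker="${1%/}/.check"
    touch "$marker" 2>/dev/null && rm -f "$marker"
}

# Example:
#   [ "$(uid0_accounts)" = "root" ] || echo "Alert: unexpected UID 0 account" >&2
#   is_writable_fs / || echo "Alert: / is not writable" >&2
```

A filesystem remounted read-only after an I/O error makes `touch` fail, which is what the second check relies on.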