(05-12-2020, 08:15 AM)quasimodo Wrote: What is the best way to monitor a Docker environment?
Recent versions of op5 (version 8 and later; versions after 7.5.6 probably include it as well) ship with an out-of-the-box check for monitoring Docker.
op5 provides 7 preconfigured, ready-to-use commands:
- check_docker_cpu
$USER1$/check_docker.py --connection $ARG1$:$ARG2$ --containers '$ARG3$' --cpu $ARG4$:$ARG5$
- check_docker_health
$USER1$/check_docker.py --connection $ARG1$:$ARG2$ --containers '$ARG3$' --health
- check_docker_image_age
$USER1$/check_docker.py --connection $ARG1$:$ARG2$ --containers '$ARG3$' --image-age $ARG4$:$ARG5$
- check_docker_memory
$USER1$/check_docker.py --connection $ARG1$:$ARG2$ --containers '$ARG3$' --memory $ARG4$:$ARG5$:$ARG6$
- check_docker_restarts
$USER1$/check_docker.py --connection $ARG1$:$ARG2$ --containers '$ARG3$' --restarts $ARG4$:$ARG5$
- check_docker_status
$USER1$/check_docker.py --connection $ARG1$:$ARG2$ --containers '$ARG3$' --status $ARG4$
- check_docker_uptime
$USER1$/check_docker.py --connection $ARG1$:$ARG2$ --containers '$ARG3$' --uptime $ARG4$:$ARG5$
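To see what one of these definitions looks like once the macros are filled in, here is a minimal sketch of Nagios-style macro substitution. `expand_command` is a hypothetical helper, not part of op5, and the host, port, container pattern, and thresholds are made-up example values; only the command template and the /opt/plugins directory come from this post.

```python
import re

# Command definition copied from the check_docker_memory entry above.
CHECK_DOCKER_MEMORY = (
    "$USER1$/check_docker.py --connection $ARG1$:$ARG2$ "
    "--containers '$ARG3$' --memory $ARG4$:$ARG5$:$ARG6$"
)


def expand_command(template, user1, args):
    """Replace $USER1$ and $ARGn$ macros; unknown macros are left untouched."""
    values = {"USER1": user1}
    for i, value in enumerate(args, start=1):
        values[f"ARG{i}"] = str(value)

    def substitute(match):
        return values.get(match.group(1), match.group(0))

    return re.sub(r"\$([A-Z]+\d*)\$", substitute, template)


# Assumed example values: Docker host and port, container regex,
# warning and critical thresholds, and the unit.
command_line = expand_command(
    CHECK_DOCKER_MEMORY,
    "/opt/plugins",
    ["docker01.example.com", 2375, "web.*", 80, 90, "%"],
)
print(command_line)
```

In the service definition you would only supply the six arguments; the monitoring engine performs the substitution itself.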
The script used for monitoring:
Code:
/opt/plugins/check_docker.py
usage: check_docker.py [-h]
[--connection [/<path to>/docker.socket|<ip/host address>:<port>]
| --secure-connection [<ip/host address>:<port>]]
[--binary_units | --decimal_units] [--timeout TIMEOUT]
[--containers CONTAINERS [CONTAINERS ...]] [--present]
[--threads THREADS] [--cpu WARN:CRIT]
[--memory WARN:CRIT:UNITS] [--status STATUS] [--health]
[--uptime WARN:CRIT] [--image-age WARN:CRIT]
[--version]
[--insecure-registries INSECURE_REGISTRIES [INSECURE_REGISTRIES ...]]
[--restarts WARN:CRIT] [--no-ok] [--no-performance]
[-V]
Check docker containers.
optional arguments:
-h, --help show this help message and exit
--connection [/<path to>/docker.socket|<ip/host address>:<port>]
Where to find docker daemon socket. (default:
/var/run/docker.sock)
--secure-connection [<ip/host address>:<port>]
Where to find TLS protected docker daemon socket.
--binary_units Use a base of 1024 when doing calculations of KB, MB,
GB, & TB (This is default)
--decimal_units Use a base of 1000 when doing calculations of KB, MB,
GB, & TB
--timeout TIMEOUT Connection timeout in seconds. (default: 10.0)
--containers CONTAINERS [CONTAINERS ...]
One or more RegEx that match the names of the
container(s) to check. If omitted all containers are
checked. (default: ['all'])
--present Modifies --containers so that each RegEx must match at
least one container.
--threads THREADS This + 1 is the maximum number of concurent
threads/network connections. (default: 10)
--cpu WARN:CRIT Check cpu usage percentage taking into account any
limits. Valid values are 0 - 100.
--memory WARN:CRIT:UNITS
Check memory usage taking into account any limits.
Valid values for units are %,B,KB,MB,GB.
--status STATUS Desired container status (running, exited, etc).
--health Check container's health check status
--uptime WARN:CRIT Minimum container uptime in seconds. Use when
infrequent crashes are tolerated.
--image-age WARN:CRIT
Maximum image age in days.
--version Check if the running images are the same version as
those in the registry. Useful for finding stale
images. Does not support login.
--insecure-registries INSECURE_REGISTRIES [INSECURE_REGISTRIES ...]
List of registries to connect to with http(no TLS).
Useful when using "--version" with images from
insecure registries.
--restarts WARN:CRIT Container restart thresholds.
--no-ok Make output terse suppressing OK messages. If all
checks are OK return a single OK.
--no-performance Suppress performance data. Reduces output when
performance data is not being used.
-V show program's version number and exit
UNKNOWN: Cannot access docker socket file. User ID=0, socket file=/var/run/docker.sock
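The WARN:CRIT arguments in the help output above are pairs of a warning and a critical threshold. The following is a rough sketch of how such a pair could map to a check state; it is not the plugin's actual code. `parse_thresholds` and `evaluate` are hypothetical names, and the handling of values exactly on a threshold is an assumption. The `minimum` flag covers options such as --uptime, where a lower value is the worse one.

```python
# Nagios-style states used by this sketch.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_thresholds(spec):
    """Split a 'WARN:CRIT' string into two floats."""
    warn_text, crit_text = spec.split(":")
    return float(warn_text), float(crit_text)


def evaluate(value, spec, minimum=False):
    """Return the state for value; with minimum=True lower values are worse."""
    warn, crit = parse_thresholds(spec)
    if minimum:
        # Flip the signs so the "higher is worse" comparison below still applies.
        value, warn, crit = -value, -warn, -crit
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


# Assumed example values: 85% CPU against 80:90, 100 s uptime against 300:120.
cpu_state = evaluate(85, "80:90")
uptime_state = evaluate(100, "300:120", minimum=True)
```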
Of course, you can also write additional monitoring scripts that run on the host itself and are invoked through NRPE. It all depends on what exactly you want to monitor.
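As an illustration of such a custom script, here is a minimal sketch of a Nagios-style plugin that counts running containers on the host. The function names and the default thresholds are hypothetical, and it assumes the `docker` CLI is installed and usable by the account NRPE runs under.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_running(count, warn_below, crit_below):
    """Map a running-container count to (exit code, status line)."""
    perfdata = f"running={count}"
    if count < crit_below:
        return CRITICAL, f"CRITICAL: {count} containers running | {perfdata}"
    if count < warn_below:
        return WARNING, f"WARNING: {count} containers running | {perfdata}"
    return OK, f"OK: {count} containers running | {perfdata}"


def count_running_containers():
    """Count running containers via `docker ps -q` (one ID per line)."""
    result = subprocess.run(
        ["docker", "ps", "-q"],
        capture_output=True, text=True, timeout=10, check=True,
    )
    return len(result.stdout.split())


def main(warn_below=2, crit_below=1):
    """Print one status line and return the matching exit code."""
    try:
        count = count_running_containers()
    except (OSError, subprocess.SubprocessError) as exc:
        print(f"UNKNOWN: cannot query docker: {exc}")
        return UNKNOWN
    code, message = evaluate_running(count, warn_below, crit_below)
    print(message)
    return code
```

To deploy it, end the file with `sys.exit(main())` (after `import sys`) and register the script as a command in the NRPE configuration on the Docker host, so that op5 can call it like any other remote check.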