Monitor Configuration
Monitors are defined in (by default) monitors.ini. The monitor is named by its [section] heading. If you create a [defaults] section, the values are used as defaults for all the other monitors. Each monitor’s configuration will override the values from the default.
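The override behaviour can be illustrated with a minimal sketch. It uses Python's standard configparser and is not SimpleMonitor's own loading code. The monitor name web-check and the helper effective_options are hypothetical, chosen only for this example.

```python
import configparser

# Hypothetical config: "web-check" is an illustrative monitor name.
SAMPLE = """
[defaults]
tolerance = 2
gap = 60

[web-check]
type = null
gap = 3600
"""


def effective_options(parser, name, defaults_section="defaults"):
    """Merge the [defaults] section with one monitor's own section.

    Values from the monitor's section win over those from [defaults].
    This helper is an illustration, not part of SimpleMonitor.
    """
    merged = {}
    if parser.has_section(defaults_section):
        merged.update(parser.items(defaults_section))
    merged.update(parser.items(name))
    return merged


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
options = effective_options(parser, "web-check")
print(options)
```

In this sketch the monitor inherits tolerance from [defaults] but keeps its own gap value.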
Common options
These options are common to all monitor types.
- type
  Type: string
  Required: true
  the type of the monitor; one of those in the list below.
- runon
  Type: string
  Required: false
  Default: none
  a hostname on which the monitor should run. If not set, always runs. You can use this to share one config file among many hosts. (The value it is compared against is the one returned by Python’s socket.gethostname().)
- depend
  Type: comma-separated list of string
  Required: false
  Default: none
  the monitors on which this one depends. This monitor will run after those, unless one of them fails or is skipped, in which case this one will also skip. A skip does not trigger an alerter.
- tolerance
  Type: integer
  Required: false
  Default: 1
  the number of times a monitor can fail before it enters the failed state. Handy for things which intermittently fail, such as unreliable links. The number of times the monitor has actually failed, minus this number, is its “Virtual Failure Count”. See also the limit option on Alerters.
- urgent
  Type: boolean
  Required: false
  Default: true
  if this monitor is “urgent” or not. Non-urgent monitors do not trigger urgent alerters (e.g. BulkSMS).
- gap
  Type: integer
  Required: false
  Default: 0
  the number of seconds this monitor should allow to pass before polling. Use it to make a monitor poll only once an hour (3600), for example. Setting this value lower than the interval will have no effect, and the monitor will run every loop like normal.
  Some monitors default to a higher value when it doesn’t make sense to run their check too frequently because the underlying data will not change that often or quickly, such as pkgaudit. You can override their default to a lower value as required.
  Hint
  Monitors which are in the failed state will poll every loop, regardless of this setting, in order to detect recovery as quickly as possible.
- remote_alert
  Type: boolean
  Required: false
  Default: false
  set to true to have this monitor’s alerting handled by a remote instance instead of the local one. If you’re using the remote feature, this is a good candidate to put in the [defaults] section.
- recover_command
  Type: string
  Required: false
  Default: none
  a command to execute once when this monitor enters the failed state. For example, it could attempt to restart a service.
- recovered_command
  Type: string
  Required: false
  Default: none
  a command to execute once when this monitor returns to the OK state. For example, it could restart a service which was affected by the failure of what this monitor checks.
- notify
  Type: boolean
  Required: false
  Default: true
  if this monitor should alert at all.
- group
  Type: string
  Required: false
  Default: default
  the group the monitor belongs to. Alerters and Loggers will only fire for monitors which appear in their groups.
- failure_doc
  Type: string
  Required: false
  Default: none
  information to include in alerts on failure (e.g. a URL to a runbook)
- gps
  Type: string
  Required: no, unless you want to use the html logger’s map
  comma-separated latitude and longitude of this monitor
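The rules described for runon, tolerance and gap can be sketched as three small functions. This is an illustration of the documented behaviour, not SimpleMonitor's implementation. The function names are hypothetical, and the exact string comparison for runon and the clamping of the count at zero are assumptions.

```python
import socket
from typing import Optional


def runs_on_this_host(runon: Optional[str]) -> bool:
    """An unset runon always runs; otherwise compare with the local hostname.

    Exact string equality is an assumption made for this sketch.
    """
    return runon is None or runon == socket.gethostname()


def virtual_failure_count(failures: int, tolerance: int = 1) -> int:
    """Actual failures minus tolerance, as the tolerance option describes.

    Clamping the result at zero is an assumption made for this sketch.
    """
    return max(0, failures - tolerance)


def should_poll(seconds_since_last_poll: float, gap: int = 0, failed: bool = False) -> bool:
    """Decide whether a monitor polls on this loop.

    A monitor in the failed state polls every loop regardless of gap;
    otherwise it waits until at least gap seconds have passed.
    """
    if failed:
        return True
    return seconds_since_last_poll >= gap
```

For example, with gap set to 3600 a healthy monitor polled ten seconds ago is skipped on this loop, while the same monitor in the failed state is polled again straight away.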
Monitors
Note
The type of the monitor is the first word in its heading.
- apcupsd - APC UPS status
- arlo_camera - Arlo camera battery level
- command - run an external command
- compound - combine monitors
- diskspace - free disk space
- dns - resolve record
- eximqueue - Exim queue size
- fail - always fails
- filestat - file size and age
- hass_sensor - Home Automation Sensors
- host - ping a host
- http - fetch and verify a URL
- loadavg - load average
- memory - free memory percent
- null - always passes
- ping - ping a host
- pkgaudit - FreeBSD pkg audit
- portaudit - FreeBSD port audit
- process - running process
- rc - FreeBSD rc service
- ring_doorbell - Ring doorbell battery
- service - Windows Service
- svc - daemontools service
- swap - available swap space
- systemd-unit - systemd unit check
- tcp - open TCP port
- tls_expiry - TLS cert expiration
- unifi_failover - USG failover WAN status
- unifi_watchdog - USG failover watchdog
- unix_service - generic UNIX service