update readme

Fredrik Eriksson 2024-07-14 12:37:46 +02:00
parent 343d629ff6
commit 8abf7d89f6
Signed by: feffe
GPG Key ID: E6B5580B853D322B
2 changed files with 94 additions and 2 deletions


@@ -1,3 +1,95 @@
# sysalert
Generic OnFailure= and OnSuccess= handler for systemd
Utility to send notifications from systemd using OnFailure= and OnSuccess= hooks.
## Purpose
This tool sends notifications when a systemd service fails. It is enabled by setting `sysalert-failure@%n.service` and `sysalert-success@%n.service` as the OnFailure= and OnSuccess= handlers in the systemd service files.
The primary purpose is to keep track of services triggered by timers, paths and similar, but it
can be used to monitor any systemd service.
## Features and inner workings
- ignore X failures before sending notification
- do not send repeated notifications of the same problem
- send recovery notifications
- flexible alert mechanism
At a high level, sysalert works like this:
When sysalert-failure is triggered, the triggering service's exit status, invocation ID and a timestamp
are saved to an SQLite database. Based on previous results and the configuration in `/etc/sysalert.ini`, a
notification is sent using the configured alert method.
When sysalert-success is triggered, sysalert sends a notification about the service
recovery (if enabled) and clears any failures recorded for the triggering service from the SQLite database.
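
The flow above can be sketched in Python roughly as follows. This is a minimal illustration, not sysalert's actual code: the table layout, the `IGNORE_FAILURES` threshold and the function names `open_db`, `record_failure` and `record_success` are all assumptions made for the example, as is the choice to send a recovery notice only when a failure notice was sent earlier.

```python
import sqlite3
import time

# Hypothetical setting: number of failures to ignore before alerting.
IGNORE_FAILURES = 2


def open_db(path=":memory:"):
    """Open the database and create the (assumed) failures table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS failures ("
        "service TEXT, exit_status TEXT, invocation_id TEXT, "
        "timestamp REAL, notified INTEGER DEFAULT 0)"
    )
    return db


def record_failure(db, service, exit_status, invocation_id):
    """Store a failure and return True if a notification should be sent."""
    db.execute(
        "INSERT INTO failures (service, exit_status, invocation_id, timestamp) "
        "VALUES (?, ?, ?, ?)",
        (service, exit_status, invocation_id, time.time()),
    )
    count, notified = db.execute(
        "SELECT COUNT(*), COALESCE(MAX(notified), 0) "
        "FROM failures WHERE service = ?",
        (service,),
    ).fetchone()
    # Alert only once the ignored failures are used up, and only once per problem.
    should_alert = count > IGNORE_FAILURES and not notified
    if should_alert:
        db.execute("UPDATE failures SET notified = 1 WHERE service = ?", (service,))
    db.commit()
    return should_alert


def record_success(db, service):
    """Clear stored failures and return True if a recovery notice is due."""
    (notified,) = db.execute(
        "SELECT COALESCE(MAX(notified), 0) FROM failures WHERE service = ?",
        (service,),
    ).fetchone()
    db.execute("DELETE FROM failures WHERE service = ?", (service,))
    db.commit()
    return bool(notified)
```

With `IGNORE_FAILURES = 2`, the first two failures of a service are stored silently, the third triggers a notification, and later failures stay quiet until a success clears the stored rows.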
## Installation
[Build and install](https://packaging.python.org/en/latest/tutorials/packaging-projects/) the Python
package, then install the configuration file and systemd services:
```
cp config/sysalert.ini /etc/
cp systemd/sysalert-failure@.service systemd/sysalert-success@.service /etc/systemd/system/
mkdir /etc/systemd/system/sysalert-.service.d
cp systemd/overrides/sysalert-.service.d.conf /etc/systemd/system/sysalert-.service.d/10-sysalert.conf
systemctl daemon-reload
```
Once everything is installed, you can set `sysalert-failure@%n.service` and `sysalert-success@%n.service` as the OnFailure= and OnSuccess= handlers in any service unit to get an email notification on failure.
It is also possible to set this system-wide by creating
`/etc/systemd/system/service.d/10-sysalert.conf` like so:
```
[Unit]
OnFailure=sysalert-failure@%n.service
OnSuccess=sysalert-success@%n.service
```
**WARNING:** setting a system-wide handler like this will override any OnFailure= or OnSuccess= set
in service files, and modifying dependencies for sysalert may cause the system to fail at boot. Only
do this if you're sure it works on your system or are ready to troubleshoot boot failures.
There is also a [Gentoo ebuild](https://gitea.fulh.ax/feffe/feffe-portage-overlay/src/branch/master/sys-apps/sysalert)
that I made for my own convenience, but beware: the ebuild installs sysalert as a system-wide handler
as described above.
## Configuration
sysalert reads its configuration from `/etc/sysalert.ini`; see the example configuration in the repo.
Note that by default the sysalert services depend on network.target. Depending on your alert methods,
you may need to override this.
## Alert methods
At the moment the only implemented alert method is 'sysalert.email', which uses SMTP to send an email
about service problems. Currently the email content is not templated, but it does include the
journal log for the failed service as well as other nice-to-know information.
sysalert uses dynamic imports to load the alert methods. sysalert.email is a Python module
implemented in this package, but an alert method can be any Python module on your system that implements the
`success()` and `failure()` methods.
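
As an illustration of the dynamic import, the sketch below loads an alert method by its dotted module name and checks that it provides both hooks. The helper name `load_alert_method` is hypothetical, and sysalert's own loading code may look different; only `importlib.import_module` is a standard library call.

```python
import importlib


def load_alert_method(name):
    """Import an alert method module by name, e.g. 'sysalert.email'."""
    module = importlib.import_module(name)
    # Refuse modules that lack either of the two required hooks.
    for hook in ("success", "failure"):
        if not callable(getattr(module, hook, None)):
            raise ImportError(f"{name} does not implement {hook}()")
    return module
```

Because the lookup goes through the normal Python import system, any module on the import path can serve as an alert method, whether or not it ships with this package.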
### `success()` and `failure()`
Any module that implements these methods can be used as an alert method. These methods take three arguments:
- **service_name** - name of the service
- **failures** - list of dicts containing data about previous (and current) failures. The list is
sorted by time, with the first failure first and the latest failure last. Currently the dicts include:
  - `service_result`
  - `exit_code`
  - `exit_status`
  - `invocation_id`
  - `timestamp`
  - `alert_method`
- **config** - a dict containing all key-values defined in the configuration section for the
alert method. For example, the 'sysalert.email' section for the 'sysalert.email' alert method.
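
A custom alert method could look roughly like the module below, which writes a one-line message to stderr. This is a sketch under assumptions: the `prefix` configuration key and the `_describe` helper are invented for the example, and returning the message is only there to make the functions easy to try out; the README does not say whether sysalert uses a return value.

```python
"""Hypothetical alert method module that reports to stderr."""
import sys


def _describe(failure):
    # Keys as listed above; .get() keeps the sketch tolerant of missing keys.
    return "{} (exit_code={}, exit_status={})".format(
        failure.get("timestamp"),
        failure.get("exit_code"),
        failure.get("exit_status"),
    )


def failure(service_name, failures, config):
    """Report the latest failure; `failures` is sorted oldest first."""
    prefix = config.get("prefix", "sysalert")  # 'prefix' is an assumed option
    message = "[{}] {} failed {} time(s), latest: {}".format(
        prefix, service_name, len(failures), _describe(failures[-1])
    )
    print(message, file=sys.stderr)
    return message


def success(service_name, failures, config):
    """Report that the service recovered after the recorded failures."""
    prefix = config.get("prefix", "sysalert")
    message = "[{}] {} recovered after {} failure(s)".format(
        prefix, service_name, len(failures)
    )
    print(message, file=sys.stderr)
    return message
```

Such a module would need to be importable under the same name that is used for the alert method and its configuration section.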
## Stuff to fix
This was a weekend project and is not very polished. Here are a few things that could probably be
improved:
- Fix hardcoded paths (config-file and database location)
- Implement command line tool (running `sysalert` manually should make it possible to update/clear
database entries, maybe reconfigure and see alert status)
- Proper packaging and maybe publish on PyPI
- Implement more handlers (maybe `sysalert.syslog`)
- Find a method to detect if a failed service was triggered manually or by a timer/path/other
service etc. Would be nice to be able to set this as default only on services triggered by
timers...


@@ -9,7 +9,7 @@ authors = [ {name = "Fredrik Eriksson", email = "sysalert@fulh.ax"} ]
dependencies = [
"systemd-python"
]
description = "generic OnFailure= and OnSuccess= handler for systemd"
description = "Generic OnFailure= and OnSuccess= handler for systemd"
readme = "README.md"
license = { file = "LICENSE" }
keywords = [ "systemd" ]