update readme
		
							
								
								
									
94 README.md
							| @ -1,3 +1,95 @@ | |||||||
| # sysalert | # sysalert | ||||||
|  | Generic OnFailure= and OnSuccess= handler for systemd | ||||||
|  |  | ||||||
| Utility to send notifications from systemd using OnFailure= and OnSuccess= hooks. | ## Purpose | ||||||
|  | This tool sends notifications when a systemd service fails or recovers. It is enabled by setting `sysalert-failure@%n.service` and `sysalert-success@%n.service` as the OnFailure= and OnSuccess= handlers in the systemd service files. | ||||||
|  |  | ||||||
|  | The primary purpose is to keep track of services triggered by timers, paths and similar, but it | ||||||
|  | can be used to monitor any systemd service. | ||||||
|  |  | ||||||
|  | ## Features and inner workings | ||||||
|  |  - ignore X failures before sending notification | ||||||
|  |  - do not send repeated notifications of the same problem | ||||||
|  |  - send recovery notifications | ||||||
|  |  - flexible alert mechanism | ||||||
|  |  | ||||||
|  | On a high level sysalert works like this: | ||||||
|  |  | ||||||
|  | When sysalert-failure is triggered, the exit status and invocation ID of the triggering service | ||||||
|  | are saved to a SQLite database together with a timestamp. Based on previous results and the | ||||||
|  | configuration in `/etc/sysalert.ini`, a notification is sent using the configured alert method. | ||||||
|  |  | ||||||
|  | When sysalert-success is triggered, sysalert sends a notification about service recovery (if | ||||||
|  | enabled) and clears any recorded failures for the triggering service from the SQLite database. | ||||||
|  |  | ||||||
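The failure and success flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not sysalert's actual code: the table layout, the `notified` column and all function names are hypothetical, and "ignore X failures" is assumed to mean that the first X failures are silent.

```python
import sqlite3
import time


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) failures table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS failures ("
        "service TEXT, exit_status TEXT, invocation_id TEXT,"
        " timestamp REAL, notified INTEGER)"
    )
    return db


def record_failure(db, service, exit_status, invocation_id):
    """Store one failure row for the triggering service."""
    db.execute(
        "INSERT INTO failures VALUES (?, ?, ?, ?, 0)",
        (service, exit_status, invocation_id, time.time()),
    )


def should_notify(db, service, ignore_failures):
    """True once more than `ignore_failures` failures exist and no alert was sent yet."""
    count, notified = db.execute(
        "SELECT COUNT(*), COALESCE(MAX(notified), 0) FROM failures WHERE service = ?",
        (service,),
    ).fetchone()
    return count > ignore_failures and not notified


def mark_notified(db, service):
    """Remember that an alert went out, so the same problem is not reported again."""
    db.execute("UPDATE failures SET notified = 1 WHERE service = ?", (service,))


def clear_failures(db, service):
    """On success: drop stored failures; return True if a recovery notice is due."""
    was_notified = db.execute(
        "SELECT COALESCE(MAX(notified), 0) FROM failures WHERE service = ?",
        (service,),
    ).fetchone()[0]
    db.execute("DELETE FROM failures WHERE service = ?", (service,))
    return bool(was_notified)
```

In this sketch a recovery notice is only due when a failure alert was actually sent, which keeps ignored failures silent in both directions. Whether sysalert behaves the same way is not stated in the README.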
|  |  | ||||||
|  | ## Installation | ||||||
|  | [Build and install](https://packaging.python.org/en/latest/tutorials/packaging-projects/) the Python | ||||||
|  | package, then install the configuration file and systemd services: | ||||||
|  | ``` | ||||||
|  | cp config/sysalert.ini /etc/ | ||||||
|  | cp systemd/sysalert-failure@.service systemd/sysalert-success@.service /etc/systemd/system/ | ||||||
|  | mkdir /etc/systemd/system/sysalert-.service.d | ||||||
|  | cp systemd/overrides/sysalert-.service.d.conf /etc/systemd/system/sysalert-.service.d/10-sysalert.conf | ||||||
|  | systemctl daemon-reload | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Once everything is installed you can set `sysalert-failure@%n.service` and `sysalert-success@%n.service` as the OnFailure= and OnSuccess= handlers in any service unit to get an email notification on failure. | ||||||
|  | It is also possible to set this system-wide by creating | ||||||
|  | `/etc/systemd/system/service.d/10-sysalert.conf` like so: | ||||||
|  | ``` | ||||||
|  | [Unit] | ||||||
|  | OnFailure=sysalert-failure@%n.service | ||||||
|  | OnSuccess=sysalert-success@%n.service | ||||||
|  | ``` | ||||||
|  | **WARNING:** setting a system-wide handler like this will override any OnFailure= or OnSuccess= set | ||||||
|  | in service files, and modifying dependencies for sysalert may cause the system to fail at boot. Only | ||||||
|  | do this if you're sure it works on your system or are ready to troubleshoot boot failures. | ||||||
|  |  | ||||||
|  |  | ||||||
|  | There is also a [Gentoo ebuild](https://gitea.fulh.ax/feffe/feffe-portage-overlay/src/branch/master/sys-apps/sysalert) | ||||||
|  | I made for my own convenience, but beware as the ebuild installs sysalert as a system-wide handler | ||||||
|  | as described above. | ||||||
|  |  | ||||||
|  | ## Configuration | ||||||
|  | sysalert reads its configuration from `/etc/sysalert.ini`; see the example configuration in the repo. | ||||||
|  |  | ||||||
|  | Note that by default the sysalert services depend on network.target. Depending on your alert | ||||||
|  | methods you may need to override this. | ||||||
|  |  | ||||||
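As a rough illustration of how a section of `/etc/sysalert.ini` could become the `config` dict handed to an alert method, here is a minimal sketch using the standard library's `configparser`. The keys `smtp_host` and `recipient` are placeholders invented for this example, not documented sysalert options, and `section_as_dict` is a hypothetical helper.

```python
import configparser

# Hypothetical file content; the real option names are in the example
# configuration shipped with the repo.
EXAMPLE = """
[sysalert.email]
smtp_host = localhost
recipient = root@localhost
"""


def section_as_dict(text, section):
    """Return all key-values of one INI section as a plain dict ({} if absent)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        return {}
    return dict(parser[section])
```

Note that `configparser` returns every value as a string, so an alert method written against such a dict would have to convert numbers or booleans itself.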
|  | ## Alert methods | ||||||
|  | At the moment the only implemented alert method is 'sysalert.email', which uses SMTP to send an email | ||||||
|  | about service problems. Currently the email content is not templated, but it does include the | ||||||
|  | journal log for the failed service as well as other nice-to-know information. | ||||||
|  |  | ||||||
|  | sysalert uses dynamic imports to load the alert methods. sysalert.email is a Python module | ||||||
|  | implemented in this package, but an alert method can be any Python module on your system that implements the | ||||||
|  | `success()` and `failure()` methods. | ||||||
|  |  | ||||||
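Loading an alert method by name, as described above, can be done with the standard library's `importlib`. The following is a minimal sketch of that idea, not sysalert's actual loader; `load_alert_method` is a hypothetical name.

```python
import importlib


def load_alert_method(module_name):
    """Import a module by dotted name and check that it provides both hooks."""
    module = importlib.import_module(module_name)
    for hook in ("success", "failure"):
        # Reject modules where a hook is missing or is not callable.
        if not callable(getattr(module, hook, None)):
            raise TypeError(f"{module_name} does not implement {hook}()")
    return module
```

Checking for both hooks at load time means a misconfigured alert method is reported as soon as the handler starts, rather than when a notification is about to be sent.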
|  | ### `success()` and `failure()` | ||||||
|  | Any module that implements these methods can be used as an alert method. Both methods take three arguments: | ||||||
|  |  | ||||||
|  |  - **service_name** - name of the service | ||||||
|  |  - **failures** - list of dicts containing data about previous (and current) failures. The list is | ||||||
|  |    sorted by time, with the first failure first and the latest failure last. Currently the dicts include: | ||||||
|  |      - `service_result` | ||||||
|  |      - `exit_code` | ||||||
|  |      - `exit_status` | ||||||
|  |      - `invocation_id` | ||||||
|  |      - `timestamp` | ||||||
|  |      - `alert_method` | ||||||
|  |  | ||||||
|  |  - **config** - a dict containing all key-values defined in the configuration section for the | ||||||
|  |    alert method. For example, the 'sysalert.email' section for the 'sysalert.email' alert method. | ||||||
|  |  | ||||||
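A custom alert method following the interface above could look like this minimal module, which prints a summary instead of sending an email. It is a sketch under assumptions: the `prefix` option is hypothetical, and the README does not say whether return values are used, so returning the message here is only for convenience.

```python
def _describe(failure_entry):
    """Format one failure dict; keys that are missing are shown as '?'."""
    keys = ("timestamp", "service_result", "exit_code", "exit_status")
    values = {key: failure_entry.get(key, "?") for key in keys}
    return "{timestamp} result={service_result} exit={exit_code}/{exit_status}".format(**values)


def failure(service_name, failures, config):
    """Called when a failure notification should be sent."""
    prefix = config.get("prefix", "[sysalert]")  # hypothetical option
    lines = [f"{prefix} {service_name} failed {len(failures)} time(s)"]
    lines += [_describe(entry) for entry in failures]
    message = "\n".join(lines)
    print(message)
    return message


def success(service_name, failures, config):
    """Called when the service has recovered after earlier failures."""
    prefix = config.get("prefix", "[sysalert]")  # hypothetical option
    message = f"{prefix} {service_name} recovered after {len(failures)} failure(s)"
    print(message)
    return message
```

Saved as a module somewhere on the Python path, it would be selected by its dotted module name in the same way as 'sysalert.email'.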
|  | ## Stuff to fix | ||||||
|  | This was a weekend project and is not very polished. Here are a few things that could probably be | ||||||
|  | improved: | ||||||
|  |   - Fix hardcoded paths (config-file and database location) | ||||||
|  |   - Implement command line tool (running `sysalert` manually should make it possible to update/clear | ||||||
|  |     database entries, maybe reconfigure and see alert status) | ||||||
|  |   - Proper packaging and maybe publish on PyPI | ||||||
|  |   - Implement more handlers (maybe `sysalert.syslog`) | ||||||
|  |   - Find a method to detect if a failed service was triggered manually or by a timer/path/other | ||||||
|  |     service etc. Would be nice to be able to set this as default only on services triggered by | ||||||
|  |     timers... | ||||||
|  | |||||||
| @ -9,7 +9,7 @@ authors = [ {name = "Fredrik Eriksson", email = "sysalert@fulh.ax"} ] | |||||||
| dependencies = [ | dependencies = [ | ||||||
|   "systemd-python" |   "systemd-python" | ||||||
| ] | ] | ||||||
| description = "generic OnFailure= and OnSuccess= handler for systemd" | description = "Generic OnFailure= and OnSuccess= handler for systemd" | ||||||
| readme = "README.md" | readme = "README.md" | ||||||
| license = { file = "LICENSE" } | license = { file = "LICENSE" } | ||||||
| keywords = [ "systemd" ] | keywords = [ "systemd" ] | ||||||
|  | |||||||