|Pjotr Prins 9305286088||7 months ago|
-*- mode: org -*-
If you have a herd of machines, sheepdog helps you stay sane. Sheepdog does not replace the likes of Nagios or Zabbix, though I am happy not to run those systems because they target (complex) deployments that have people watching them.
NOTICE: Sheepdog software is in early stages of development. YMMV.
Sheepdog is much simpler because it does not require people watching systems. Sheepdog gives a cursory idea of the health of the machines out there. It notifies you if things go really wrong, for example when backups fail. I used to have scripts that would mail or text me for that. But that was all a bit ad hoc, and I got tired of maintaining them and of getting repeated notifications…
Also, sheepdog is a real helper for the shepherd. It is designed to use logic programming and allows questions like:
Which services showed interruptions in the last month on low-RAM machines that also run a specific version of nginx?
This means storing the state of machines in a database (SQLite) that gets updated by messages. It means a good message broker (Redis). It means that every time you write a monitoring service, you also have to write a receiver that turns its output into a data structure that something like miniKanren can solve. A key feature of sheepdog is to make creating such small reporter/receiver tools really easy. Sheepdog is pluggable: you can write a plugin for a monitor, and you can write a plugin for the reporter/receiver, which has hooks in the sheepdog daemon that runs on a server somewhere.
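To make the kind of question above concrete, here is a minimal sketch in Python using an in-memory SQLite database. The tables (`machines`, `packages`, `events`), the column names, the host names, and the 4 GB "low RAM" threshold are all assumptions made for illustration, not sheepdog's actual schema. A plain SQL join stands in here for a logic-programming query such as one written in miniKanren.

```python
import sqlite3

# Hypothetical schema: none of these tables or columns come from sheepdog itself.
SCHEMA = """
CREATE TABLE machines (host TEXT PRIMARY KEY, ram_gb REAL);
CREATE TABLE packages (host TEXT, name TEXT, version TEXT);
CREATE TABLE events   (host TEXT, service TEXT, status TEXT, day TEXT);
"""

def interrupted_services(db, since, max_ram_gb, nginx_version):
    """Services with a 'down' event on or after `since`, on hosts with at most
    `max_ram_gb` of RAM that run the given nginx version."""
    rows = db.execute(
        """
        SELECT DISTINCT e.host, e.service
        FROM events e
        JOIN machines m ON m.host = e.host
        JOIN packages p ON p.host = e.host
        WHERE e.status = 'down'
          AND e.day >= ?          -- ISO dates compare correctly as text
          AND m.ram_gb <= ?
          AND p.name = 'nginx' AND p.version = ?
        ORDER BY e.host, e.service
        """,
        (since, max_ram_gb, nginx_version),
    )
    return rows.fetchall()

def demo_db():
    """Build a small in-memory database with made-up machine state."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO machines VALUES (?, ?)",
                   [("penguin", 2.0), ("octopus", 64.0), ("tux", 4.0)])
    db.executemany("INSERT INTO packages VALUES (?, ?, ?)",
                   [("penguin", "nginx", "1.18.0"),
                    ("octopus", "nginx", "1.18.0"),
                    ("tux", "nginx", "1.14.2")])
    db.executemany("INSERT INTO events VALUES (?, ?, ?, ?)",
                   [("penguin", "backup", "down", "2020-05-20"),
                    ("penguin", "web", "down", "2020-03-01"),   # too old
                    ("octopus", "web", "down", "2020-05-21"),   # plenty of RAM
                    ("tux", "web", "down", "2020-05-22")])      # other nginx version
    return db

if __name__ == "__main__":
    print(interrupted_services(demo_db(), "2020-05-01", 4.0, "1.18.0"))
```

With this sample data only the backup interruption on the low-RAM host running the requested nginx version is returned. A receiver plugin would be the piece that keeps tables like these up to date as messages arrive.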
Visualisations are less important - though it is quite possible to build them.
In other words, much in the spirit of GNU Guix and GNU Shepherd, sheepdog is a different type of systems monitor: a minimalistic system that is hackable, works out of the box on Guix systems, and is really easy to extend.
Many tools run as cron jobs and send their output to some file. What happens when the job fails? You would like a notification. We can write a script that tests the state of the backup job and puts a message in the queue when there is a problem:
export stamp=$(date +%A-%Y%m%d-%H:%M:%S)
sheepdog -H localhost:645 -e "borg create --stats /export/backup/borg::ipfs-$stamp /export/ipfs"
We can tell sheepdog to send a message only on error with --on-error. On the client we can tell it to inform
The receiver sits on the server and is a default plugin that is part of the sheepdog-queue-daemon. Textual output is received as a standard message, which the sheepdog-message plugin adds to the database. If you want to change the behaviour on the server, you can write your own plugins.
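The reporting side and the default receiver described above can be sketched roughly as follows. This is Python for illustration, not sheepdog's own code: the message fields, the `run_and_report` and `store_message` names, and the `messages` table are hypothetical, and a plain in-process list stands in for the Redis queue so that the example runs without a broker.

```python
import json
import sqlite3
import subprocess
import sys
import time

def run_and_report(command, queue, on_error_only=False, host="localhost"):
    """Run `command` (an argument list) and append a JSON message to `queue`.

    `queue` is any object with an append() method; a real setup would push
    to a message broker instead. All field names are illustrative.
    """
    proc = subprocess.run(command, capture_output=True, text=True)
    failed = proc.returncode != 0
    if on_error_only and not failed:
        return None  # mimics the idea behind --on-error: stay quiet on success
    message = {
        "host": host,
        "command": " ".join(command),
        "status": "FAIL" if failed else "OK",
        "exit_code": proc.returncode,
        "output": (proc.stdout + proc.stderr).strip(),
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
    }
    queue.append(json.dumps(message))
    return message

def store_message(db, raw):
    """Receiver side: decode one queued message and insert it into SQLite."""
    msg = json.loads(raw)
    db.execute(
        "INSERT INTO messages (host, command, status, output, time) "
        "VALUES (?, ?, ?, ?, ?)",
        (msg["host"], msg["command"], msg["status"], msg["output"], msg["time"]),
    )
    db.commit()

def new_message_db():
    """Create an in-memory database with a hypothetical messages table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE messages (host, command, status, output, time)")
    return db

if __name__ == "__main__":
    queue, db = [], new_message_db()
    # One succeeding and one failing job, both reported only on error.
    run_and_report([sys.executable, "-c", "print('ok')"], queue, on_error_only=True)
    run_and_report([sys.executable, "-c", "raise SystemExit('backup failed')"],
                   queue, on_error_only=True)
    for raw in queue:
        store_message(db, raw)
    print(db.execute("SELECT status, output FROM messages").fetchall())
```

Only the failing job ends up in the queue and therefore in the database. Keeping the sender behind a simple `append()` interface is a design choice of this sketch; it makes the reporter testable without a running broker.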
There are hundreds of possible system indicators. The sheepdog-daemon can run on a system and monitor resources by default. Start the monitor daemon with:
sheepdog-daemon -c server-config.scm
In the configuration you can specify what to watch. All monitors are (configurable) plugins.
The receiver is part of sheepdog-queue-daemon and consists of matching plugins (in fact, a monitor and its receiver, with its hooks, live in the same source file).
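One way to picture a monitor and its matching receiver living in the same source file is a small registry keyed by plugin name. The sketch below is only an assumption about the general shape, written in Python: the `PLUGINS` registry, the `plugin` decorator, the `disk` example, and the 90% threshold are hypothetical and do not reflect sheepdog's real plugin interface.

```python
import shutil

PLUGINS = {}  # hypothetical registry: plugin name -> {"monitor": fn, "receiver": fn}

def plugin(name, role):
    """Register a function as the 'monitor' or 'receiver' half of a plugin."""
    def register(fn):
        PLUGINS.setdefault(name, {})[role] = fn
        return fn
    return register

@plugin("disk", "monitor")
def disk_monitor(path="/"):
    """Runs on the monitored machine: measure and return a plain message."""
    usage = shutil.disk_usage(path)
    return {"plugin": "disk", "path": path,
            "used_fraction": usage.used / usage.total}

@plugin("disk", "receiver")
def disk_receiver(message, state):
    """Runs on the server: turn the message into a fact stored in `state`."""
    # The 0.9 threshold is an arbitrary value chosen for this example.
    level = "critical" if message["used_fraction"] > 0.9 else "ok"
    state[("disk", message["path"])] = level
    return level

def dispatch(message, state):
    """Server-side hook: route a message to the receiver of the same plugin."""
    return PLUGINS[message["plugin"]]["receiver"](message, state)

if __name__ == "__main__":
    state = {}
    print(dispatch(PLUGINS["disk"]["monitor"](), state), state)
```

Because both halves are registered under the same name, the server can route any incoming message to the right receiver by looking at its `plugin` field.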
Sheepdog is published under the GPLv3 License. See /pjotrp/sheepdog/src/branch/master/COPYING.