# Fallbacks and backups A revisit to previous work on backups etc. The sheepdog hosts are no longer responding and we should really run sheepdog on a machine that is not physically with the other machines. In time sheepdog should also move away from redis and run in a system container, but that is for later. I did most of the work late 2021 when I wrote: > As a hurricane is barreling towards our machine room in Memphis we are checking our fallbacks and backups for GeneNetwork. For years we have been making backups on Amazon - both S3 and a running virtual machine. The latter was expensive, so I replaced it with a bare metal server which earns itself (if it hadn't been down for months, but that is a different story). As we are introducing an external sheepdog server we may give it a DNS entry as sheepdog.genenetwork.org. See also => /topics/systems/restore-backups Restore Backups ## Tags * type: enhancement * assigned: pjotrp * keywords: systems, fallback, backup, deploy * status: in progress * priority: critical ## Tasks * [X] fix redis queue and sheepdog server * [ ] check backups on tux01 * [ ] backup ratspub, r/shiny, bnw, covid19, hegp, pluto services * [ ] /etc /home/shepherd backups for Octopus * [ ] /etc /home/shepherd /home/git CI-CD GN-QA backups on Tux02 * [ ] Get backups running again on fallback * [ ] fix bacchus large backups * [ ] mount bacchus on HPC ## Backup and restore We are using borg for backing up data. Borg is excellent at deduplication and compression of data and is pretty fast too. Incremental copies work with rsync - so that is fast. To restore the full MariaDB database from a local borg repo takes a few minutes: ``` wrk@epysode:/export/restore_tux01$ time borg extract -v /export2/backup/tux01/borg-tux01::BORG-TUX01-MARIADB-20210829-04:20-Sun real 17m32.498s user 8m49.877s sys 4m25.934s ``` This all contrasts heavily with restoring 300GB from Amazon S3. Next restore the GN2 home dir ``` root@epysode:/# borg extract export2/backup/tux01/borg-genenetwork::TUX01_BORG_GN2_HOME-20210830-04:00-Mon ``` ## Get backups running on fallback Recently epysode was reinstated after hardware failure. I took the opportunity to reinstall the machine. The backups are described in the repo (genenetwork org members have access) => https://github.com/genenetwork/gn-services/blob/master/services/backups.org BACKUPS As epysode was one of the main sheepdog messaging servers I need to reinstate: * [X] scripts for sheepdog * [ ] Check tunnel on tux01 is reinstated * [ ] enable trim * [ ] reinstate monitoring web services * [ ] reinstate daily backups * [ ] CRON * [ ] make sure messaging works through redis * [ ] fix and propagate GN1 backup * [ ] fix and propagate fileserver and git backups * [ ] add GN1 backup * [ ] other backups * [ ] email on fail Tux01 is backed up now. Need to make sure it propagates to * [ ] rabbit * [ ] Tux02 * [ ] balg01 * [ ] bacchus