author    Pjotr Prins  2025-03-06 07:36:50 +0100
committer Pjotr Prins  2025-03-06 07:36:50 +0100
commit    dad3bcdc2e1d6bf2f67f36c6bb87e342b3cdfcbb (patch)
tree      48edc1548f6b97e23cccf5609a95906c0b12c6c2
parent    9577323e37451faf1a400f35713ef3e1fc396164 (diff)
download  gn-gemtext-dad3bcdc2e1d6bf2f67f36c6bb87e342b3cdfcbb.tar.gz
Fixing the database from backup
-rw-r--r-- issues/systems/tux04-disk-issues.gmi | 44
-rw-r--r-- topics/systems/backups-with-borg.gmi | 16
-rw-r--r-- topics/systems/mariadb/mariadb.gmi   |  5
3 files changed, 64 insertions(+), 1 deletion(-)
diff --git a/issues/systems/tux04-disk-issues.gmi b/issues/systems/tux04-disk-issues.gmi
index 130c2b9..8df8863 100644
--- a/issues/systems/tux04-disk-issues.gmi
+++ b/issues/systems/tux04-disk-issues.gmi
@@ -202,6 +202,50 @@ See also
Tux04 will require open heart 'disk controller' surgery and some severe testing before we move back. We'll also look at tux05-8 to see if they have similar problems.
+## Recovery
+
+According to the logs, tux04 started showing serious disk errors on March 2nd, the day I introduced sanitizing the mariadb backup:
+
+```
+Mar 02 05:00:42 tux04 kernel: I/O error, dev sde, sector 2071078320 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 2
+Mar 02 05:00:58 tux04 kernel: I/O error, dev sde, sector 2083650928 op 0x0:(READ) flags 0x80700 phys_seg 59 prio class 2
+...
+```
+
+The log started on Feb 23, when we had our last reboot; it is probably a good idea to turn on persistent logging! Anyway, it is likely the files were fine until March 2nd. The mariadb logs show the same picture:
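
A minimal sketch of turning on persistent journald logging, so kernel errors like the I/O errors above survive a reboot. On a real host this goes in /etc/systemd/journald.conf; it is demonstrated here on a temporary file:

```shell
# Sketch: persistent journal storage (real path: /etc/systemd/journald.conf).
conf=$(mktemp)
cat > "$conf" <<'EOF'
[Journal]
Storage=persistent
EOF
# after editing the real file, restart journald (root required):
#   systemctl restart systemd-journald
cat "$conf"
```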
+
+```
+2025-03-02 6:53:52 489007 [ERROR] mariadbd: Index for table './db_webqtl/ProbeSetData.MYI' is corrupt; try to repair it
+2025-03-02 6:53:52 489007 [ERROR] db_webqtl.ProbeSetData: Can't read key from filepos: 2269659136
+```
+
+So, if we can restore a backup from March 1st we should be reasonably confident it is sane.
+
+The first step is to back up the existing database(!) Next, restore the sane copy by changing the DB location: update the symlink in /var/lib/mysql and check /etc/mysql/mariadb.cnf.
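
The symlink swap can be sketched as follows, on scratch paths so nothing touches a live system; on the real host the link lives at /var/lib/mysql and the restored copy sits wherever borg extracted it (both paths here are stand-ins):

```shell
# Sketch of the datadir swap on scratch paths (assumption: the restored
# March 1st copy lives on a healthy disk outside /var/lib/mysql).
restored=$(mktemp -d)          # stands in for the restored March 1st copy
corrupt=$(mktemp -d)           # stands in for the old, corrupt datadir
link=$(mktemp -d)/mysql        # stands in for /var/lib/mysql
mv "$corrupt" "$corrupt.bak"   # keep the corrupt data around(!)
ln -s "$restored" "$link"      # point the datadir at the restored copy
readlink "$link"               # verify where the link now points
```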
+
+When upgrading, it can help to switch on these settings in mariadb.cnf:
+
+```
+# forcing recovery with these two lines:
+innodb_force_recovery=3
+innodb_purge_threads=0
+```
+
+Make sure to disable these settings again (and restart) once the server is up and running!
+
+So the steps are:
+
+* [X] install updated guix version of mariadb in /usr/local/guix-profiles (don't use Debian!!)
+* [X] repair borg backup
+* [X] Stop old mariadb (on new host tux02)
+* [X] backup old mariadb database
+* [X] restore 'sane' version of DB from borg March 1st
+* [X] point to new DB in /var/lib/mysql and cnf file
+* [X] update systemd settings
+* [X] start mariadb new version with recovery setting in cnf
+* [X] check logs
+* [X] once running, revert the recovery settings in the cnf and restart
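
The steps above can be sketched as one script. The repo path, archive name and service name are assumptions, and the privileged commands are left commented so nothing runs by accident:

```shell
# Hedged sketch of the recovery run; adjust names before use on a host.
REPO=/mnt/borg/tux04             # assumed borg repo location
ARCHIVE=tux04-sql-2025-03-01     # assumed name of the March 1st archive
# systemctl stop mariadb                         # stop old mariadb
# cp -a /var/lib/mysql /var/lib/mysql.corrupt    # backup old DB first
# borg extract "$REPO::$ARCHIVE" var/lib/mysql   # restore the sane copy
# ...point /var/lib/mysql and mariadb.cnf at the new datadir,
# ...add innodb_force_recovery=3, then:
# systemctl start mariadb
# journalctl -u mariadb --since today            # check the logs
echo "restore plan: $REPO::$ARCHIVE"
```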
+
## Other servers
```
diff --git a/topics/systems/backups-with-borg.gmi b/topics/systems/backups-with-borg.gmi
index 5cdb2a3..d5bfd1b 100644
--- a/topics/systems/backups-with-borg.gmi
+++ b/topics/systems/backups-with-borg.gmi
@@ -200,3 +200,19 @@ Our production server runs databases and file stores that need to be backed up t
Once backups work it is useful to copy them to a remote server, so when the machine stops functioning we have another chance at recovery. See
=> ./backup-drops.gmi
+
+# Recovery
+
+With tux04 we ran into a problem where all disks were getting corrupted(!), probably due to the RAID controller, though we still need to figure that one out.
+
+Anyway, we have to assume the DB is corrupt: the files are corrupt AND the backups are corrupt. Borg backup has checksums, which you can verify with
+
+```
+borg check repo
+```
+
+Borg also has a --repair switch, which we needed in order to remove some faults in the backup itself:
+
+```
+borg check --repair repo
+```
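
Note that --repair fixes repository metadata but does not necessarily re-read every data chunk; to verify the stored chunks against their checksums as well, borg offers a much slower --verify-data mode ("repo" is a placeholder path, as above):

```shell
# Sketch: full data verification of a borg repo ("repo" is a placeholder).
cmd="borg check --verify-data repo"
# on the real backup host you would now run: $cmd
echo "$cmd"
```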
diff --git a/topics/systems/mariadb/mariadb.gmi b/topics/systems/mariadb/mariadb.gmi
index ae0ab19..c575bf4 100644
--- a/topics/systems/mariadb/mariadb.gmi
+++ b/topics/systems/mariadb/mariadb.gmi
@@ -60,4 +60,7 @@ Stop the running mariadb-guix.service. Restore the latest backup archive and ove
=> https://www.borgbackup.org/ Borg
=> https://borgbackup.readthedocs.io/en/stable/ Borg documentation
-#
+# Upgrade mariadb
+
+It is wise to upgrade mariadb once in a while; in a disaster recovery it is also better to move forward in versions.
+Before upgrading, make sure there is a decent backup of the current setup.
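
After switching to the newer binaries (e.g. the guix profile version) and starting the server, the system tables need upgrading too; mariadb-upgrade is the standard tool for that. A hedged sketch, with the real call commented because it needs a running server and credentials:

```shell
# Sketch: post-upgrade step for mariadb system tables.
# mariadb-upgrade --user=root --password    # run once per version jump
# first check which binary is actually on PATH:
out=$(command -v mariadb-upgrade || echo "mariadb-upgrade not on PATH")
echo "$out"
```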