summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--topics/systems/mariadb/ProbeSetData.gmi66
-rw-r--r--topics/systems/octopus.gmi15
-rw-r--r--topics/systems/reboot-tux01-tux02.gmi40
3 files changed, 110 insertions, 11 deletions
diff --git a/topics/systems/mariadb/ProbeSetData.gmi b/topics/systems/mariadb/ProbeSetData.gmi
index 9fb8306..5feb384 100644
--- a/topics/systems/mariadb/ProbeSetData.gmi
+++ b/topics/systems/mariadb/ProbeSetData.gmi
@@ -1,6 +1,11 @@
# ProbeSetData
-This is by far the largest table (~100Gb). I need to add disk space to be able to host it on the NVME.
+This is by far the largest table (~100Gb). I need to add disk space to be able to host it on the NVME and move stuff around. Final move is GN2 code and we have over 400Gb free.
+
+I disabled these and they need to be restored:
+
+* [ ] binary log
+* [ ] backup script
Prototocol from
@@ -8,25 +13,54 @@ Prototocol from
Every table update has to follow the template:
+## Reboot Tux01 and Tux02
+
+=> ../reboot-tux01-tux02.gmi
+
+
+## Table upgrade
+
### Check recent backups
* [X] ascertain there is a backup
-* [ ] copy original files
+* [X] copy original files
### Make temporary table
If you have enough space, you can create a copy of the actual table and do the work on that:
+Original:
+
+```
+| ProbeSetData | CREATE TABLE 'ProbeSetData' (
+ 'Id' int(10) unsigned NOT NULL DEFAULT 0,
+ 'StrainId' int(20) NOT NULL,
+ 'value' float NOT NULL,
+ UNIQUE KEY 'DataId' ('Id','StrainId'),
+ KEY 'strainid' ('StrainId')
+) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+```
+
+New:
+
```
-CREATE TABLE mytest AS SELECT * FROM ProbeSetData;
+DROP table mytest;
+CREATE table mytest(
+ Id INT unsigned NOT NULL DEFAULT 0,
+ StrainId MediumINT unsigned NOT NULL,
+ value float NOT NULL,
+ primary key(Id,StrainId),
+ key (StrainId)) CHARACTER SET utf8mb4;
```
-Then you can change the column as desired:
+And fill it killing quite a few birds with one stone :). You may want to disable the binary logger before running:
```
-ALTER TABLE tbl_name MODIFY COLUMN col_name BIGINT AUTO_INCREMENT;
+INSERT INTO mytest (Id,StrainId,value) SELECT Id,StrainId,value FROM ProbeSetData;
```
+(it may be faster to use mysqldump for this!).
+
Once the process is done, you can rename the tables:
```
@@ -40,7 +74,7 @@ RENAME TABLE tbl_name TO new_tbl_name, tbl_name2 TO ProbeSetData;
SHOW CREATE TABLE ProbeSetData;
```
-A really simple table
+This is a really simple table
```
| ProbeSetData | CREATE TABLE 'ProbeSetData' (
@@ -76,6 +110,18 @@ MariaDB [db_webqtl]> select count(*) from ProbeSetData;
```
+This parses the full table (MYISAM version)
+
+```
+MariaDB [db_webqtl]> select count(distinct id) from ProbeSetData;
++--------------------+
+| count(distinct id) |
++--------------------+
+| 48119047 |
++--------------------+
+1 row in set (4 min 40.035 sec)
+```
+
Do this also on Penguin2.
```
@@ -104,9 +150,9 @@ MariaDB [db_webqtl]> select max(Id) from ProbeSetData;
+----------+
```
-For StrainID MediumInt is a better fit with unsigned range is 0 to 16777215 and Id should be a INT (instead of BIGINT for the time being). That will make the table+indices smaller and faster.
+For StrainID MediumInt is a better fit with unsigned range is 0 to 16777215 and Id should be a INT (instead of BIGINT for the time being). That will make the table+indices smaller and faster. Note that MediumInt does not save RAM, just disk space. But that can impact IO performance by 25%.
-When a value overflows MariaDB will throw an error. So even if we grow out of values we can quickly resize the table in, say, 10 years. That is, if we are still using SQL.
+When a value overflows MariaDB will throw an error. Therefore it is safe to pick the smallest possible size. So even if we grow out of values we can quickly resize the table in, say, 10 years. That is, if we are still using SQL.
### Check GN1,2,3 code for use of table
@@ -131,9 +177,11 @@ Page of GN2 is the mapping page, e.g.
```
ALTER TABLE ProbeSetData
- ADD PRIMARY KEY(col1,col2);
+ ADD PRIMARY KEY(Id,StrainId);
```
+(no quotes for column names)
+
### Create indices
```
diff --git a/topics/systems/octopus.gmi b/topics/systems/octopus.gmi
index e97d0e1..0929a65 100644
--- a/topics/systems/octopus.gmi
+++ b/topics/systems/octopus.gmi
@@ -1,13 +1,24 @@
# Octopus sysmaintenance
-## Tags
+# Tags
* assigned: pjotrp, efraimf, erikg
-## Tasks
+# Tasks
* [X] install sheepdog
* [X] run borg backup
* [X] propagate backup to rabbit
* [X] fix redis updates - use rev tunnel
* [X] check other dirs
+
+# Info
+
+Intermediate routing on Octopus08
+
+```
+default via 172.23.16.1 dev ens1f0np0
+172.23.16.0/21 dev ens1f0np0 proto kernel scope link src 172.23.17.24
+172.23.16.0/21 dev eno1 proto kernel scope link src 172.23.18.68
+172.23.16.0/21 dev eno2 proto kernel scope link src 172.23.17.134
+```
diff --git a/topics/systems/reboot-tux01-tux02.gmi b/topics/systems/reboot-tux01-tux02.gmi
new file mode 100644
index 0000000..e3dbc91
--- /dev/null
+++ b/topics/systems/reboot-tux01-tux02.gmi
@@ -0,0 +1,40 @@
+# Rebooting the GN production machine(s)
+
+I needed to add the hard disks in the BIOS to make them visible - one of the annoying aspects of these Dell machines. First on Tux02 I cheched the borg backups to see if we have a recent copy of MariaDB, GN2 etc. The DB is from 2 days ago and the genotypes of GN2 are a week old (because of a permission problem). I'll add a copy by hand of both - an opportunity to test the new 10Gbs router.
+
+# Tasks
+
+Before rebooting
+
+On Tux02:
+
+* [X] Check backups of DB and services
+* [X] Copy trees between machines
+
+On both:
+
+* [X] Check network interface definitions (what happens on reboot)
+* [X] Check IPMI access - should get serial login
+
+
+# Info
+
+## Routing
+
+On tux02 eno2d1 is the 10Gbs network interface. Unfortunately I can't get it to connect at 10Gbs with Tux01 because the latter is using that port for the outside world.
+
+Playing with 10Gbs on Tux01 sent the hardware in a tail spin, what to think of this solution on
+
+```
+bnxt_en 0000:01:00.1 (unnamed net_device) (uninitialized): Error (timeout: 500015) msg {0x0 0x0} len:0
+
+Solution was to power down the server(s) and *remove* power cords for 5 minutes.
+```
+
+=> https://www.dell.com/community/PowerEdge-Hardware-General/Critical-network-bnxt-en-module-crashes-on-14G-servers/td-p/6031769
+
+The Linux kernel shows some fixes that are not on Tux01 yet
+
+=> https://lkml.org/lkml/2021/2/17/970
+
+In our case a simple reboot worked, fortunately.