diff options
Diffstat (limited to 'topics/systems/mariadb/ProbeSetData.gmi')
-rw-r--r-- | topics/systems/mariadb/ProbeSetData.gmi | 66 |
1 files changed, 57 insertions, 9 deletions
diff --git a/topics/systems/mariadb/ProbeSetData.gmi b/topics/systems/mariadb/ProbeSetData.gmi index 9fb8306..5feb384 100644 --- a/topics/systems/mariadb/ProbeSetData.gmi +++ b/topics/systems/mariadb/ProbeSetData.gmi @@ -1,6 +1,11 @@ # ProbeSetData -This is by far the largest table (~100Gb). I need to add disk space to be able to host it on the NVME. +This is by far the largest table (~100Gb). I need to add disk space to be able to host it on the NVME and move stuff around. Final move is GN2 code and we have over 400Gb free. + +I disabled these and they need to be restored: + +* [ ] binary log +* [ ] backup script Prototocol from @@ -8,25 +13,54 @@ Prototocol from Every table update has to follow the template: +## Reboot Tux01 and Tux02 + +=> ../reboot-tux01-tux02.gmi + + +## Table upgrade + ### Check recent backups * [X] ascertain there is a backup -* [ ] copy original files +* [X] copy original files ### Make temporary table If you have enough space, you can create a copy of the actual table and do the work on that: +Original: + +``` +| ProbeSetData | CREATE TABLE 'ProbeSetData' ( + 'Id' int(10) unsigned NOT NULL DEFAULT 0, + 'StrainId' int(20) NOT NULL, + 'value' float NOT NULL, + UNIQUE KEY 'DataId' ('Id','StrainId'), + KEY 'strainid' ('StrainId') +) ENGINE=MyISAM DEFAULT CHARSET=latin1 | +``` + +New: + ``` -CREATE TABLE mytest AS SELECT * FROM ProbeSetData; +DROP table mytest; +CREATE table mytest( + Id INT unsigned NOT NULL DEFAULT 0, + StrainId MediumINT unsigned NOT NULL, + value float NOT NULL, + primary key(Id,StrainId), + key (StrainId)) CHARACTER SET utf8mb4; ``` -Then you can change the column as desired: +And fill it killing quite a few birds with one stone :). You may want to disable the binary logger before running: ``` -ALTER TABLE tbl_name MODIFY COLUMN col_name BIGINT AUTO_INCREMENT; +INSERT INTO mytest (Id,StrainId,value) SELECT Id,StrainId,value FROM ProbeSetData; ``` +(it may be faster to use mysqldump for this!). + Once the process is done, you can rename the tables: ``` @@ -40,7 +74,7 @@ RENAME TABLE tbl_name TO new_tbl_name, tbl_name2 TO ProbeSetData; SHOW CREATE TABLE ProbeSetData; ``` -A really simple table +This is a really simple table ``` | ProbeSetData | CREATE TABLE 'ProbeSetData' ( @@ -76,6 +110,18 @@ MariaDB [db_webqtl]> select count(*) from ProbeSetData; ``` +This parses the full table (MYISAM version) + +``` +MariaDB [db_webqtl]> select count(distinct id) from ProbeSetData; ++--------------------+ +| count(distinct id) | ++--------------------+ +| 48119047 | ++--------------------+ +1 row in set (4 min 40.035 sec) +``` + Do this also on Penguin2. ``` @@ -104,9 +150,9 @@ MariaDB [db_webqtl]> select max(Id) from ProbeSetData; +----------+ ``` -For StrainID MediumInt is a better fit with unsigned range is 0 to 16777215 and Id should be a INT (instead of BIGINT for the time being). That will make the table+indices smaller and faster. +For StrainID MediumInt is a better fit with unsigned range is 0 to 16777215 and Id should be a INT (instead of BIGINT for the time being). That will make the table+indices smaller and faster. Note that MediumInt does not save RAM, just disk space. But that can impact IO performance by 25%. -When a value overflows MariaDB will throw an error. So even if we grow out of values we can quickly resize the table in, say, 10 years. That is, if we are still using SQL. +When a value overflows MariaDB will throw an error. Therefore it is safe to pick the smallest possible size. So even if we grow out of values we can quickly resize the table in, say, 10 years. That is, if we are still using SQL. ### Check GN1,2,3 code for use of table @@ -131,9 +177,11 @@ Page of GN2 is the mapping page, e.g. ``` ALTER TABLE ProbeSetData - ADD PRIMARY KEY(col1,col2); + ADD PRIMARY KEY(Id,StrainId); ``` +(no quotes for column names) + ### Create indices ``` |