-rw-r--r-- | issues/database-not-responding.gmi | 43
-rw-r--r-- | issues/systems/octopus.gmi | 11
-rw-r--r-- | topics/octopus/lizardfs/README.gmi | 7
-rw-r--r-- | topics/systems/shepherd.gmi | 15
4 files changed, 72 insertions, 4 deletions
diff --git a/issues/database-not-responding.gmi b/issues/database-not-responding.gmi
index 8928d08..8cb5656 100644
--- a/issues/database-not-responding.gmi
+++ b/issues/database-not-responding.gmi
@@ -294,6 +294,49 @@ MariaDB [db_webqtl]> SHOW VARIABLES LIKE '%timeout%';
 
 If you set wait_timeout and interactive_timeout values they get reset after a while.
 
+# Recent crash
+
+```
+| Max_used_connections | 1030 |
+| Threads_connected | 1030 |
+
+show open tables where in_use > 1;
+
++-----------+----------------+--------+-------------+
+| Database  | Table          | In_use | Name_locked |
++-----------+----------------+--------+-------------+
+| db_webqtl | ProbeFreeze    |    182 |           0 |
+| db_webqtl | ProbeSet       |      3 |           0 |
+| db_webqtl | ProbeSetXRef   |      3 |           0 |
+| db_webqtl | PublishFreeze  |      7 |           0 |
+| db_webqtl | Species        |     94 |           0 |
+| db_webqtl | Tissue         |     94 |           0 |
+| db_webqtl | ProbeSetFreeze |      3 |           0 |
+| db_webqtl | InbredSet      |    108 |           0 |
+| db_webqtl | GenoFreeze     |      7 |           0 |
++-----------+----------------+--------+-------------+
+```
+
+```
+FLUSH NO_WRITE_TO_BINLOG TABLES
+
+Waiting for table flush SELECT confidentiality, AuthorisedUsers FROM ProbeSetFreeze WHERE Name = 'EL_BXDCDScWAT_0216' 0.000
+
+Waiting for table flush SELECT DISTINCT Tissue.Name FROM ProbeFreeze,ProbeSetFreeze, InbredSet, Tissue, Species WHERE Speci 0.000
+```
+
+Another one to check next time is
+
+```
+SHOW ENGINE INNODB STATUS
+```
+
+and
+
+```
+mariabackup --backup --kill-long-query-type=SELECT --kill-long-queries-timeout=120
+```
+
 # Monitor connections
 
 Use the general log as described in
diff --git a/issues/systems/octopus.gmi b/issues/systems/octopus.gmi
index 1420865..bc421eb 100644
--- a/issues/systems/octopus.gmi
+++ b/issues/systems/octopus.gmi
@@ -26,3 +26,14 @@ default via 172.23.16.1 dev ens1f0np0
 172.23.16.0/21 dev eno1 proto kernel scope link src 172.23.18.68
 172.23.16.0/21 dev eno2 proto kernel scope link src 172.23.17.134
 ```
+
+# Current topology
+
+```
+ip a
+ip route
+```
+
+- Octopus01 uses eno1 172.23.18.188/21 gateway 172.23.16.1 (eno1: Link is up at 1000 Mbps)
+- Octopus02 uses eno1 172.23.17.63/21 gateway 172.23.16.1 (eno1: Link is up at 1000 Mbps)
+ 172.23.x.x
diff --git a/topics/octopus/lizardfs/README.gmi b/topics/octopus/lizardfs/README.gmi
index ef8d1aa..078a628 100644
--- a/topics/octopus/lizardfs/README.gmi
+++ b/topics/octopus/lizardfs/README.gmi
@@ -1,6 +1,6 @@
 # Information about lizardfs, and some usage suggestions
 
-On the octopus cluster the lizardfs head node is on octopus01, with disks being added mainly from the other nodes. SSDs are added to the lizardfs-chunkserver.service systemd service and HDDs added to the lizardfs-chunkserver-hdd.service. The storage pool is available on all nodes at /lizardfs, with the default storage option of "slow", which corresponds to two copies of the data, both on HDDs.
+On the octopus cluster the lizardfs head node is on octopus01, with disks being added mainly from the other nodes. SSDs are added to the lizardfs-chunkserver.service systemd service and HDDs added to the lizardfs-chunkserver-hdd.service. The storage pool is available on all nodes at /lizardfs, with the default storage option of "slow", which corresponds to two copies of the data, both on HDDs.
 
 ## Interacting with lizardfs
 
@@ -71,11 +71,12 @@ Chunks deletion state:
 slow 68 15 2081 27598 201022 20 - - - - -
 fast 12603 720 1880 5377 23902 - - - - - -
 2ssd 7984 - - - - - - - - - -
-
 ```
 
 To query how the individual disks are filling up and if there are any errors:
 
+List all disks
+
 ```
 lizardfs-admin list-disks octopus01 9421 | less
 ```
@@ -103,7 +104,7 @@ It should be noted that any goal using erasure_coding is incredibly slow to writ
 ```
 # CHUNKS_SOFT_DEL_LIMIT = 100
 # CHUNKS_HARD_DEL_LIMIT = 250
-# CHUNKS_WRITE_REP_LIMIT = 20
+# CHUNKS_WRITE_REP_LIMIT = 20
 # CHUNKS_READ_REP_LIMIT = 100
 ```
 
diff --git a/topics/systems/shepherd.gmi b/topics/systems/shepherd.gmi
index d67d9d6..9cf1ed4 100644
--- a/topics/systems/shepherd.gmi
+++ b/topics/systems/shepherd.gmi
@@ -8,6 +8,14 @@
 * status: wip
 * priority: unclear
 
+## Quick overview
+
+Shepherd runs in systemd as a shepherd user
+
+```
+systemctl status user-shepherd.service
+```
+
 ## Description
 
 On Debian based systems we run shepherd as a shepherd user. The service gets started up through systemd.
@@ -17,9 +25,11 @@ repository.
 
 The process for deploying the services:
 
+```
 symlink shepherd-services/shepherd to $HOME/.config/shepherd
 symlink shepherd-services/cron to $HOME/.config/cron
 symlink shepherd-services/*sh to $HOME
+```
 
 When shepherd starts up it should start all the services. So currently
 that's bnw, gitea, ipfs, power, rn6app, singlecell and the mcron
@@ -28,7 +38,10 @@ services, gitea-dump and pubmed.
 To use shepherd's herd command the command is 'sudo -u shepherd
 /home/shepherd/.guix-profile/bin/herd status'.
 
-: /home/shepherd/.guix-profile/bin/herd status
+```
+su shepherd
+/home/shepherd/.guix-profile/bin/herd status
+```
 
 Adding a bash alias, such as "alias herd-herd='sudo -u shepherd
 /home/shepherd/.guix-profile/bin/herd'", will make it easier to
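
The bash alias mentioned in the shepherd notes could also be written as a small shell function. This is only a minimal sketch and not part of the commit: the function names herd_cmdline and herd_herd and the HERD_USER and HERD_BIN variables are assumptions for illustration, while the user name and herd path are the ones quoted in the notes.

```shell
# Hypothetical helper names; the defaults are the user and path quoted in the notes.
HERD_USER="${HERD_USER:-shepherd}"
HERD_BIN="${HERD_BIN:-/home/shepherd/.guix-profile/bin/herd}"

# Print the command line that herd_herd would run, so it can be inspected first.
herd_cmdline() {
    printf 'sudo -u %s %s' "$HERD_USER" "$HERD_BIN"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Run herd as the shepherd user, passing all arguments through unchanged.
herd_herd() {
    sudo -u "$HERD_USER" "$HERD_BIN" "$@"
}
```

With these definitions sourced into the shell, `herd_cmdline status` prints the full sudo command and `herd_herd status` runs it. Unlike an alias, a function also works in non-interactive scripts.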