Diffstat (limited to 'tasks/machine-room.gmi')
-rw-r--r-- tasks/machine-room.gmi 71
1 files changed, 44 insertions, 27 deletions
diff --git a/tasks/machine-room.gmi b/tasks/machine-room.gmi
index f6c7737..77f7b8e 100644
--- a/tasks/machine-room.gmi
+++ b/tasks/machine-room.gmi
@@ -1,36 +1,45 @@
# Machine room tasks
-## Tags
+# Tags
* assigned: pjotrp
* priority: medium
* type: system administration
* keywords: system administration, octopus, gateway, tux02, tux01, tux03
-## Tasks
-
-### UTHSC
-
-* [ ] describe machines with Rick Stripes
-* [ ] get bacchus back on line
-* [ ] fix www.genenetwork.org and gn2.genenetwork.org https
+# Tasks
+
+## GN
+
+* [ ] penguin2 has 90TB of space we can use on NFS/backups
+* [ ] Script to replace reaper with GEMMA
+* [ ] Transfer nervenet.org to dnsimple
+* [+] Trait vectors for Johannes
+* [X] grub on tux04
+* [ ] nft on tux04
+* [ ] !!Organize pluto, update Julia and add apps to GN menu Jupyter notebooks
+* [+] !!Xusheng jumpshiny services
+* [ ] Fix apps and create system containers for herd services - see issues/systems/apps
+* [ ] Slurm+ravanan on production for GEMMA speedup
+* [ ] Embed R/qtl2 (Alex)
+* [ ] Hoot in GN2 (Andrew)
* [ ] tux02 certbot failing (manual now)
-* [ ] get data from summer211.uthsc.edu (access machine room)
-* [ ] VPN access and FoUT
-* [ ] penguin2 has 32TB of space we can use on NFS/backups
-
-Network:
-* [ ] Octopus: wire up machines so they talk with each other over fiber
+## Octopus:
-Lambda:
-
-* [ ] remote access? (with Erik)
- * [X] get BMC password
+* [X] Fix Tux05 badblocks on /dev/sdb2 1050624 47925247 46874624 22.4G Linux filesystem
+ - see add-boot-partition
+* [+] Copy linux partition on tux04, tux05, tux02 and test reboot
+* [ ] !!Ceph on Tuxes
+* [ ] Centralized user management system
+* [ ] Monitor nodes
+* [ ] Check machines so they talk with each other over fiber
-Backups & storage:
+## Backups & storage:
-* [_] data warehousing
+* [ ] Create and check backups of tux04 etc etc.
+* [ ] set up zero to backup tux02 and report to redis
+* [ ] reintroduce borg-borg on zero
* [+] run sheepdog as root: redis password error; introduce SHEEPDOG_CONF
* [ ] tux01 has unused 4TB spinning disk
* [ ] tux02 has unused 2x4TB spinning disks and 2TB nvme /dev/nvme0n1 on adapter
@@ -39,22 +48,23 @@ Backups & storage:
fwupdmgr get-devices
fwupdmgr update
The previously problematic Samsung 980 Pro was basically using the 3B2QGXA7, and now Samsung has introduced a new 5B2QGXA7 firmware to fix the problem. The problem mainly affects the 2TB version of the 980 Pro
-* [ ] Check backups of etc etc.
Security:
* [ ] Limit idrac access
-* [X] space server out-of-band access
-### Spice
+## Spice
-* [ ] Run GN off balg01
+* [ ] Add 2nd boot partition on balg01
* [ ] Add firewall test to sheepdog
-* [ ] Convert balg02 to Guix server
-* [ ] VM for student team
-### Done
+## Done
+* [X] describe machines with Rick Stripes
+* [X] get bacchus back on line
+* [X] fix www.genenetwork.org and gn2.genenetwork.org https
+* [-] get data from summer211.uthsc.edu (access machine room)
+* [X] VPN access and FoUT
* [X] lambda: get fiber working
* [X] lambda: add to Octopus HPC
* [X] lambda: racked up and runs
@@ -82,3 +92,10 @@ Security:
* [X] tux07 has no fiber
* [X] tux08 has no fiber
* [X] tux09 has no fiber
+### Lambda
+* [X] remote access? (with Erik)
+ * [X] get BMC password
+* [X] space server out-of-band access
+### Spice
+* [X] Run GN off balg01
+* [X] Convert balg02 to Guix server