author | Pjotr Prins | 2022-09-02 07:22:08 -0500 |
---|---|---|
committer | Pjotr Prins | 2022-09-02 07:22:08 -0500 |
commit | 795ba2ffb5ed5150004785768b8b8c479b24b197 (patch) | |
tree | d6b3ab39bc3f9c44b713aee44892bade2f57bf9b | |
parent | c0e505604a56a86d9c9e78f0823de95e0bcfb40b (diff) | |
download | gn-gemtext-795ba2ffb5ed5150004785768b8b8c479b24b197.tar.gz |
Collapsed P2 resolving
-rw-r--r-- | topics/systems/migrate-p2.gmi | 12 |
-rw-r--r-- | topics/systems/orchestration.gmi | 31 |
2 files changed, 43 insertions, 0 deletions
diff --git a/topics/systems/migrate-p2.gmi b/topics/systems/migrate-p2.gmi
new file mode 100644
index 0000000..c7fcb90
--- /dev/null
+++ b/topics/systems/migrate-p2.gmi
@@ -0,0 +1,12 @@
+* Penguin2 crash
+
+This week the boot partition of P2 crashed. We have a few lessons here, not least having a fallback for all services ;)
+
+* Tasks
+
+- [ ] setup space.uthsc.edu for GN2 development
+- [ ] update DNS to tux02 128.169.4.52 and space 128.169.5.175
+- [ ] move CI/CD to tux02
+
+
+* Notes
diff --git a/topics/systems/orchestration.gmi b/topics/systems/orchestration.gmi
new file mode 100644
index 0000000..336dbbd
--- /dev/null
+++ b/topics/systems/orchestration.gmi
@@ -0,0 +1,31 @@
+* Orchestration and fallbacks
+
+After the Penguin2 crash in Aug. 2022 it has become increasingly clear how hard it is to deploy GeneNetwork. GNU Guix helps a great deal with dependencies, but it does not handle orchestration between machines/services well. Also we need to look at the future.
+
+What is GN today in terms of services
+
+ 1. Main GN2 server (Python, 20+ processes, 3+ instances: depends on all below)
+ 2. Matching GN3 server and REST endpoint (Python: less dependencies)
+ 3. Mariadb
+ 4. redis
+ 5. virtuoso
+ 6. GN-proxy (Racket, authentication handler: redis, mariadb)
+ 7. Alias proxy (Racket, gene aliases wikidata)
+ 8. Jupyter R and Julia notebooks
+ 9. BNW server (Octave)
+10. UCSC browser
+11. GN1 instances (older python, 12 instances in principle, 2 running today)
+12. Access to HPC for GEMMA (coming)
+13. Backup services
+14. monitoring services
+
+I am still missing a few! All run by a man and his diligent dog.
+
+For the future the orchestration needs to be more robust and resilient. This means:
+
+ 1. A fallback for every service on a separate machine
+ 2. Improved privacy protection for (future) human data
+ 3. Separate servers serving different data sources
+ 4. Partial synchronization between data sources
+
+The only way we *can* scale is by adding machines. But the system is not yet ready for that. Also getting rid of monolithic primary databases in favor of files helps synchronization.
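
Review note: the requirement "a fallback for every service on a separate machine" and the dependency hints in the service list (GN-proxy needs redis and mariadb, GN2 depends on everything below it) could be tracked in a small machine-readable inventory. The following is a minimal sketch, not part of this commit. The host names `host-a` and `host-b`, the fallback assignments, and the gn3 dependency on mariadb are all hypothetical; only a subset of the listed services is modelled.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # Python 3.9+


@dataclass(frozen=True)
class Service:
    name: str
    primary: str              # host running the live instance
    fallbacks: tuple = ()     # hosts holding a standby instance
    depends_on: tuple = ()    # names of services this one needs


def missing_fallbacks(services):
    """Names of services with no fallback on a machine other than the primary."""
    return sorted(
        s.name for s in services
        if not any(host != s.primary for host in s.fallbacks)
    )


def start_order(services):
    """Service names ordered so that every dependency starts first."""
    graph = {s.name: set(s.depends_on) for s in services}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical inventory: hosts, fallbacks and the gn3 dependency are made up.
SERVICES = [
    Service("mariadb", "host-a", fallbacks=("host-b",)),
    Service("redis", "host-a", fallbacks=("host-a",)),  # same machine, does not count
    Service("virtuoso", "host-a"),
    Service("gn-proxy", "host-a", fallbacks=("host-b",),
            depends_on=("redis", "mariadb")),
    Service("gn3", "host-a", fallbacks=("host-b",), depends_on=("mariadb",)),
    Service("gn2", "host-a", fallbacks=("host-b",),
            depends_on=("gn3", "mariadb", "redis", "virtuoso", "gn-proxy")),
]

if __name__ == "__main__":
    print("no separate-machine fallback:", missing_fallbacks(SERVICES))
    print("start order:", start_order(SERVICES))
```

A check like `missing_fallbacks` could run in CI so that a newly added service without a standby on another machine is flagged early; `start_order` gives one valid bring-up sequence after a crash.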