author Pjotr Prins 2023-10-31 12:26:20 +0100
committer Pjotr Prins 2023-10-31 12:26:20 +0100
commit fc21bbdf482ce2d8a4ab4d0ea13317925bbe0abe
tree 31d9a7290efa66f36030ca0745b210b8a9ae012f
parent 2d7ae1406091f4b756894e950b6d44479562283d
Precompute - added note
-rw-r--r-- topics/systems/mariadb/precompute-mapping-input-data.gmi | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/topics/systems/mariadb/precompute-mapping-input-data.gmi b/topics/systems/mariadb/precompute-mapping-input-data.gmi
index 0c068e9..080d737 100644
--- a/topics/systems/mariadb/precompute-mapping-input-data.gmi
+++ b/topics/systems/mariadb/precompute-mapping-input-data.gmi
@@ -324,7 +324,7 @@ On the computing host (or client) we should track the following:
* Hostname of run (this)
* File path (this)
* Hash on output data (for validation)
-* DB hostnames: Successfully updated DB table for these servers
+* array of rec: DB hostnames + time stamps: Successfully updated DB table for these servers
The logic is that if the DB table was changed we should recompute the hash on inputs.
Note the ProbeSetData table is the largest at 200G including indices.
@@ -341,6 +341,7 @@ We want to track compute so we can distribute running the algorithms across serv
This implies the compute machines have to be able to query the DB in some way.
Basically a machine has a 'runner' that checks the DB for updates and fetches phenotypes and genotypes.
A run is started and on completion the DB is notified and updated.
+Note that a runner can not be parallel on a single results directory, so one runner per target output directory.
We can have different runners, one for local machine, one for PBS and one for remotes.
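
The two additions in this patch (a list of DB hostname and time stamp records, and one runner per target output directory) could be sketched in Python roughly as follows. This is only an illustration under assumptions: the names `DbUpdate`, `RunRecord`, `acquire_runner_lock` and the lock file name `runner.lock` are hypothetical and do not come from the document, and the exclusive-create lock is one possible way to enforce the rule, not necessarily what the runner does.

```python
from __future__ import annotations

import hashlib
import os
import socket
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class DbUpdate:
    """One successful DB table update: which server, and when (hypothetical)."""
    hostname: str
    timestamp: float


@dataclass
class RunRecord:
    """Hypothetical tracking record kept on the computing host."""
    hostname: str          # hostname of the run
    file_path: str         # path of the output
    output_hash: str       # hash on output data, for validation
    db_updates: list[DbUpdate] = field(default_factory=list)


def hash_output(path: Path) -> str:
    """Return the SHA-256 hex digest of an output file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def acquire_runner_lock(results_dir: Path) -> Path | None:
    """Try to claim results_dir for this runner.

    Returns the lock file path on success, or None if a lock file
    already exists (another runner owns the directory).
    """
    lock = results_dir / "runner.lock"  # assumed file name
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only
        # one process can create the lock file.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as f:
        f.write(f"{socket.gethostname()} {os.getpid()}\n")
    return lock


def release_runner_lock(lock: Path) -> None:
    """Remove the lock file so another runner can claim the directory."""
    lock.unlink(missing_ok=True)
```

A runner would call `acquire_runner_lock` before writing any results and skip the directory when it returns `None`. Exclusive file creation may not be reliable on every network file system, so a shared results directory might need a different mechanism.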