Precompute - added note

author: Pjotr Prins 2023-10-31 12:26:20 +0100
committer: Pjotr Prins 2023-10-31 12:26:20 +0100
commit: fc21bbdf482ce2d8a4ab4d0ea13317925bbe0abe (patch)
tree: 31d9a7290efa66f36030ca0745b210b8a9ae012f
parent: 2d7ae1406091f4b756894e950b6d44479562283d (diff)
download: gn-gemtext-fc21bbdf482ce2d8a4ab4d0ea13317925bbe0abe.tar.gz
1 files changed, 2 insertions, 1 deletions
diff --git a/topics/systems/mariadb/precompute-mapping-input-data.gmi b/topics/systems/mariadb/precompute-mapping-input-data.gmi
index 0c068e9..080d737 100644
--- a/topics/systems/mariadb/precompute-mapping-input-data.gmi
+++ b/topics/systems/mariadb/precompute-mapping-input-data.gmi
@@ -324,7 +324,7 @@ On the computing host (or client) we should track the following:
 * Hostname of run (this)
 * File path (this)
 * Hash on output data (for validation)
-* DB hostnames: Successfully updated DB table for these servers
+* array of rec: DB hostnames + time stamps: Successfully updated DB table for these servers
 
 The logic is that if the DB table was changed we should recompute the hash on inputs.
 Note the ProbeSetData table is the largest at 200G including indices.
@@ -341,6 +341,7 @@ We want to track compute so we can distribute running the algorithms across serv
 This implies the compute machines have to be able to query the DB in some way.
 Basically a machine has a 'runner' that checks the DB for updates and fetches phenotypes and genotypes.
 A run is started and on completion the DB is notified and updated.
+Note that a runner can not be parallel on a single results directory, so one runner per target output directory.
 
 We can have different runners, one for local machine, one for PBS and one for remotes.
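The two notes added in this commit can be sketched in Python. This is only an illustrative sketch, not code from the repository: the names `DbUpdate`, `RunRecord`, `ResultsDirLock` and the lock file name `.runner.lock` are all hypothetical. It assumes the "array of rec" is a list of (DB hostname, time stamp) records, and that "one runner per target output directory" can be enforced with an exclusive lock file inside that directory.

```python
import os
import tempfile
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DbUpdate:
    """One DB server that was successfully updated (hypothetical layout)."""
    hostname: str
    updated_at: str  # ISO 8601 time stamp in UTC


@dataclass
class RunRecord:
    """What the computing host tracks for a run (field names are assumptions)."""
    run_hostname: str
    file_path: str
    output_hash: str
    db_updates: list = field(default_factory=list)  # array of DbUpdate records

    def mark_db_updated(self, hostname, when=None):
        # Record which DB server was updated and when.
        when = when or datetime.now(timezone.utc)
        self.db_updates.append(DbUpdate(hostname, when.isoformat()))


class ResultsDirLock:
    """Allow a single runner per results directory using a lock file."""

    def __init__(self, results_dir):
        # The lock file name is an assumption made for this sketch.
        self.lock_path = os.path.join(results_dir, ".runner.lock")
        self.fd = None

    def __enter__(self):
        # O_EXCL makes creation fail with FileExistsError when the
        # lock file already exists, i.e. another runner owns the directory.
        self.fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.write(self.fd, str(os.getpid()).encode())
        return self

    def __exit__(self, exc_type, exc, tb):
        # Release the directory for the next runner.
        os.close(self.fd)
        os.remove(self.lock_path)
        return False
```

A runner would wrap its work in `with ResultsDirLock(results_dir):` and call `mark_db_updated()` once per DB server after a successful table update. A lock file left behind by a crashed runner has to be cleaned up by hand in this sketch; a real implementation would need a policy for stale locks.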