author: Pjotr Prins, 2021-11-14 16:15:19 -0600
committer: Pjotr Prins, 2021-11-14 16:15:26 -0600
commit: 3787440fb80f3d0e0f6834aaf9fda07d6dd9b2e1 (patch)
tree: cd135adbcf6d2696d08d2d9e9b43e985dce8f5e1 /issues/gemma
parent: b1b742dd53786817c2ba87e6716e0b21c257de6d (diff)
gemma-wrapper transactions
Diffstat (limited to 'issues/gemma')
-rw-r--r-- issues/gemma/gemma-wrapper-has-incomplete-files.gmi 22
1 files changed, 22 insertions, 0 deletions
diff --git a/issues/gemma/gemma-wrapper-has-incomplete-files.gmi b/issues/gemma/gemma-wrapper-has-incomplete-files.gmi
new file mode 100644
index 0000000..4bea71d
--- /dev/null
+++ b/issues/gemma/gemma-wrapper-has-incomplete-files.gmi
@@ -0,0 +1,22 @@
+# gemma-wrapper has incomplete files
+
+gemma-wrapper caches its output files, but a cached file can be incomplete and then never be updated again. The problem appears when GNU parallel is invoked and one of its jobs hits an error. The task here is to make gemma-wrapper transactional.
+
+## Tags
+
+* assigned: pjotrp, zachs
+
+## Tasks
+
+* [ ] parse parallel job log for failed tasks and remove the output files.
+* [ ] create a (global) lock file for gemma-wrapper
+
+## Info
+
+GNU parallel can fail, but it does not tell us how the individual processes fared. We need to check whether it can report which job (number) failed. If not, we have the option of checking the GEMMA status file and/or checking whether the output file is complete (by counting the number of lines).
+
+It turns out GNU parallel can keep track of jobs in a job log (--joblog) and can even rerun the ones that are missing. We don't need the rerun because we are using a cache, but we can use the log file to find the failed jobs and remove their incomplete output files!
+
+There is another parallel issue (pun intended) where gemma-wrapper is invoked twice for the same job. This is quite possible when people get impatient waiting for a first job to finish.
+
+One solution is to write a lock file named after a hash of the inputs. The lock file can contain the PID of the process that owns it, and we can check whether that process is still alive. I should do the same for sheepdog locks(!)
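
A possible sketch of the two tasks in the issue above, written in Python purely for illustration and independent of gemma-wrapper's actual code. It assumes the job log was produced with GNU parallel's --joblog option (tab-separated, with Exitval, Signal and Command columns), a POSIX system for the PID check, and Python 3.8 or later. The function names, the output_for callback that maps a command line to its output file, and the lock directory layout are all hypothetical.

```python
import hashlib
import os
from pathlib import Path


def failed_commands(joblog_path):
    """Return the commands a GNU parallel --joblog file records as failed."""
    failed = []
    with open(joblog_path) as fh:
        header = fh.readline().rstrip("\n").split("\t")
        exitval = header.index("Exitval")
        signal = header.index("Signal")
        command = header.index("Command")
        for line in fh:
            # Command is the last column; limit the split so tabs inside it survive.
            fields = line.rstrip("\n").split("\t", len(header) - 1)
            if len(fields) < len(header):
                continue  # skip truncated lines
            if fields[exitval] != "0" or fields[signal] != "0":
                failed.append(fields[command])
    return failed


def remove_incomplete_outputs(joblog_path, output_for):
    """Delete the output file of every failed job; return the paths removed.

    output_for is a hypothetical callback: command line -> output file path.
    """
    removed = []
    for cmd in failed_commands(joblog_path):
        path = Path(output_for(cmd))
        if path.exists():
            path.unlink()
            removed.append(path)
    return removed


def lock_path(inputs, lock_dir):
    """Name the lock file after a hash of the (ordered) input strings."""
    digest = hashlib.sha256("\0".join(inputs).encode()).hexdigest()
    return Path(lock_dir) / f"{digest}.lock"


def pid_alive(pid):
    """POSIX only: signal 0 checks for existence without sending a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_lock(inputs, lock_dir):
    """Return the lock path on success, or None if a live process holds it."""
    path = lock_path(inputs, lock_dir)
    for _ in range(2):  # the second attempt only follows a stale-lock cleanup
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                owner = int(path.read_text().strip() or 0)
            except (FileNotFoundError, ValueError):
                owner = 0
            if owner and pid_alive(owner):
                return None
            path.unlink(missing_ok=True)  # stale lock: owner is gone
            continue
        with os.fdopen(fd, "w") as fh:
            fh.write(str(os.getpid()))
        return path
    return None
```

A caller would take the lock before starting GNU parallel, run remove_incomplete_outputs on the job log afterwards, and then unlink the lock file. The sketch is not fully race-free: there is a short window between creating the lock file and writing the PID into it, and two processes that clean up the same stale lock at the same moment can interfere with each other.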