author Frederick Muriuki Muriithi 2025-08-20 11:15:04 -0500
committer Pjotr Prins 2026-01-05 11:12:10 +0100
commit f0b03dc7f38dd26f446e5be06238f3c76e8bdb7a
tree 8bd417835cee790a84d4974918f30a3d5ffe34c9
parent 3fdc44bd04112d4d3f7c921a8cd624499119803b
Failing Services' Startup: New issue.
-rw-r--r-- issues/CI-CD/failing-services-startup.gmi 79
1 files changed, 79 insertions, 0 deletions
diff --git a/issues/CI-CD/failing-services-startup.gmi b/issues/CI-CD/failing-services-startup.gmi
new file mode 100644
index 0000000..122a78e
--- /dev/null
+++ b/issues/CI-CD/failing-services-startup.gmi
@@ -0,0 +1,79 @@
+# Failing Services' Startup
+
+## Tags
+
+* type: bug
+* status: open, in progress
+* priority: high
+* assigned: fredm
+* interested: pjotrp, bonfacem, aruni
+* keywords: deployment, CI, CD
+
+## Description
+
+When the CI/CD container is rebuilt with the guix channel pinned at commit `34453b97005ff86355399df89c8827c57839d9c7`, some services fail to start with the following errors:
+
+```
+2025-08-20 16:05:20 Backtrace:
+2025-08-20 16:05:20            6 (primitive-load "/gnu/store/xbxd2zihw9dssrhips925gri0yn?")
+2025-08-20 16:05:20 In ice-9/eval.scm:
+2025-08-20 16:05:20    191:35  5 (_ _)
+2025-08-20 16:05:20 In gnu/build/linux-container.scm:
+2025-08-20 16:05:20     368:8  4 (call-with-temporary-directory #<procedure 7f014aa3a3f0?>)
+2025-08-20 16:05:20    476:16  3 (_ "/tmp/guix-directory.VWRNbv")
+2025-08-20 16:05:20      62:6  2 (call-with-clean-exit #<procedure 7f014aa1de80 at gnu/b?>)
+2025-08-20 16:05:20    321:20  1 (_)
+2025-08-20 16:05:20 In guix/build/syscalls.scm:
+2025-08-20 16:05:20   1231:10  0 (_ 268566528)
+2025-08-20 16:05:20 
+2025-08-20 16:05:20 guix/build/syscalls.scm:1231:10: In procedure unshare: 268566528: Invalid argument
+2025-08-20 16:05:20 Backtrace:
+2025-08-20 16:05:20            4 (primitive-load "/gnu/store/xbxd2zihw9dssrhips925gri0yn?")
+2025-08-20 16:05:20 In ice-9/eval.scm:
+2025-08-20 16:05:20    191:35  3 (_ #f)
+2025-08-20 16:05:20 In gnu/build/linux-container.scm:
+2025-08-20 16:05:20     368:8  2 (call-with-temporary-directory #<procedure 7f014aa3a3f0?>)
+2025-08-20 16:05:20     485:7  1 (_ "/tmp/guix-directory.VWRNbv")
+2025-08-20 16:05:20 In unknown file:
+2025-08-20 16:05:20            0 (waitpid #f #<undefined>)
+2025-08-20 16:05:20 
+2025-08-20 16:05:20 ERROR: In procedure waitpid:
+2025-08-20 16:05:20 Wrong type (expecting exact integer): #f
+```
+
+The services that fail are:
+
+* genenetwork3: consistently
+* genenetwork2: consistently
+* gn-auth: intermittently
+
+After digging further into this issue, I think I have the beginnings of an idea of why it is coming up. Looking at:
+
+=> https://codeberg.org/guix/guix/src/commit/34453b97005ff86355399df89c8827c57839d9c7/guix/build/syscalls.scm#L1218-L1233
+
+We see the documentation says:
+
+> Note that CLONE_NEWUSER requires that the calling process be single-threaded,
+> which is possible if and only if libgc is running a single marker thread; this
+> can be achieved by setting the GC_MARKERS environment variable to 1.  If the
+> calling process is multi-threaded, this throws to 'system-error' with EINVAL.
+
+and looking at the error we are getting:
+
+```
+⋮
+2025-08-20 15:17:38 guix/build/syscalls.scm:1231:10: In procedure unshare: 268566528: Invalid argument
+⋮
+```
+
+Now, looking at
+=> https://codeberg.org/guix/guix/src/commit/34453b97005ff86355399df89c8827c57839d9c7/gnu/build/linux-container.scm#L321 where `unshare` is called
+
+
+we could conclude that the process calling `unshare` for "genenetwork3" and "genenetwork2" is perhaps consistently multi-threaded, leading to the error above.
+
+It might also explain why the /gn-auth/ service **sometimes** throws the same error when the container is restarted, while at other times it starts with no error.
+
+I (currently) have no idea why the calling process would be multi-threaded.
+
+Or maybe I'm overthinking this whole thing.
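
One quick check on the hypothesis in the issue is to decode the number printed in the `unshare` error. The sketch below assumes that `268566528` is the flags argument passed to `unshare`; the error text suggests this but does not confirm it. The flag values are the Linux constants from `<linux/sched.h>`, and `decode_clone_flags` is a hypothetical helper name, not part of Guix.

```python
# Linux namespace flags, with the values defined in <linux/sched.h>.
CLONE_FLAGS = {
    "CLONE_NEWNS": 0x00020000,
    "CLONE_NEWCGROUP": 0x02000000,
    "CLONE_NEWUTS": 0x04000000,
    "CLONE_NEWIPC": 0x08000000,
    "CLONE_NEWUSER": 0x10000000,
    "CLONE_NEWPID": 0x20000000,
    "CLONE_NEWNET": 0x40000000,
}


def decode_clone_flags(value):
    """Return (sorted names of the flags set in value, bits not in CLONE_FLAGS).

    Hypothetical helper for illustration only; it is not a Guix procedure.
    """
    names = sorted(name for name, bit in CLONE_FLAGS.items() if value & bit)
    known = 0
    for bit in CLONE_FLAGS.values():
        known |= bit
    leftover = value & ~known
    return names, leftover


# Assumption: the integer in the error message is the unshare flags argument.
names, leftover = decode_clone_flags(268566528)
print(names, hex(leftover))
# prints: ['CLONE_NEWNS', 'CLONE_NEWUSER'] 0x0
```

Under that assumption the value is `CLONE_NEWNS | CLONE_NEWUSER`, so the failing call does request a new user namespace. That is consistent with the quoted note about `CLONE_NEWUSER` and multi-threaded callers, though `EINVAL` can have other causes, so this alone does not prove the theory.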