Diffstat (limited to 'issues/CI-CD/failing-services-startup.gmi')
-rw-r--r-- issues/CI-CD/failing-services-startup.gmi 236
1 files changed, 236 insertions, 0 deletions
diff --git a/issues/CI-CD/failing-services-startup.gmi b/issues/CI-CD/failing-services-startup.gmi
new file mode 100644
index 0000000..751e61c
--- /dev/null
+++ b/issues/CI-CD/failing-services-startup.gmi
@@ -0,0 +1,236 @@
+# Failing Services' Startup
+
+## Tags
+
+* type: bug
+* status: closed, completed
+* priority: high
+* assigned: fredm, bonfacem
+* interested: pjotrp, bonfacem, aruni
+* keywords: deployment, CI, CD
+
+## Description
+
+Upgrading Guix to commit `34453b97005ff86355399df89c8827c57839d9c7` for CI/CD causes service startup to fail with:
+
+```
+2025-08-20 16:05:20 Backtrace:
+2025-08-20 16:05:20            6 (primitive-load "/gnu/store/xbxd2zihw9dssrhips925gri0yn?")
+2025-08-20 16:05:20 In ice-9/eval.scm:
+2025-08-20 16:05:20    191:35  5 (_ _)
+2025-08-20 16:05:20 In gnu/build/linux-container.scm:
+2025-08-20 16:05:20     368:8  4 (call-with-temporary-directory #<procedure 7f014aa3a3f0?>)
+2025-08-20 16:05:20    476:16  3 (_ "/tmp/guix-directory.VWRNbv")
+2025-08-20 16:05:20      62:6  2 (call-with-clean-exit #<procedure 7f014aa1de80 at gnu/b?>)
+2025-08-20 16:05:20    321:20  1 (_)
+2025-08-20 16:05:20 In guix/build/syscalls.scm:
+2025-08-20 16:05:20   1231:10  0 (_ 268566528)
+2025-08-20 16:05:20 
+2025-08-20 16:05:20 guix/build/syscalls.scm:1231:10: In procedure unshare: 268566528: Invalid argument
+2025-08-20 16:05:20 Backtrace:
+2025-08-20 16:05:20            4 (primitive-load "/gnu/store/xbxd2zihw9dssrhips925gri0yn?")
+2025-08-20 16:05:20 In ice-9/eval.scm:
+2025-08-20 16:05:20    191:35  3 (_ #f)
+2025-08-20 16:05:20 In gnu/build/linux-container.scm:
+2025-08-20 16:05:20     368:8  2 (call-with-temporary-directory #<procedure 7f014aa3a3f0?>)
+2025-08-20 16:05:20     485:7  1 (_ "/tmp/guix-directory.VWRNbv")
+2025-08-20 16:05:20 In unknown file:
+2025-08-20 16:05:20            0 (waitpid #f #<undefined>)
+2025-08-20 16:05:20 
+2025-08-20 16:05:20 ERROR: In procedure waitpid:
+2025-08-20 16:05:20 Wrong type (expecting exact integer): #f
+```
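
The flag value in the `unshare` error can be decoded by hand. As a minimal sketch (the helper name `decode_unshare_flags` is made up for illustration; the constants are the namespace flags from the Linux `<sched.h>` header), the following shows that 268566528 is `CLONE_NEWNS | CLONE_NEWUSER`, that is, the container setup was requesting a new mount namespace and a new user namespace when the kernel returned EINVAL ("Invalid argument"):

```python
# Namespace flags as defined in the Linux <sched.h> header.
CLONE_FLAGS = {
    "CLONE_NEWNS": 0x00020000,
    "CLONE_NEWCGROUP": 0x02000000,
    "CLONE_NEWUTS": 0x04000000,
    "CLONE_NEWIPC": 0x08000000,
    "CLONE_NEWUSER": 0x10000000,
    "CLONE_NEWPID": 0x20000000,
    "CLONE_NEWNET": 0x40000000,
}


def decode_unshare_flags(value):
    """Return the namespace flag names set in `value`, plus any bits
    that do not correspond to a known namespace flag."""
    names = [name for name, bit in CLONE_FLAGS.items() if value & bit]
    leftover = value & ~sum(CLONE_FLAGS.values())
    return names, leftover


names, leftover = decode_unshare_flags(268566528)
print(hex(268566528), names, leftover)
```

The decode only identifies which namespaces were requested; it does not by itself explain why the kernel rejected the call.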
+
+Failing services:
+
+* genenetwork3: consistently
+* genenetwork2: consistently
+* gn-auth: intermittently
+
+## Troubleshooting Notes
+
+### Unable to run genenetwork2 in a shell container with the "-C" flag
+
+With the following channels:
+
+```
+$ guix describe
+Generation 3    Aug 28 2025 03:56:44    (current)
+  gn-bioinformatics cffafde
+    repository URL: file:///home/bonfacem/guix-bioinformatics/
+    branch: master
+    commit: cffafde125f3e711418d3ebb62eacd48a3efa8cf
+  guix-forge 3c8dc85
+    repository URL: https://git.genenetwork.org/guix-forge/
+    branch: main
+    commit: 3c8dc85a584c98bc90088ec1c85933d4d10e7383
+  guix-past b14d7f9
+    repository URL: https://codeberg.org/guix-science/guix-past
+    branch: master
+    commit: b14d7f997ae8eec788a7c16a7252460cba3aaef8
+  guix 34453b9
+    repository URL: https://codeberg.org/guix/guix
+    branch: master
+    commit: 34453b97005ff86355399df89c8827c57839d9c7
+```
+
+Running:
+
+```
+$ guix shell -C genenetwork2
+```
+
+Produces:
+
+```
+guix shell: error: unshare: 268566528: Invalid argument
+Backtrace:
+          16 (primitive-load "/export3/local/home/bonfacem/.guix-ext…")
+In guix/ui.scm:
+   2399:7 15 (run-guix . _)
+  2362:10 14 (run-guix-command _ . _)
+In ice-9/boot-9.scm:
+  1752:10 13 (with-exception-handler _ _ #:unwind? _ # _)
+In guix/status.scm:
+    842:4 12 (call-with-status-report _ _)
+In guix/store.scm:
+    703:3 11 (_)
+In ice-9/boot-9.scm:
+  1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
+In guix/store.scm:
+   690:37  9 (thunk)
+   1331:8  8 (call-with-build-handler _ _)
+   1331:8  7 (call-with-build-handler #<procedure 7fc86bb50de0 at g…> …)
+In guix/scripts/environment.scm:
+  1205:11  6 (proc _)
+In guix/store.scm:
+  2212:25  5 (run-with-store #<store-connection 256.100 7fc87a46d820> …)
+In guix/scripts/environment.scm:
+    911:8  4 (_ _)
+In gnu/build/linux-container.scm:
+    485:7  3 (call-with-container _ _ #:namespaces _ #:host-uids _ # …)
+In unknown file:
+           2 (waitpid #f #<undefined>)
+In ice-9/boot-9.scm:
+  1685:16  1 (raise-exception _ #:continuable? _)
+  1685:16  0 (raise-exception _ #:continuable? _)
+
+ice-9/boot-9.scm:1685:16: In procedure raise-exception:
+Wrong type (expecting exact integer): #f
+```
+
+This was fixed by increasing the value of respawn-delay from its default of 0.5s to 5s.
+
+
+### Unable to write to a temporary directory and issues with running git inside the g-exp
+
+Stack trace:
+```
+2025-09-03 12:23:32 In ice-9/eval.scm:
+2025-09-03 12:23:32    191:35  3 (_ #f)
+2025-09-03 12:23:32 In gnu/build/linux-container.scm:
+2025-09-03 12:23:32     368:8  2 (call-with-temporary-directory #<procedure 7f012241d3f0?>)
+2025-09-03 12:23:32     485:7  1 (_ "/tmp/guix-directory.Bl6jtx")
+2025-09-03 12:23:32 In unknown file:
+2025-09-03 12:23:32            0 (waitpid #f #<undefined>)
+2025-09-03 12:23:32
+
+```
+
+The message is cryptic. Running the g-exp as a standalone program shows the underlying error:
+
+```
+Receiving objects: 100% (698/698), 16.18 MiB | 30.29 MiB/s, done.
+Resolving deltas: 100% (49/49), done.
+==================================================
+error: cannot run less: No such file or directory
+fatal: unable to execute pager 'less'
+Backtrace:
+           5 (primitive-load "/gnu/store/c9bvy90s5mglp6xdfkc1s4qkzj8?")
+In ice-9/eval.scm:
+    619:8  4 (_ #f)
+In ice-9/boot-9.scm:
+    142:2  3 (dynamic-wind #<procedure 7fa954b25880 at ice-9/eval.s?> ?)
+    142:2  2 (dynamic-wind #<procedure 7fa94b7970c0 at ice-9/eval.s?> ?)
+In ice-9/eval.scm:
+    619:8  1 (_ #(#(#<directory (guile-user) 7fa954b03c80>)))
+In guix/build/utils.scm:
+    822:6  0 (invoke "git" "log" "--max-count" "1")
+
+guix/build/utils.scm:822:6: In procedure invoke:
+ERROR:
+  1. &invoke-error:
+      program: "git"
+      arguments: ("log" "--max-count" "1")
+      exit-status: 128
+      term-signal: #f
+      stop-signal: #f
+```
+
+This was fixed by adding "less" to the with-packages form and setting TERM in the g-exp:
+
+```
+(setenv "TERM" "xterm-256color")
+
+```
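
An alternative that was not used here, shown only as a sketch: git can be told not to start a pager at all, either with its `--no-pager` option or by setting `GIT_PAGER` to `cat`. The helper names below are hypothetical, and the snippet only builds the command line and environment rather than running git:

```python
import os


def git_log_command(max_count=1):
    """Build a `git log` invocation that never starts a pager."""
    return ["git", "--no-pager", "log", "--max-count", str(max_count)]


def pager_free_env(base_env=None):
    """Copy an environment and disable git's pager in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_PAGER"] = "cat"
    # Keep an existing TERM; otherwise fall back to xterm-256color.
    env.setdefault("TERM", "xterm-256color")
    return env


command = git_log_command()
env = pager_free_env({"PATH": "/usr/bin"})
print(command)
print(env["GIT_PAGER"], env["TERM"])
```

Passing both to `subprocess.run(command, env=env, check=True)` would issue the same `git log --max-count 1` call as the failing `invoke`, without depending on `less` being available.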
+
+### gn-auth: sqlite3.OperationalError: unable to open database file
+
+Despite all file permissions being correctly set to 0644, the following error occurs:
+
+```
+Traceback (most recent call last):
+  File "/gnu/store/ag1m9bv22iwm3sq87xly35y138l6kzd7-profile/lib/python3.11/site-packages/flask/app.py", line 917, in full_dispatch_request
+    rv = self.dispatch_request()
+         ^^^^^^^^^^^^^^^^^^^^^^^
+  File "/gnu/store/ag1m9bv22iwm3sq87xly35y138l6kzd7-profile/lib/python3.11/site-packages/flask/app.py", line 902, in dispatch_request
+    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
+           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/authentication/oauth2/views.py", line 102, in authorise
+    return with_db_connection(__authorise__)
+           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/db/sqlite3.py", line 63, in with_db_connection
+    return func(conn)
+           ^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/authentication/oauth2/views.py", line 90, in __authorise__
+    return server.create_authorization_response(request=request, grant_user=user)
+           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/gnu/store/ag1m9bv22iwm3sq87xly35y138l6kzd7-profile/lib/python3.11/site-packages/authlib/oauth2/rfc6749/authorization_server.py", line 297, in create_authorization_response
+    args = grant.create_authorization_response(redirect_uri, grant_user)
+           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/authentication/oauth2/grants/authorisation_code_grant.py", line 31, in create_authorization_response
+    response = super().create_authorization_response(
+               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/gnu/store/ag1m9bv22iwm3sq87xly35y138l6kzd7-profile/lib/python3.11/site-packages/authlib/oauth2/rfc6749/grants/authorization_code.py", line 158, in create_authorization_response
+    self.save_authorization_code(code, self.request)
+  File "/export/data/repositories/gn-auth/gn_auth/auth/authentication/oauth2/grants/authorisation_code_grant.py", line 45, in save_authorization_code
+    return __save_authorization_code__(
+           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/authentication/oauth2/grants/authorisation_code_grant.py", line 106, in __save_authorization_code__
+    return with_db_connection(lambda conn: save_authorisation_code(conn, code))
+           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/db/sqlite3.py", line 63, in with_db_connection
+    return func(conn)
+           ^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/authentication/oauth2/grants/authorisation_code_grant.py", line 106, in <lambda>
+    return with_db_connection(lambda conn: save_authorisation_code(conn, code))
+                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  File "/export/data/repositories/gn-auth/gn_auth/auth/authentication/oauth2/models/authorization_code.py", line 92, in save_authorisation_code
+    cursor.execute(
+sqlite3.OperationalError: unable to open database file
+```
+
+This was fixed by mapping the database's parent directory, rather than the database file itself:
+
+```
+-                                                (source auth-db-path)
++                                                (source (dirname auth-db-path))
+```
+
+in the relevant g-exp, and by making sure that the parent directory has mode #o775 (rwx for both user and group).
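
A likely explanation, stated here as an assumption rather than something the notes confirm: SQLite creates its journal files next to the database file when writing, so the process needs write access to the containing directory and not only to the file. The sketch below uses a throwaway directory and made-up names (`auth-db`, `auth.db`, `describe_db_location`, `dir_allows_journal`) to check the permission bits of both the file and its parent:

```python
import os
import shutil
import sqlite3
import stat
import tempfile


def describe_db_location(db_path):
    """Return (file_mode, dir_mode): the permission bits of a database
    file and of the directory that contains it."""
    parent = os.path.dirname(os.path.abspath(db_path))
    file_mode = stat.S_IMODE(os.stat(db_path).st_mode)
    dir_mode = stat.S_IMODE(os.stat(parent).st_mode)
    return file_mode, dir_mode


def dir_allows_journal(dir_mode):
    """True when owner and group both have write and execute bits on the
    directory, which they need in order to create files inside it."""
    needed = stat.S_IWUSR | stat.S_IXUSR | stat.S_IWGRP | stat.S_IXGRP
    return (dir_mode & needed) == needed


# Hypothetical layout standing in for the real auth database directory.
workdir = tempfile.mkdtemp()
db_dir = os.path.join(workdir, "auth-db")
os.mkdir(db_dir)
os.chmod(db_dir, 0o775)
db_path = os.path.join(db_dir, "auth.db")

# Create a small database so there is a real file to inspect.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE codes (code TEXT)")
conn.execute("INSERT INTO codes VALUES ('example')")
conn.commit()
conn.close()
os.chmod(db_path, 0o644)

file_mode, dir_mode = describe_db_location(db_path)
print(oct(file_mode), oct(dir_mode), dir_allows_journal(dir_mode))

shutil.rmtree(workdir)
```

A check like this only inspects mode bits; inside a container, the directory also has to be mapped in (as in the `dirname` change above) for those permissions to matter.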
+
+## Also See
+
+=> https://issues.guix.gnu.org/78356 Broken system and home containers
+=> https://codeberg.org/guix/guix/src/commit/34453b97005ff86355399df89c8827c57839d9c7/guix/build/syscalls.scm#L1218-L1233 How `unshare` is defined
+=> https://codeberg.org/guix/guix/src/commit/34453b97005ff86355399df89c8827c57839d9c7/gnu/build/linux-container.scm#L321 Where `unshare` is called