#+TITLE: Setup of a simple Guix build farm and substitute server
#+AUTHOR: Collin J. Doering

A few months ago [[https://lists.gnu.org/archive/html/guix-devel/2024-07/msg00033.html][I announced on the Guix mailing list]] that there was a new North America-based Guix substitute server and build farm, cuirass.genenetwork.org. This article provides further information about how the build farm and substitute server were set up, and how you can do the same for yourself or your organization.

Having more Guix substitute servers available improves build diversity (which can be checked with [[https://guix.gnu.org/manual/en/html_node/Invoking-guix-challenge.html][guix challenge]]), increases substitute availability, and improves response times thanks to server locality.

* TODO Note inspiration, and in some cases direct copies, from https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin.scm

* Setting up a Minimal Guix Build Farm and Substitute Server

Though a Guix build farm and substitute server could be deployed on any distribution, we naturally chose to use Guix itself. A variety of components provide the necessary functionality:

- [[https://guix.gnu.org/cuirass/][Cuirass]] :: Watches the Guix channel repository for changes, and manages building of derivations, packages, etc.
- [[https://guix.gnu.org/manual/en/html_node/Invoking-guix-publish.html][guix-publish]] :: Provides substitute archives for consumption by users (indirectly via nginx as a local proxy).
- nginx :: Acts as a reverse proxy for Cuirass and guix-publish.
- certbot :: Fetches SSL certificates so Cuirass and substitutes can be served over HTTPS.
- anonip :: Anonymizes HTTP logs to preserve user privacy.

How each of these components is set up is detailed below, component by component. You can see the full source code for the Tennessee build farm at https://git.genenetwork.org/guix-north-america/.

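As a rough orientation before going through the components one by one, the sketch below shows how they might be collected into a single list of services. It is not taken from the build farm's source: =%build-farm-services= is a hypothetical name, the certbot email address is a placeholder, and the =certbot-service-type= form is only an assumption about how the certificate could be requested. The names =%cuirass-specs=, =anonip-service=, =%anonip-log-files=, =%nginx-configuration=, =%nginx-cache-activation= and =%nginx-deploy-hook= refer to the definitions given in the sections of this article.

#+begin_src scheme
(use-modules (gnu)
             (gnu services certbot)
             (gnu services cuirass)
             (gnu services web))

;; Hypothetical name: one list holding every service discussed in this
;; article, meant to be appended to the 'services' field of an
;; 'operating-system' declaration.
(define %build-farm-services
  (append
   (list
    ;; Build packages.
    (service cuirass-service-type
             (cuirass-configuration
              (host "localhost")
              (specifications %cuirass-specs)))
    ;; Serve substitutes (proxied by nginx).
    (service guix-publish-service-type
             (guix-publish-configuration
              (port 3000)
              (cache "/var/cache/guix/publish")))
    ;; Reverse proxy, plus the activation snippet for its cache directory.
    (service nginx-service-type %nginx-configuration)
    %nginx-cache-activation
    ;; Assumption: certificate requested through certbot, reloading nginx
    ;; after each renewal.  The email address is a placeholder.
    (service certbot-service-type
             (certbot-configuration
              (email "admin@example.org")
              (certificates
               (list (certificate-configuration
                      (domains '("cuirass.genenetwork.org"))
                      (deploy-hook %nginx-deploy-hook)))))))
   ;; One anonip service per nginx access log.
   (map anonip-service %anonip-log-files)))
#+end_src

Keeping the services in one list like this makes it easy to append them to =%base-services= (after adjusting the guix-daemon configuration) in the final system declaration.
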
** Cuirass - building packages

*** Define Cuirass Specs

#+begin_src scheme
(define %cuirass-specs
  #~(list (specification
           (name "guix")
           (priority 0)
           (build '(channels guix))
           (channels %default-channels))))
#+end_src

*** Set up the Cuirass Service

#+begin_src scheme
(service cuirass-service-type
         (cuirass-configuration
          (host "localhost")
          (specifications %cuirass-specs)))
#+end_src

** Provide Substitutes using Guix Publish

#+begin_src scheme
(service guix-publish-service-type
         (guix-publish-configuration
          (port 3000)
          (cache "/var/cache/guix/publish")))
#+end_src

** Anonymize IPs (anonip)

#+begin_src scheme
(define (anonip-service file)
  (service anonip-service-type
           (anonip-configuration
            (input (format #false "/var/run/anonip/~a" file))
            (output (format #false "/var/log/anonip/~a" file)))))

(define %anonip-log-files
  ;; List of files handled by Anonip
  '("http.access.log"
    "https.access.log"))

(define (log-file->anonip-service-name file)
  "Return the name of the Anonip service handling FILE, a log file."
  (symbol-append 'anonip-/var/log/anonip/ (string->symbol file)))
#+end_src

** Certbot

#+begin_src scheme
(define* (le host #:optional privkey)
  (string-append "/etc/letsencrypt/live/"
                 host "/"
                 (if privkey "privkey" "fullchain")
                 ".pem"))
#+end_src

** Nginx Reverse Proxy

*** Locations, servers and caches

#+begin_src scheme
(define publish-robots.txt
  ;; Try to prevent good-faith crawlers from downloading substitutes.  Allow
  ;; indexing the root (which is expected to be static or cheap) to remain
  ;; visible in search engine results for, e.g., 'Guix CI'.
  "\
User-agent: *\r
Disallow: /\r
Allow: /$\r
\r
")
#+end_src

#+begin_src scheme
(define (publish-locations url)
  "Return the nginx location blocks for 'guix publish' running on URL."
  (list (nginx-location-configuration
         (uri "/nix-cache-info")
         (body
          (list
           (string-append "proxy_pass " url "/nix-cache-info;")
           ;; Cache this file since that's always the first thing we ask
           ;; for.
           "proxy_cache static;"
           "proxy_cache_valid 200 100d;" ; cache hits for a looong time.
           "proxy_cache_valid any 5m;"   ; cache misses/others for 5 min.
           "proxy_ignore_client_abort on;"

           ;; We need to hide and ignore the Set-Cookie header to enable
           ;; caching.
           "proxy_hide_header Set-Cookie;"
           "proxy_ignore_headers Set-Cookie;")))

        (nginx-location-configuration
         (uri "/nar/")
         (body
          (list
           (string-append "proxy_pass " url ";")
           "client_body_buffer_size 256k;"

           ;; Be more tolerant of delays when fetching a nar.
           "proxy_read_timeout 60s;"
           "proxy_send_timeout 60s;"

           ;; Enable caching for nar files, to avoid reconstructing and
           ;; recompressing archives.
           "proxy_cache nar;"
           "proxy_cache_valid 200 30d;" ; cache hits for 1 month
           "proxy_cache_valid 504 3m;"  ; timeout, when hydra.gnu.org is overloaded
           "proxy_cache_valid any 1h;"  ; cache misses/others for 1h.

           "proxy_ignore_client_abort on;"

           ;; Nars are already compressed.
           "gzip off;"

           ;; We need to hide and ignore the Set-Cookie header to enable
           ;; caching.
           "proxy_hide_header Set-Cookie;"
           "proxy_ignore_headers Set-Cookie;"

           ;; Provide a 'content-length' header so that 'guix
           ;; substitute-binary' knows upfront how much it is downloading.
           ;; "add_header Content-Length $body_bytes_sent;"
           )))

        (nginx-location-configuration
         (uri "~ \\.narinfo$")
         (body
          (list
           ;; Since 'guix publish' has its own caching, and since it relies
           ;; on the atime of cached narinfos to determine whether a
           ;; narinfo can be removed from the cache, don't do any caching
           ;; here.
           (string-append "proxy_pass " url ";")

           ;; For HTTP pipelining.  This has a dramatic impact on
           ;; performance.
           "client_body_buffer_size 128k;"

           ;; Narinfos requests are short, serve many of them on a
           ;; connection.
           "keepalive_requests 600;"

           ;; Do not tolerate slowness of hydra.gnu.org when fetching
           ;; narinfos: better return 504 quickly than wait forever.
           "proxy_connect_timeout 10s;"
           "proxy_read_timeout 10s;"
           "proxy_send_timeout 10s;"

           ;; 'guix publish --ttl' produces a 'Cache-Control' header for
           ;; use by 'guix substitute'.
           ;; Let it through rather than use
           ;; nginx's "expire" directive since the expiration time defined
           ;; by 'guix publish' is the right one.
           "proxy_pass_header Cache-Control;"

           "proxy_ignore_client_abort on;"

           ;; We need to hide and ignore the Set-Cookie header to enable
           ;; caching.
           "proxy_hide_header Set-Cookie;"
           "proxy_ignore_headers Set-Cookie;")))

        ;; Content-addressed files served by 'guix publish'.
        (nginx-location-configuration
         (uri "/file/")
         (body
          (list
           (string-append "proxy_pass " url ";")

           "proxy_cache cas;"
           "proxy_cache_valid 200 200d;" ; cache hits
           "proxy_cache_valid any 5m;"   ; cache misses/others

           "proxy_ignore_client_abort on;")))

        ;; Try to prevent good-faith crawlers from downloading substitutes.
        (nginx-location-configuration
         (uri "= /robots.txt")
         (body
          (list
           #~(string-append "try_files "
                            #$(plain-file "robots.txt" publish-robots.txt)
                            " =404;")
           "root /;")))))
#+end_src

#+begin_src scheme
(define (balg02-locations publish-url)
  "Return nginx location blocks with 'guix publish' reachable at PUBLISH-URL."
  (append (publish-locations publish-url)
          (list
           ;; Cuirass.
           (nginx-location-configuration
            (uri "/")
            (body (list "proxy_pass http://localhost:8081;"
                        ;; ;; See
                        ;; ;; .
                        ;; (string-append
                        ;;  "add_header Onion-Location http://" %ci-onion
                        ;;  "$request_uri;")
                        )))

           (nginx-location-configuration
            (uri "~ ^/admin")
            (body
             (list "if ($ssl_client_verify != SUCCESS) { return 403; } proxy_pass http://localhost:8081;")))

           (nginx-location-configuration
            (uri "/static")
            (body
             (list "proxy_pass http://localhost:8081;"
                   ;; Cuirass adds a 'Cache-Control' header, honor it.
                   "proxy_cache static;"
                   "proxy_cache_valid 200 2d;"
                   "proxy_cache_valid any 10m;"

                   "proxy_ignore_client_abort on;")))

           (nginx-location-configuration
            (uri "/download")            ;Cuirass "build products"
            (body
             (list "proxy_pass http://localhost:8081;"
                   "expires 10d;"        ;override 'Cache-Control'
                   "proxy_cache static;"
                   "proxy_cache_valid 200 30d;"
                   "proxy_cache_valid any 10m;"

                   "proxy_ignore_client_abort on;")))

           (nginx-location-configuration ;certbot
            (uri "/.well-known")
            (body (list "root /var/www;"))))))
#+end_src

#+begin_src scheme
(define %publish-url "http://localhost:3000")

(define %tls-settings
  (list
   ;; Make sure the legacy SSL protocols are disabled (only TLS is listed).
   "ssl_protocols TLSv1.1 TLSv1.2 TLSv1.3;"
   ;; Disable weak cipher suites.
   "ssl_ciphers HIGH:!aNULL:!MD5;"
   "ssl_prefer_server_ciphers on;"

   ;; Use our own DH parameters created with:
   ;;    openssl dhparam -out dhparams.pem 2048
   ;; as suggested at .
   "ssl_dhparam /etc/dhparams.pem;"))
#+end_src

#+begin_src scheme
(define %balg02-servers
  (list
   ;; Redirect domains that don't explicitly support HTTP (below) to HTTPS.
   (nginx-server-configuration
    (listen '("80"))
    (raw-content
     (list "return 308 https://$host$request_uri;")))

   ;; Domains that still explicitly support plain HTTP.
   (nginx-server-configuration
    (listen '("80"))
    (server-name `("cuirass.genenetwork.org"
                   ;; "~[0-9]$"
                   ; TODO: onion
                   ; ,(regexp-quote %ci-onion)
                   ))
    (locations (balg02-locations %publish-url))
    (raw-content
     (list
      "access_log /var/run/anonip/http.access.log;"
      "proxy_set_header X-Forwarded-Host $host;"
      "proxy_set_header X-Forwarded-Port $server_port;"
      "proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;")))

   ;; HTTPS servers
   (nginx-server-configuration
    (listen '("443 ssl"))
    (server-name '("cuirass.genenetwork.org"))
    (ssl-certificate (le "cuirass.genenetwork.org"))
    (ssl-certificate-key (le "cuirass.genenetwork.org" 'key))
    (locations (balg02-locations %publish-url))
    (raw-content
     (append
      %tls-settings
      (list
       "access_log /var/run/anonip/https.access.log;"
       "proxy_set_header X-Forwarded-Host $host;"
       "proxy_set_header X-Forwarded-Port $server_port;"
       "proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;"
       ;; TODO:
       ;; For Cuirass admin interface authentication
       ;; "ssl_client_certificate /etc/ssl-ca/certs/ca.crt;"
       ;; "ssl_verify_client optional;"
       ))))))
#+end_src

#+begin_src scheme
(define %extra-content
  (list
   "default_type application/octet-stream;"
   "sendfile on;"

   (accept-languages)

   ;; Maximum chunk size to send.  Partly this is a workaround for
   ;; , but also the nginx docs mention that
   ;; "Without the limit, one fast connection may seize the worker
   ;; process entirely."
   ;; "sendfile_max_chunk 1m;"

   "keepalive_timeout 65;"

   ;; Use HTTP 1.1 to talk to the backend so we benefit from keep-alive
   ;; connections and chunked transfer encoding.  The latter allows us to
   ;; make sure we do not cache partial downloads.
   "proxy_http_version 1.1;"

   ;; The 'inactive' parameter for caching is not very useful in our
   ;; case: all that matters is that LRU sweeping happens when 'max_size'
   ;; is hit.

   ;; cache for nar files
   "proxy_cache_path /var/cache/nginx/nar"
   " levels=2"
   " inactive=8d"         ; inactive keys removed after 8d
   " keys_zone=nar:4m"    ; nar cache meta data: ~32K keys
   " max_size=10g;"       ; total cache data size max

   ;; cache for content-addressed files
   "proxy_cache_path /var/cache/nginx/cas"
   " levels=2"
   " inactive=180d"       ; inactive keys removed after 180d
   " keys_zone=cas:8m"    ; nar cache meta data: ~64K keys
   " max_size=50g;"       ; total cache data size max

   ;; cache for build logs
   "proxy_cache_path /var/cache/nginx/logs"
   " levels=2"
   " inactive=60d"        ; inactive keys removed after 60d
   " keys_zone=logs:8m"   ; narinfo meta data: ~64K keys
   " max_size=4g;"        ; total cache data size max

   ;; cache for static data
   "proxy_cache_path /var/cache/nginx/static"
   " levels=1"
   " inactive=10d"        ; inactive keys removed after 10d
   " keys_zone=static:1m" ; nar cache meta data: ~8K keys
   " max_size=200m;"      ; total cache data size max

   ;; If Hydra cannot honor these delays, then something is wrong and
   ;; we'd better drop the connection and return 504.
   "proxy_connect_timeout 10s;"
   "proxy_read_timeout 10s;"
   "proxy_send_timeout 10s;"

   ;; Cache timeouts for a little while to avoid increasing pressure.
   "proxy_cache_valid 504 30s;"))
#+end_src

#+begin_src scheme
(define %nginx-configuration
  (nginx-configuration
   (server-blocks %balg02-servers)
   (server-names-hash-bucket-size 128)
   (modules
    (list
     ;; Module to redirect users to the localized pages of their choice.
     (file-append nginx-accept-language-module
                  "/etc/nginx/modules/ngx_http_accept_language_module.so")))
   (global-directives
    '((worker_processes . 16)
      (pcre_jit . on)
      (events . ((worker_connections . 1024)))))
   (extra-content (string-join %extra-content "\n"))
   (shepherd-requirement
    (map log-file->anonip-service-name
         %anonip-log-files))))
#+end_src

*** Cache activation

#+begin_src scheme
(define %nginx-cache-activation
  ;; Make sure /var/cache/nginx exists on the first run.
  (simple-service 'nginx-/var/cache/nginx
                  activation-service-type
                  (with-imported-modules '((guix build utils))
                    #~(begin
                        (use-modules (guix build utils))
                        (mkdir-p "/var/cache/nginx")))))
#+end_src

*** Deploy hook

#+begin_src scheme
(define %nginx-deploy-hook
  (program-file
   "nginx-deploy-hook"
   #~(let ((pid (call-with-input-file "/var/run/nginx/pid" read)))
       (kill pid SIGHUP))))
#+end_src

** Set up guix-daemon

- Allow for substitutes from this server
- Adjust guix-daemon configuration (timeouts, # of build accounts, # of cores to use)

#+begin_src scheme
(define* (guix-daemon-config #:key (max-jobs 5) (cores 4)
                             (build-accounts-to-max-jobs-ratio 4)
                             (authorized-keys '())
                             (substitute-urls '()))
  (guix-configuration
   (substitute-urls substitute-urls)
   (authorized-keys authorized-keys)

   ;; We don't want to let builds get stuck for too long, but we still want
   ;; to allow building things that can take a while (e.g. 3h).  Adjust as
   ;; necessary.
   (max-silent-time 3600)
   (timeout (* 6 3600))

   (log-compression 'gzip)      ;be friendly to 'guix publish' users

   (build-accounts (* build-accounts-to-max-jobs-ratio max-jobs))
   (extra-options (list "--max-jobs" (number->string max-jobs)
                        "--cores" (number->string cores)
                        "--gc-keep-derivations"))))
#+end_src

#+begin_src scheme
(modify-services %base-services
  (guix-service-type
   config => (guix-daemon-config
              #:substitute-urls '("https://cuirass.genenetwork.org")
              #:max-jobs 20
              #:cores 4
              #:authorized-keys
              (cons (local-file "../../../.pubkeys/guix/cuirass.genenetwork.org.pub")
                    %default-authorized-guix-keys)
              #:build-accounts-to-max-jobs-ratio 5)))
#+end_src

** Optional

*** Onion service

* Setup

TODO: talk about setup of Tennessee Guix Build Farm and Substitute Server specifics (e.g. remote install)

* Conclusion

TODO: ...

- In the future, we hope to work with Guix maintainers to include this substitute server as one of the provided Guix System defaults.
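
In the meantime, readers who would like to use the substitute server on their own Guix System machines could extend their guix-daemon configuration along the lines of the sketch below, which follows the general pattern for adding a substitute server described in the Guix manual. The helper name =with-genenetwork-substitutes= and the key file path are hypothetical; the server's public signing key has to be obtained separately and saved at whatever path is used here.

#+begin_src scheme
(use-modules (gnu))

;; Hypothetical helper: return SERVICES with guix-daemon adjusted so that
;; it also fetches substitutes from the build farm described above.
(define (with-genenetwork-substitutes services)
  (modify-services services
    (guix-service-type
     config => (guix-configuration
                (inherit config)
                ;; Try the new server first, then the default servers.
                (substitute-urls
                 (append (list "https://cuirass.genenetwork.org")
                         %default-substitute-urls))
                ;; Assumption: the server's public signing key was saved
                ;; next to the system configuration file under this name.
                (authorized-keys
                 (append (list (local-file "./cuirass.genenetwork.org.pub"))
                         %default-authorized-guix-keys))))))
#+end_src

In an =operating-system= declaration this could then be used as =(services (with-genenetwork-substitutes %base-services))=, with =%base-services= replaced by whatever service list the system already uses.
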