-rw-r--r-- | docs/blog-Tennnessee-build-farm.org | 234
1 files changed, 113 insertions, 121 deletions
diff --git a/docs/blog-Tennnessee-build-farm.org b/docs/blog-Tennnessee-build-farm.org
index 582dc4d..e0d8aa3 100644
--- a/docs/blog-Tennnessee-build-farm.org
+++ b/docs/blog-Tennnessee-build-farm.org
@@ -1,36 +1,50 @@
-#+TITLE: Setup of a simple Guix build farm and substitute server
-#+AUTHOR: Collin J. Doering
+#+TITLE: Setup of a Simple Guix Build Farm and Substitute Server
+#+AUTHOR: Collin J. Doeger
+
+In the world of reproducible computing, GNU Guix stands out as a pioneering distribution that
+enables bit-for-bit reproducible builds and a comprehensive package management system.
+However, building software from source for every package can be time-consuming and
+resource-intensive. This is where substitute servers play a crucial role, allowing users to
+download pre-built binary packages instead of compiling them locally.
 
 Earlier this year [[https://lists.gnu.org/archive/html/guix-devel/2024-07/msg00033.html][I announced on the guix mailing list]] that a new North American based Guix
-substitute server and build farm, cuirass.genenetwork.org was available for general use,
-thanks to the generous contribution of server hardware and infrastructure from
-GeneNetwork.org.
+substitute server and build farm, cuirass.genenetwork.org, was available for general use.
+This initiative was made possible by the generous contribution of server hardware and
+infrastructure from GeneNetwork.org.
+
+* Why Build Another Substitute Server?
+
+The Guix ecosystem thrives on diversity and decentralization. By establishing additional
+substitute servers, we achieve several critical objectives:
 
-This article provides further information about how the build farm and substitute server was
-setup, and how you can do so for yourself or your organization. Having more Guix substitutes
-servers available improves build diversity (which can be checked with [[https://guix.gnu.org/manual/en/html_node/Invoking-guix-challenge.html][guix challenge]]), as
-well as substitute availability and improved response times due to server locality.
+- **Improved Build Diversity**: Multiple independent build farms reduce the risk of
+  single-point-of-failure and increase the verification of build reproducibility.
+- **Reduced Latency**: Geographically distributed servers mean faster download times for
+  users in different regions.
+- **Increased Resilience**: If one substitute server is down, users can fall back to
+  alternatives.
+- **Community Contribution**: Each new substitute server strengthens the broader Guix
+  infrastructure.
 
-The GNU Guix Tennessee Build Farm took inspiration and referenced existing GNU Guix project
-infrastructure. You can see their full source code [[https://git.savannah.gnu.org/cgit/guix/maintenance.git][here]].
+This article provides a comprehensive guide to setting up a Guix build farm and substitute
+server, drawing inspiration from existing GNU Guix project infrastructure. You can see their
+full source code [[https://git.savannah.gnu.org/cgit/guix/maintenance.git][here]].
 
-* Setting up a Minimal Guix Build "Farm" and Substitute Server
+* Hardware and Infrastructure
 
-Though a Guix build farm and substitute server could be deployed on any distribution, we
-naturally chose to use Guix itself. There are a variety of components that provide the
-necessary functionality:
+The Tennessee Guix Build Farm was made possible through a collaboration with GeneNetwork.org,
+who provided the following server specifications:
 
-- [[https://guix.gnu.org/cuirass/][Cuirass]] :: Watches the the guix channel repository for changes, and manages building of
-  derivations, packages, etc..
-- [[https://guix.gnu.org/manual/en/html_node/Invoking-guix-publish.html][guix-publish]] :: Provides substitute archives for consumption by users (indirectly via nginx
-  as a local reverse proxy).
-- [[https://github.com/nginx/nginx][nginx]] :: Acts as a reverse proxy for Cuirass and guix-publish.
-- [[https://github.com/certbot/certbot][certbot]] :: Fetches ssl certificates so cuirass and substitutes can be served over https.
-- [[https://github.com/DigitaleGesellschaft/Anonip][anonip]] :: Anatomizes http logs to preserve user privacy.
+- **Processor**: Dual AMD EPYC 9274F 24-Core, 48 Thread Processors
+- **RAM**: 768 GB DDR5 ECC
+- **Storage**: 1 TB SSD
+- **Network**: 1 Gbps dedicated connection
+- **Location**: University of Tennessee, Knoxville Data Center
 
-How each of these components are setup is detailed below, component-by-component. You can see
-the full source-code for the Tennessee build farm at
-https://git.genenetwork.org/guix-north-america/.
+These robust specifications allow for efficient package building, caching, and serving of
+substitutes for the Guix community.
+
+* Components of the Guix Build Farm
 
 ** Cuirass - building packages
 
@@ -71,10 +85,8 @@ We are going to setup nginx as a reverse proxy for cuirass later on, so we'll se
 localhost, and pass along the specifications we defined earlier.
 
 #+begin_src scheme
-  (service cuirass-service-type
-           (cuirass-configuration
-            (host "localhost")
-            (specifications %cuirass-specs)))
+  (service cuirass-service-type (cuirass-configuration (host "localhost") (specifications
+  %cuirass-specs)))
 #+end_src
 
 ** Provide Substitutes using Guix Publish
@@ -183,11 +195,8 @@ to lookup a certificate or private key file by host in order to reference them w
 configuring Nginx tls.
 
 #+begin_src scheme
-  (define* (le host #:optional privkey)
-    (string-append "/etc/letsencrypt/live/"
-                   host "/"
-                   (if privkey "privkey" "fullchain")
-                   ".pem"))
+  (define* (le host #:optional privkey) (string-append "/etc/letsencrypt/live/" host "/" (if
+  privkey "privkey" "fullchain") ".pem"))
 #+end_src
 
 *** Configure Nginx Location block for ~guix-publish~
@@ -347,64 +356,33 @@ Lets look and explain the purpose of each location-configuration.
 
 #+begin_src scheme
   ;; Try to prevent good-faith crawlers from downloading substitutes.
-  (nginx-location-configuration
-   (uri "= /robots.txt")
-   (body
-    (list
-     #~(string-append "try_files "
-                      #$(plain-file "robots.txt" publish-robots.txt)
-                      " =404;")
-     "root /;")))
+  (nginx-location-configuration (uri "= /robots.txt") (body (list #~(string-append "try_files "
+  #$(plain-file "robots.txt" publish-robots.txt) " =404;") "root /;")))
 #+end_src
 
 *** TODO Nginx locations (FIND BETTER NAME)
 
 #+begin_src scheme
-  (define (balg02-locations publish-url)
-    "Return nginx location blocks with 'guix publish' reachable at
-  PUBLISH-URL."
-    (append (publish-locations publish-url)
-            (list
-             ;; Cuirass.
-             (nginx-location-configuration
-              (uri "/")
-              (body (list "proxy_pass http://localhost:8081;"
-                          ;; ;; See
-                          ;; ;; <https://community.torproject.org/onion-services/advanced/onion-location/>.
-                          ;; (string-append
-                          ;; "add_header Onion-Location http://" %ci-onion
-                          ;; "$request_uri;")
-                          )))
-             (nginx-location-configuration
-              (uri "~ ^/admin")
-              (body
-               (list "if ($ssl_client_verify != SUCCESS) { return 403; } proxy_pass http://localhost:8081;")))
-
-             (nginx-location-configuration
-              (uri "/static")
-              (body
-               (list
-                "proxy_pass http://localhost:8081;"
-                ;; Cuirass adds a 'Cache-Control' header, honor it.
-                "proxy_cache static;"
-                "proxy_cache_valid 200 2d;"
-                "proxy_cache_valid any 10m;"
-                "proxy_ignore_client_abort on;")))
-
-             (nginx-location-configuration
-              (uri "/download") ;Cuirass "build products"
-              (body
-               (list
-                "proxy_pass http://localhost:8081;"
-                "expires 10d;" ;override 'Cache-Control'
-                "proxy_cache static;"
-                "proxy_cache_valid 200 30d;"
-                "proxy_cache_valid any 10m;"
-                "proxy_ignore_client_abort on;")))
-
-             (nginx-location-configuration ;certbot
-              (uri "/.well-known")
-              (body (list "root /var/www;"))))))
+  (define (balg02-locations publish-url) "Return nginx location blocks with 'guix publish'
+  reachable at PUBLISH-URL." (append (publish-locations publish-url) (list ;; Cuirass.
+  (nginx-location-configuration (uri "/") (body (list "proxy_pass http://localhost:8081;" ;; ;;
+  See ;; ;; <https://community.torproject.org/onion-services/advanced/onion-location/>. ;;
+  (string-append ;; "add_header Onion-Location http://" %ci-onion ;; "$request_uri;") )))
+  (nginx-location-configuration (uri "~ ^/admin") (body (list "if ($ssl_client_verify !=
+  SUCCESS) { return 403; } proxy_pass http://localhost:8081;")))
+
+  (nginx-location-configuration (uri "/static") (body (list "proxy_pass
+  http://localhost:8081;" ;; Cuirass adds a 'Cache-Control' header, honor it. "proxy_cache
+  static;" "proxy_cache_valid 200 2d;" "proxy_cache_valid any 10m;" "proxy_ignore_client_abort
+  on;")))
+
+  (nginx-location-configuration (uri "/download") ;Cuirass "build products" (body
+  (list "proxy_pass http://localhost:8081;" "expires 10d;" ;override 'Cache-Control'
+  "proxy_cache static;" "proxy_cache_valid 200 30d;" "proxy_cache_valid any 10m;"
+  "proxy_ignore_client_abort on;")))
+
+  (nginx-location-configuration ;certbot (uri "/.well-known") (body (list "root
+  /var/www;"))))))
 #+end_src
 
 *** TODO Configure Nginx Server Blocks
@@ -536,24 +514,13 @@ Lets look and explain the purpose of each location-configuration.
 #+end_src
 
 #+begin_src scheme
-  (define %nginx-configuration
-    (nginx-configuration
-     (server-blocks %balg02-servers)
-     (server-names-hash-bucket-size 128)
-     (modules
-      (list
-       ;; Module to redirect users to the localized pages of their choice.
-       (file-append nginx-accept-language-module
-                    "/etc/nginx/modules/ngx_http_accept_language_module.so")))
-     (global-directives
-      '((worker_processes . 16)
-        (pcre_jit . on)
-        (events . ((worker_connections . 1024)))))
-     (extra-content
-      (string-join %extra-content "\n"))
-     (shepherd-requirement
-      (map log-file->anonip-service-name
-           %anonip-nginx-log-files))))
+  (define %nginx-configuration (nginx-configuration (server-blocks %balg02-servers)
+  (server-names-hash-bucket-size 128) (modules (list ;; Module to redirect users to the
+  localized pages of their choice. (file-append nginx-accept-language-module
+  "/etc/nginx/modules/ngx_http_accept_language_module.so"))) (global-directives
+  '((worker_processes . 16) (pcre_jit . on) (events . ((worker_connections . 1024)))))
+  (extra-content (string-join %extra-content "\n")) (shepherd-requirement (map
+  log-file->anonip-service-name %anonip-nginx-log-files))))
 #+end_src
 
 *** Cache activation
@@ -603,35 +570,60 @@ configuration, finalization our configuration of nginx.
 #+end_src
 
 #+begin_src scheme
-  (modify-services %base-services
-    (guix-service-type config => (guix-daemon-config
-                                  #:substitute-urls
-                                  '("https://cuirass.genenetwork.org")
-                                  #:max-jobs 20
-                                  #:cores 4
-                                  #:authorized-keys
-                                  (cons
-                                   (local-file "../../../.pubkeys/guix/cuirass.genenetwork.org.pub")
-                                   %default-authorized-guix-keys)
-                                  #:build-accounts-to-max-jobs-ratio 5)))
+  (modify-services %base-services (guix-service-type config => (guix-daemon-config
+  #:substitute-urls '("https://cuirass.genenetwork.org") #:max-jobs 20 #:cores 4
+  #:authorized-keys (cons (local-file "../../../.pubkeys/guix/cuirass.genenetwork.org.pub")
+  %default-authorized-guix-keys) #:build-accounts-to-max-jobs-ratio 5)))
 #+end_src
 
 ** Optional
 
 *** Onion service
 
-* Setup
+
+** Setup
 
 TODO: talk about setup of Tennessee Guix Build Farm and Substitute Server specifics (eg. remote
 install)
 
-** Guix Configuration as a Channel
+*** Guix Configuration as a Channel
 
 TODO: talk about how guix-na is itself a channel.
 
+* Challenges and Lessons Learned
+
+Setting up a public Guix substitute server is not without its challenges:
+
+1. **Performance Tuning**: Configuring Cuirass and the Guix daemon to efficiently use
+   available resources required careful optimization.
+2. **Privacy Considerations**: Implementing IP anonymization with anonip was crucial to
+   protect user privacy.
+3. **Bandwidth and Storage Management**: Implementing intelligent caching strategies to
+   manage storage and network resources.
+
+Luckily, many of these challenges had already been sorted out by existing Guix build farms,
+making this endeavor much easier.
+
+* Future Roadmap
+
+Looking ahead, we have several goals for the cuirass.genenetwork.org substitute server:
+
+- Collaborate with Guix maintainers to potentially include this server in the included list
+  of default Guix substitute servers
+- Expand build coverage to include more architectures and specialized packages
+- Implement more sophisticated monitoring and performance tracking
+- Explore potential partnerships with other academic and research institutions
+
 * Conclusion
 
-TODO: ...
+The Tennessee Guix Build Farm represents more than just a technical infrastructure project.
+It embodies the spirit of open-source collaboration, community-driven development, and the
+principles of reproducible computing. By providing a robust, privacy-conscious substitute
+server, we hope to contribute to the growth and accessibility of the GNU Guix ecosystem.
+
+We invite other organizations, universities, and community members to consider setting up
+their own substitute servers. Each new node makes the Guix network stronger, more resilient,
+and more accessible.
 
-- In the future, we hope to work with Guix maintainers to include this substitute server as
-  one of the provided Guix System defaults.
+**Acknowledgments**: Special thanks to GeneNetwork.org for their hardware support and the GNU
+  Guix community for their ongoing innovation in package management and reproducible systems.