Diffstat (limited to 'README.org')
-rw-r--r-- README.org 133
1 files changed, 100 insertions, 33 deletions
diff --git a/README.org b/README.org
index c599a31..0b6f8bc 100644
--- a/README.org
+++ b/README.org
@@ -1,58 +1,120 @@
 * guix-bioinformatics
 
 IMPORTANT: this repository lives at https://git.genenetwork.org/guix-bioinformatics/!
-Older packages have been moved to https://git.genenetwork.org/guix-bioinformatics-past/.
 
-Bioinformatics packages for Guix that are used in
-https://genenetwork.org/ and some other places.  See Guix documentation and [[https://gitlab.com/pjotrp/guix-notes/blob/master/HACKING.org][Guix notes]] for
-installing and hacking Guix. Other channels of bioinformatics
-interest can be found at
-
-1. https://github.com/BIMSBbioinfo
-2. https://github.com/UMCUGenetics/guix-additions
-3. https://github.com/ekg/guix-genomics
-
-See [[https://github.com/franzos/awesome-guix][awesome guix]] for a list of channels.
-
-To easily use the packages from this repo, simply add it to your
-`channels` list in ~/.config/guix/channels.scm as described
-[[https://guix.gnu.org/manual/en/html_node/Channels.html][here]]:
+Over 300 older packages have been moved to https://git.genenetwork.org/guix-bioinformatics-past/. Check the README there to see which packages it contains.
+
+This repository provides over 300 bioinformatics packages for Guix that are used in https://genenetwork.org/ and some other places,
+mostly targeting genomics, pangenomics and genetics.
+
+** Pangenome tools (pangenomes meta-package)
+
+The =pangenomes= meta-package provides a comprehensive pangenomics toolkit:
+
+| Tool           | Version      | Description                                     |
+|----------------+--------------+-------------------------------------------------|
+| pggb           | 0.7.4        | PanGenome Graph Builder pipeline                |
+| wfmash         | 0.14.0       | Whole-genome Fuzzy Mapping and Alignment        |
+| seqwish        | 0.7.11       | Sequence graph induction from alignments        |
+| smoothxg       | 0.8.2        | Graph normalization via partial order alignment |
+| odgi           | 0.9.0        | Optimized Dynamic Genome/Graph Implementation   |
+| vg             | 1.72.0       | Variation graph toolkit                         |
+| impg           | 0.4.1        | Implicit pangenome graph queries                |
+| minimap2       | 2.28         | Fast pairwise aligner (from Guix upstream)      |
+| bwa-mem2       | 2.3          | Burrows-Wheeler Aligner for short reads         |
+| samtools       | 1.19         | SAM/BAM/CRAM manipulation (from Guix upstream)  |
+| htslib         | 1.21         | HTSlib C library (from Guix upstream)           |
+| bedtools       | 2.31.1       | Genome interval tools (from Guix upstream)      |
+| bcftools       | 1.21         | VCF/BCF manipulation (from Guix upstream)       |
+| vcflib         | 1.0.15       | VCF manipulation library and tools              |
+| vcfbub         | 0.1.0        | VCF bubble popping                              |
+| bandage-ng     | 2026.4.1     | Assembly graph visualizer (Qt6)                 |
+| gfalook        | 0.1.0        | GFA visualization (odgi viz reimplementation)   |
+| pafplot        | 0.1.0        | PAF alignment dotplot renderer                  |
+| wally          | 0.7.1        | Structural variant visualization                |
+| agc            | 2.1          | Assembled Genomes Compressor                    |
+| cigzip         | 0.1.0        | CIGAR compression with tracepoints              |
+| cosigt         | 0.1.7        | Pangenome haplotype genotyping                  |
+| gfainject      | 0.1.0        | BAM-to-GAF graph injection                      |
+| gafpack        | 0.0.0        | GAF coverage vector extraction                  |
+| gfaffix        | 0.2.1        | Walk-preserving graph simplification            |
+| gfautil        | 0.4.0        | GFA format utilities                            |
+| fastga-rs      | 0.1.2        | Fast genome aligner (Rust)                      |
+| fastix         | 0.1.0        | FASTA header prefix renaming (PanSN)            |
+| kfilt          | 0.1.1        | K-mer filtering                                 |
+| meryl          | 1.4.1        | K-mer counting and set operations               |
+| miniprot       | 0.18         | Protein-to-genome aligner                       |
+| pangene        | 1.1          | Gene-level pangenome analysis                   |
+| rtg-tools      | 3.13         | VCF evaluation (vcfeval)                        |
+
+** MEMPANG workshop (mempang-workshop meta-package)
+
+Extends =pangenomes= with R plotting, Python, and general utilities
+for the MEMPANG pangenome workshop tutorials:
+
+| Category       | Packages                                             |
+|----------------+------------------------------------------------------|
+| R packages     | r-ggplot2, r-tidyverse, r-ape, r-ggtree, r-gggenes   |
+| Python         | python, python-igraph, python-pycairo                |
+| Utilities      | graphviz, gnuplot, parallel, pigz, wget, zstd, bc    |
+| QC             | multiqc, mummer                                      |
+
+** GeneNetwork packages
+
+| Package              | Version      | Description                           |
+|----------------------+--------------+---------------------------------------|
+| genenetwork2         | 3.11         | GeneNetwork2 web application          |
+| genenetwork3         | 0.1.0        | GeneNetwork3 REST API                 |
+| gn-auth              | 1.0.1        | GN authentication service             |
+| gn-guile             | 4.0.0        | Guile utilities for GN                |
+| gn-libs              | 0.0.0        | Shared Python libraries               |
+| gn-uploader          | 0.1.1        | Data uploader                         |
+| gemma-wrapper        | 0.99.6       | GEMMA CLI wrapper                     |
+| gemma-gn2            | 0.98.5       | GEMMA for GeneNetwork2                |
+| genecup              | 1.8          | GeneCup literature mining             |
+
+See Guix documentation and [[https://gitlab.com/pjotrp/guix-notes/blob/master/HACKING.org][Guix notes]] for installing and hacking Guix.
+
+See [[https://github.com/franzos/awesome-guix][awesome guix]] for a list of other channels.
+
+To easily use the packages from this repo, simply add it to your `channels` list in ~/.config/guix/channels.scm as described [[https://guix.gnu.org/manual/en/html_node/Channels.html][here]]:
 
 #+BEGIN_SRC scheme
-  ;; channels.scm
+  ;; example channels.scm
   (list (channel
-         (name 'gn-bioinformatics)
+         (name 'guix-bioinformatics)
          (url "https://git.genenetwork.org/guix-bioinformatics")
-         (branch "master")))
+         (branch "main")))
 #+END_SRC
 
-The channel file actually accesses https://git.genenetwork.org/guix-bioinformatics/tree/.guix-channel which pulls other channels and fixates the hashes.
-
 and run /guix pull/ like normal to update your software. E.g.
 
 #+BEGIN_SRC sh
-  guix pull --url=https://codeberg.org/guix/guix -p ~/opt/guix-b0fa1dc --commit=b0fa1dc --channels=channels.scm
+  guix pull --url=https://codeberg.org/guix/guix -p ~/opt/guix-bioinformatics --channels=channels.scm
 #+END_SRC
 
-(the commit hash can be found from the guix you want to run with /guix -V/, it speeds up installation and makes it reproducible).
+The channel file actually accesses https://git.genenetwork.org/guix-bioinformatics/tree/.guix-channel, which pulls in other channels and pins their commit hashes. To pin Guix itself, add /--commit=HASH/ to the command above (earlier versions of this README used b0fa1dc); the hash can be found by running /guix -V/ with the guix you want to use. Pinning speeds up installation and makes it reproducible. Note that the upstream channel may override that version.
 
 The latest channel file used by our CI/CD is available at https://ci.genenetwork.org/channels.scm.
 
-This is the recommended way to use the software from this repository and the code snippets in this README assume you have done so. In order to maintain stability, the guix-bioinformatics channel depends on a specific commit of upstream Guix. So, it is recommended to isolate use of the guix-bioinformatics channel in a separate /guix pull/ profile. That is described [[https://issues.genenetwork.org/topics/guix-profiles][here]].
-If you want to make changes to the packages in this repo you can set the GUIX_PACKAGE_PATH to point to the root of this directory before running Guix. E.g.
+To maintain stability, the guix-bioinformatics channel depends on a specific commit of upstream Guix. It is therefore recommended to isolate use of the guix-bioinformatics channel in a separate /guix pull/ profile, as described [[https://issues.genenetwork.org/topics/guix-profiles][here]].
+
+You can use the --tune=native switch to optimize performance when installing pangenome tools and gemma.
+
+* Development tips
+
+** Modify the load path
+
+If you want to make changes to the packages in this repo you can set the GUIX_PACKAGE_PATH (or use the -L switch) to point to the root of this directory before running Guix. E.g.
 
 #+BEGIN_SRC bash
       git clone https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics.git
-      git clone https://gitlab.inria.fr/guix-hpc/guix-past.git
-      export GUIX_PACKAGE_PATH=$PWD/guix-bioinformatics/:$PWD/guix-past/modules
       guix package -A cwl
 #+END_SRC
 
-* Development tips
-
 ** Override individual packages
 
-The cheerful way of overriding a version of a package:
+The cheap and cheerful way of overriding a version of a package:
 
 #+BEGIN_SRC scheme
     (use-modules (guix) (gnu packages emacs))
@@ -73,14 +135,19 @@ We run our own substitution server. Add the key to your machine as
 root with
 
 : guix archive --authorize < tux02-guix-substitutions-public-key.txt
-: guix build -L ~/guix-rust-past-crates/modules/ -L ~/guix-bioinformatics/ -L ~/guix-past/modules/  --substitute-urls="https://cuirass.genenetwork.org https://ci.guix.gnu.org https://bordeaux.guix.gnu.org https://guix.genenetwork.org" hello
+: guix build -L ~/guix-bioinformatics/ --substitute-urls="https://cuirass.genenetwork.org https://ci.guix.gnu.org https://bordeaux.guix.gnu.org https://guix.genenetwork.org" hello
+
+* Testing the build
+
+All important packages are listed in manifest.scm.example. Test with
+
+: guix build -L . -m manifest.scm.example --tune=native
 
-* Important note on AI
+* An important note on AI
 
 The packages in the guix-bioinformatics channel are generally written with the help of AI. Only the directory ./gnu/packages contains software that was crafted by hand without the help of AI.
 The packages in this directory align with Guix policy and may be upstreamed to guix trunk.
 
 * LICENSE
 
-These package descriptions (so-called Guix expressions) are
-distributed by the same license as Guix, i.e. GPL3+
+These package descriptions (so-called Guix expressions) are distributed under the same license as Guix, i.e. GPL3+.
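
The "Modify the load path" section in the new README mentions GUIX_PACKAGE_PATH and the -L switch, while its snippet only clones the repository and lists packages. A minimal sketch of both options follows. It assumes the clone sits in ./guix-bioinformatics (the directory name produced by the git clone command in the README); the `checkout` variable is a name introduced here for illustration only.

```sh
# Assumption: the repository was cloned into ./guix-bioinformatics,
# as done by the git clone command in the README.
checkout="$PWD/guix-bioinformatics"

# Option 1: environment variable, seen by every later guix command in this shell.
export GUIX_PACKAGE_PATH="$checkout"

if command -v guix >/dev/null 2>&1 && [ -d "$checkout" ]; then
  # List available packages matching "cwl", now including those from the checkout.
  guix package -A cwl
  # Option 2: per-command -L switch, which adds the checkout for this call only.
  guix package -L "$checkout" -A cwl
else
  echo "guix or $checkout not found; nothing to list"
fi
```

The -L form is the one the README itself uses for /guix build -L ~/guix-bioinformatics/ .../ and /guix build -L . -m manifest.scm.example/, so it is probably the more convenient choice for one-off commands, while the environment variable suits a longer session.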