35 files changed, 2282 insertions, 682 deletions
diff --git a/.gitignore b/.gitignore index ef520c8..86c3228 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,10 @@ +build/ +PanGemma.make +gemma.make +gemmalib.make +Makefile +gdb.txt +*.prof *.o *.tar.gz src/Eigen @@ -12,3 +19,4 @@ doc/manual.log doc/manual.out doc/manual.toc contrib/ +.aider* diff --git a/Makefile b/Makefile.orig index 1009a6e..1009a6e 100644 --- a/Makefile +++ b/Makefile.orig diff --git a/README.md b/README.md index e13ca36..3239ab9 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,11 @@ This repository is used to rewrite and modernize the original GEMMA tool. The idea is to upgrade the software while keeping it going, using ideas from Hanson and Sussman's book on *Software Design for Flexibility: How to Avoid Programming Yourself into a Corner*. It is work in progress (WIP). For more information see [PanGEMMA design](./doc/code/pangemma.md). -GEMMA is the original software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale data sets. You can find the original code on [github](https://github.com/genetics-statistics/GEMMA). +GEMMA is the original software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale data sets. You can find the original code on [github](https://github.com/genetics-statistics/GEMMA). It may even build from this repo for the time being. + +NOTE: As of December 2024, main software development has moved to [PanGEMMA](https://git.genenetwork.org/pangemma/about/)! +Pangemma is essentially a fork of GEMMA that is meant to scale up for pangenomics. We are also taking the opportunity to revamp the code base. GEMMA itself is in *maintenance* mode. + Check out [RELEASE-NOTES.md](./RELEASE-NOTES.md) to see what's new in each release. @@ -42,15 +46,10 @@ WIP ## Run GEMMA -GEMMA is run from the command line. To run gemma +Pangemma (for now) maintains a version of gemma and may support new features. Run the legacy version with: -```sh -gemma -h ``` - -a typical example would be - -```sh +gemma -h # compute Kinship matrix gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt \ -gk -o mouse_hs1940 @@ -60,44 +59,16 @@ gemma -g ./example/mouse_hs1940.geno.txt.gz \ -k ./output/mouse_hs1940.cXX.txt -lmm -o mouse_hs1940_CD8_lmm ``` -Above example files are in the git repo and can be downloaded from -[github](https://github.com/genetics-statistics/GEMMA/tree/master/example). - ### Debugging and optimization -GEMMA has a wide range of debugging options which can be viewed with +We use guix for debugging and development. Try something like ``` - DEBUG OPTIONS - - -check enable checks (slower) - -no-fpe-check disable hardware floating point checking - -strict strict mode will stop when there is a problem - -silence silent terminal display - -debug debug output - -debug-data debug data output - -debug-dump -debug-data, but store the data to files (grep write() calls for messages/names) - -nind [num] read up to num individuals - -issue [num] enable tests relevant to issue tracker - -legacy run gemma in legacy mode +LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib gdb --args ./build/bin/Debug/gemma -g ./example/mouse_hs1940.geno.txt.gz \ + -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt \ + -k ./output/mouse_hs1940.cXX.txt -lmm -o mouse_hs1940_CD8_lmm ``` -typically when running gemma you should use -debug which includes -relevant checks.
When compiled for debugging the debug version of -GEMMA gives more information. - -For performance you may want to use the -no-check option. Also check -the build optimization notes in [INSTALL.md](INSTALL.md). - -## Help - -+ [The GEMMA manual](doc/manual.pdf). - -+ [Detailed example with HS mouse data](example/demo.txt). - -+ [Tutorial on GEMMA for genome-wide association -analysis](https://github.com/rcc-uchicago/genetic-data-analysis-2). - ## Citing PanGEMMA PanGEMMA is not published yet. @@ -132,35 +103,21 @@ studies.](https://doi.org/10.1101/042846) *Annals of Applied Statistics*, in pre ## License -PanGEMMA Copyright (C) 2012–2025, Pjotr Prins, Xiang Zhou and others (see the soure file headers and git log). - -The *PanGEMMA* and *GEMMA* source code repository is free software: you can redistribute it under the terms of the [GNU General Public License](http://www.gnu.org/licenses/gpl.html). All the files in this project are part of *GEMMA*. This project is distributed in the hope that it will be useful, but **without any warranty**; without even the implied warranty of **merchantability or fitness for a particular purpose**. See file [LICENSE](LICENSE) for the full text of the license. - -Both the source code for the [gzstream zlib wrapper](http://www.cs.unc.edu/Research/compgeom/gzstream/) and [shUnit2](https://github.com/genenetwork/shunit2) unit testing framework included in GEMMA are distributed under the [GNU Lesser General Public License](contrib/shunit2-2.0.3/doc/LGPL-2.1), either version 2.1 of the License, or (at your option) any later revision. - -The source code for the included [Catch](http://catch-lib.net) unit testing framework is distributed under the [Boost Software Licence version 1](https://github.com/philsquared/Catch/blob/master/LICENSE.txt). +Copyright (C) 2012–2025, Pjotr Prins & Xiang Zhou -## Optimizing performance +The *PanGEMMA* source code is free software: you can redistribute it under the terms of the [GNU General Public License](http://www.gnu.org/licenses/gpl.html). All the files in this project are part of *PanGEMMA*. This project is distributed in the hope that it will be useful, but **without any warranty**; without even the implied warranty of **merchantability or fitness for a particular purpose**. See file [LICENSE](LICENSE) for the full text of the license. -Precompiled binaries and libraries may not be optimal for your particular hardware. See [INSTALL.md](INSTALL.md) for speeding up tips. +The original source code for *GEMMA* that is part of *PanGEMMA* is distributed under the same GPL license. -## Building from source - -More information on source code, dependencies and installation can be found in [INSTALL.md](INSTALL.md). - -## Input data formats - -## Contributing code, reporting a PanGEMMA bug or issue - -WIP +The source code for the [gzstream zlib wrapper](http://www.cs.unc.edu/Research/compgeom/gzstream/) (still) included in GEMMA is distributed under the [GNU Lesser General Public License](contrib/shunit2-2.0.3/doc/LGPL-2.1), either version 2.1 of the License, or (at your option) any later revision. ## Code of conduct -By using GEMMA and communicating with its communtity you implicitely agree to abide by the [code of conduct](https://software-carpentry.org/conduct/) as published by the Software Carpentry initiative. +By being part of the PanGEMMA community and communicating with its members, you implicitly agree to abide by the [code of conduct](https://software-carpentry.org/conduct/) as published by the Software Carpentry initiative.
## Credits -The *PanGEMMA* software was developmed +The *PanGEMMA* software was developed by [Pjotr Prins](http://thebird.nl/)<br> Dept. of Genetics, Genomics and Informatics<br> @@ -172,5 +129,5 @@ The *GEMMA* software was developed by: Dept. of Biostatistics<br> University of Michigan<br> -with contributions from Peter Carbonetto, Tim Flutre, Matthew Stephens, +with (early) contributions from Peter Carbonetto, Tim Flutre, Matthew Stephens, and [others](https://github.com/genetics-statistics/GEMMA/graphs/contributors). diff --git a/doc/code/pangemma.md b/doc/code/pangemma.md index 1c3fed9..113b82c 100644 --- a/doc/code/pangemma.md +++ b/doc/code/pangemma.md @@ -1,43 +1,14 @@ # PanGEMMA -We are rewriting and modernizing the much beloved GEMMA tool that is, -for example, core to GeneNetwork.org. -The idea is to upgrade the software, and keep it going using ideas -from Hanson and Sussman's book on *Software Design for Flexibility: -How to Avoid Programming Yourself into a Corner*. -This is not the first attempt, in fact quite a few efforts have -started, but none really finished! - -We want to keep the heart of GEMMA beating while upgrading the -environment taking inspiration from fetal heart development: The human -heart is one of the first organs to form and function during -embryogenesis. By the end of gestational week 3, passive oxygen -diffusion becomes insufficient to support metabolism of the developing -embryo, and thus the fetal heart becomes vital for oxygen and nutrient -distribution. The initiation of the first heart beat via the -*primitive heart tube* begins at gestational day 22, followed by -active fetal blood circulation by the end of week 4. The start of -early heart development involves several types of progenitor cells -that are derived from the mesoderm, proepicardium, and neural crest. -This eventually leads to the formation of the 4-chambered heart by -gestational week 7 via heart looping and complex cellular interactions -in utero (e.g., [Tan and -Lewandowski](https://doi.org/10.1159/000501906)). - -What we will do is create components and wire them together, allowing -for sharing RAM between components. Each component may have multiple -implementations. We will introduce a DSL for orchestration and we may -introduce a propagator network to run components in parallel and test -them for correctness. At the same time, the core functionality of -GEMMA will keep going while we swap components in and out. See also -[propagators](https://groups.csail.mit.edu/mac/users/gjs/propagators/) -and [examples](https://github.com/namin/propagators/blob/master/examples/multiple-dwelling.scm). - -Propagators are [Unix pipes on steroids](https://github.com/pjotrp/ruby-propagator-network/blob/main/README.md). - -We want PanGEMMA to be able to run on high-performance computing (HPC) -architectures, including GPUs. This implies the core project can have -few dependencies and should easily compile from C. +We are rewriting and modernizing the much beloved GEMMA tool that is core to GeneNetwork.org. The idea is to upgrade the software, and keep it going using ideas from Hanson and Sussman's book on *Software Design for Flexibility: How to Avoid Programming Yourself into a Corner*. This is not the first attempt, in fact quite a few efforts have started, but none have really finished! For some reason we painted ourselves into a corner (indeed). 
+ +Importantly, we want to keep the heart of GEMMA beating while upgrading the environment taking inspiration from fetal heart development: The human heart is one of the first organs to form and function during embryogenesis. By the end of gestational week 3, passive oxygen diffusion becomes insufficient to support metabolism of the developing embryo, and thus the fetal heart becomes vital for oxygen and nutrient distribution. The initiation of the first heart beat via the *primitive heart tube* begins at gestational day 22, followed by active fetal blood circulation by the end of week 4. The start of early heart development involves several types of progenitor cells that are derived from the mesoderm, proepicardium, and neural crest. This eventually leads to the formation of the 4-chambered heart by gestational week 7 via heart looping and complex cellular interactions in utero (e.g., [Tan and Lewandowski](https://doi.org/10.1159/000501906)). + +What we will do is create components and wire them together, allowing for sharing memory between components using Linux mmap techniques --- mostly using lmdb that is supported on non-Linux platforms. Each component may have multiple implementations. We will introduce a DSL for orchestration and we may introduce a propagator network to run components in parallel and test them for correctness. At the same time, the core functionality of GEMMA will keep going while we swap components in and out. See also [propagators](https://groups.csail.mit.edu/mac/users/gjs/propagators/) and [examples](https://github.com/namin/propagators/blob/master/examples/multiple-dwelling.scm). + +I wrote a piece with examples on propagators too. Propagators are [Unix pipes on steroids](https://github.com/pjotrp/ruby-propagator-network/blob/main/README.md). + +We want PanGEMMA to be able to run on high-performance computing (HPC) architectures, including GPUs. This implies the core project can have few dependencies and should easily compile from C. # Innovation @@ -47,49 +18,28 @@ few dependencies and should easily compile from C. ## Breaking with the past -The original gemma source base is considered stable and will be -maintained - mostly to prevent bit rot. See -https://github.com/genetics-statistics/GEMMA. To move forward we -forked pangemma to be able to break with the past. +The original gemma source base is considered stable and will be maintained - mostly to prevent bit rot. See https://github.com/genetics-statistics/GEMMA. To move forward we forked pangemma to be able to break with the past. -Even so, pangemma is supposed to be able to run the same steps as the -original gemma. And hopefully improve things. +Even so, pangemma is supposed to be able to run the same steps as the original gemma. And hopefully improve things. ## Adapting the old gemma code -For running the old gemma code we need to break up the code base into -pieces to run in a propagator network (see below). -Initially we can use the file system to pass state around. That will -break up the implicit global state that gemma carries right now and -makes working with the code rather tricky. -Note that we don't have the goal of making gemma more efficient -because people can still use the old stable code base. -Essentially we are going to add flags to the binary that will run -gemma partially by exporting and importing intermediate state. +For running the old gemma code we need to break up the code base into pieces to run in a propagator network (see below). 
Initially we can use the file system to pass state around. That will break up the implicit global state that gemma carries right now, which makes working with the existing code base rather tricky. Note that we don't have the initial goal of making gemma more efficient because people can still use the old stable code base. Essentially we are going to add flags to the binary that will run gemma partially by exporting and importing intermediate state. -Later, when things work in a propagator network, we can create -alternatives that pass state around in memory. +Later, when things work in a propagator network, we can create alternatives that pass state around in memory and make pangemma more efficient. # A simple propagator network -We will create cells that hold basic computations and functions. We -won't do a full propagator setup, though we may do a full -implementation later. -For now we use a network of cells - essentially a dependency graph of -computations. Cells can tell other cells that they require them and -that allows for alternate paths. E.g. to create a kinship matrix: +We will create cells that hold basic computations and functions. We won't do a full propagator setup, though we may do a full implementation later. For now we use a network of cells - essentially a dependency graph of computations. Cells hold state. Cells can tell other cells that they require them and that allows for alternate paths. E.g. to create a kinship matrix: ``` (define-cell genotypes) -(define-cell kinship-matrix (e:kinship genotypes) +(define-cell kinship-matrix (e:calc-kinship genotypes)) (run) (content kinship-matrix) ``` -essentially e:kinship gets run when genotypes are available. It is -kinda reversed programming. Now say we want to add an input and a -filter: - +Essentially, e:calc-kinship gets run when genotypes are available. Now say we want to add an input and a filter: ``` (define-cell bimbam-genofile) @@ -122,84 +72,44 @@ runs one path and will run both. -This simple example shows how complex running logic can be -described without (1) predicting how people will use the software and -(2) no complex if/then statements. +This simple example shows how complex running logic can be described without (1) predicting how people will use the software and (2) writing complex if/then statements. -Why does this matter for GEMMA? It will allow us to run old parts of -GEMMA as part of the network in addition to new parts - and -potentially validate them. It also allows us to create multiple -implementations, such as for CPU and GPU, that can run in parallel and -validate each other's outcomes. +Why does this matter for GEMMA? It will allow us to run old parts of GEMMA as part of the network in addition to new parts - and potentially validate them. It also allows us to create multiple implementations, such as for CPU and GPU, that can run in parallel and validate each other's outcomes. ## Create the first cells -Let's start with default GEMMA behaviour and create the first cells to -get to exporting the kinship-matrix above. +Let's start with default GEMMA behaviour and create the first cells to get to exporting the kinship-matrix above. ``` (content kinship-matrix) ``` -We'll break this down by inspecting cell levels next. The genofile -cells contain a file name (string). -When the file is read we will capture the state in a file and carry -the file name of the new file forward. -In this setup cells simply represent file names (for state). -This allows us to simply read and write files in the C code.
-Later, when we wire up new propagators we may carry this state in -memory. The whole point of using this approach is that we really don't -have to care! - -Our first simple implementation of the cell will simply contain a -string referring to a file. -Cells can be works-in-progress and incrementally improved. +We'll break this down by inspecting cell levels next. The genofile cells contain a file name (string). When the file is read we will capture the state in a file and carry the file name of the new file forward. In this setup cells simply represent file names (for state). This allows us to simply read and write files in the C code. Later, when we wire up new propagators we may carry this state in memory. The whole point of using this approach is that we really don't have to care! + +Our first simple implementation of the cell will simply contain a string referring to a file. Cells can be works-in-progress and incrementally improved. ## Create propagators -A propagator contains a list of inputs and an output cell. So we wire -up the graph by adding inputs to propagators. -Every propagator has state (too), that is, it may be idle, computing, or done. +A propagator contains a list of inputs and an output cell. So we wire up the graph by adding inputs to propagators. Every propagator has state (too), that is, it may be idle, computing, or done. ## The runner -The runner visits the list of propagators and checks whether the -inputs are complete and whether they have changed. On detecting a -change, a computation has to be triggered to update the output cell. +The runner visits the list of propagators and checks whether the inputs are complete and whether they have changed. On detecting a change, a computation has to be triggered to update the output cell. ## Setting check points in GEMMA -GEMMA is quite stateful in its original design. We want to break the -work up into chunks setting 'check points'. For example the actual -kinship multiplication could start as 'start-compute-kinship' and end -with 'compute-kinship' check points. To not have to deal with state -too much we can simply let GEMMA run from the start of the program -until 'compute-kinship' to have a kinship-propagator. The output will -be a kinship file. Similarly we can run until 'filter-genotypes' that -is a step earlier. The output of these propagators can be used by -other PanGEMMA propagators as input for comparison and continuation. -All the original GEMMA does is act as a reference for alternative -implementation of these chunks. Speed is not a concern though there -may be opportunities to start compute after some of these check-points -(using intermediate output) down the line. +GEMMA is rather stateful in its original design. We want to break the work up into chunks setting 'check points'. For example the actual kinship multiplication could start as 'start-compute-kinship' and end with 'compute-kinship' check points. To not have to deal with state too much we can simply let GEMMA run from the start of the program until 'compute-kinship' to have a kinship-propagator. The output will be a kinship file. Similarly we can run until 'filter-genotypes' that is a step earlier. The output of these propagators can be used by other PanGEMMA propagators as input for comparison and continuation. All the original GEMMA does is act as a reference for alternative implementation of these chunks. 
Speed is not a concern, though there may be opportunities to start computing after some of these check-points (using intermediate output) down the line. So, let's start with a first check point implementation for 'read-bimbam-file'. ## read-bimbam-file -Reading the bimbam file happens in the `ReadFile_bim' function in -`gemma_io.cpp'. Of course all it does is read a file - which is the -same as any output. But just for the sake of a simple pilot we'll add -the check point at the end of the function that will exit GEMMA. -We'll add a CLI switch `-checkpoint read-geno-file' which will force the exit. +Reading the bimbam file happens in the `ReadFile_bim' function in `gemma_io.cpp'. Of course all it does is read a file - which is the same as any output. But just for the sake of a simple pilot we'll add the check point at the end of the function that will exit GEMMA. We'll add a CLI switch `-checkpoint read-geno-file' which will force the exit. ```C++ checkpoint("read-geno-file",file_geno); ``` -It passes in the outputfile (the infile in this case), that is used to -feed the calling propagator. Some of the outfiles may be composed of -multiple outputs - in that case we may add filenames. And exits with: - +It passes in the outputfile (the infile in this case), which is used to feed the calling propagator. Some of the outfiles may be composed of multiple outputs - in that case we may add filenames. And exits with: ``` **** Checkpoint reached: read-geno-file (normal exit) ``` @@ -238,46 +148,71 @@ It should be with a propagator infrastructure. The fun part is that both paths c ## Adding some propagator plumbing -I am highly resistant to writing C++ code. I probably have wrote most of lines of code in my life in C++, so I should know ;) -As we are still in the proof-of-concept stage I am going to use Ruby for the first plumbing (I do like that language, despite its warts) and maybe throw some scheme in for writing the new lmdb support. One fun side of propagators is that I don't need to worry too much about languages as people can trivially update any propagator in any way they like. We'll replace the Ruby propnet engine into something that can handle parallelism, e.g., when we want to start using actors to run computations. +I have become highly resistant to writing C++ code. I have probably written most of the lines of code in my life in C++, so I should know ;)
+As we are still in the proof-of-concept stage I may use Ruby for the first plumbing (I do like that language, despite its warts) and maybe throw some scheme in for writing the new lmdb support. One fun side of propagators is that I don't need to worry too much about languages as people can trivially update any propagator in any way they like. We'll replace the Ruby propnet engine with something that can handle parallelism, e.g., when we want to start using actors to run computations. Alright, to duplicate the above filtering routine we are going to add an alternative path for GEMMA's read-geno-file, splitting the two tasks into two steps: read-geno-file (with filtering) -> list-of-filtered-genotypes rb-read-geno-file -> rb-filter-geno-file -> '' -'read-geno-file' translates to GEMMA's read-geno-file and rb-read-geno-file is the alternate path. +'read-geno-file' translates to GEMMA's read-geno-file (maybe I should name it gemma-read-geno-file) and rb-read-geno-file is the alternate path.
list-of-filtered-genotypes will be a file that we can compare between the two paths. These steps should be so simple that anyone can replace them with, say, a Python version. So, what can go wrong? +# Adding our first converter + +We recently introduced an lmdb version of the genotype file. This new genotype format stores both normal and transposed forms of the genotype matrix, so they can be read directly from disk in an optimal layout. lmdb is memory mapped and can give great performance. # Other ## Example -I created a very minimalistic example in Ruby with a simple -round-robin scheduler: - -=> https://github.com/pjotrp/ruby-propagator-network/blob/main/propagator.rb +I created minimalistic examples in Ruby with 0MQ and a [simple round-robin scheduler](https://github.com/pjotrp/ruby-propagator-network/blob/main/). ## Running tasks in parallel -In principle propnets make it trivially easy to run tasks in parallel. -When inputs are complete the propagator goes into a -'compute' state and a process can be run (forked or on PBS) that -executes a command. The output file can be picked up to make sure the -propagator completes and switches to the 'done' state. Note that -actors and goblins would be perfect parallel companions to propagators -with the potential advantage of keeping results in RAM. - -We are not going there right now, but it is one of the important -reasons to build this setup. - -## Why guile and not ruby or python? - -Above example in Ruby is rather nice and we can build on that initially. -Ruby's multithreading capabilities (and that of Python), however, are -hampered by the layout of the interpreters and modules that can not be -re-entered. Even with the slow disappearance of the global interpreter -lock (GIL) these languages don't do great in that area. You can work -around it by using copy-on-write processes (like I did with the really -fast bioruby-vcf), but that gets clunky fast. +In principle propnets make it trivially easy to run tasks in parallel. When inputs are complete the propagator goes into a 'compute' state and a process can be run (forked or on PBS) that executes a command. The output file can be picked up to make sure the propagator completes and switches to the 'done' state. Note that actors and goblins would be perfect parallel companions to propagators with the potential advantage of keeping results in RAM. We are not going there right now, but it is one of the important reasons to build this setup. + +## Why guile and not Ruby or Python? + +The above example in Ruby is rather nice and we can build on that initially. Ruby's multithreading capabilities (and those of Python), however, are hampered by the layout of the interpreters and modules that cannot be re-entered. Even with the slow disappearance of the global interpreter lock (GIL) these languages don't do great in that area. You can work around it by using copy-on-write processes (like I did with the really fast bioruby-vcf), but that gets clunky fast. + +The main reason to choose Guile over Ruby (or Python) is the REPL. The debugging experience of the Lisp REPL is hard to beat. What we want is GEMMA as a resident 'server' that we can develop against. + +What we want to do in development/debugging is fire up GEMMA so it starts listening. Not exactly GEMMA as a server, but still responding to our requests. Next we want to go into a continuous cycle where we change some data and get a result. Then we want to be able to inspect the result and either go back or continue with the next computation.
This REPL-centric programming is not only addictive as a developer experience, but also highly efficient in terms of developer time. + +To make software listen interactively we need to be able to update it 'in place'. With Lisp code this is straightforward because partial code can get re-evaluated or recompiled on-the-fly. +With C++ code this is only possible if we make it listen to a socket and restart the socket server. And here +we are talking message passing! + +## Why Ruby and not guile? + +OK, I have a confession to make. Reading Lisp code does not come that naturally to me. I find writing Lisp easier than reading Lisp(!) Maybe it is because I have been reading Algol-type languages all my life, or maybe because Ruby is just a better fit to my brain. While Lisp is definitely the greater language, I just find it harder to parse. Even my own code! I met Matz once and he told me that Ruby was known as Matz's own Lisp. So, there we are. I need the Ruby syntax oddities as little helpers to make my brain disentangle code. + +## And again: why guile and not Ruby? + +Ruby's REPL is -- unfortunately -- not that useful. + +## The art of message passing + +Messaging to the rescue. If all code responds to messages we can create and destroy servers at will. No matter what language they are written in. That also ties in with propagator networks. A propagator cell is fired up and checks its inputs. Based on the inputs it does a (partial) computation. When the computation is completed it switches to a 'done' state. The done state signals downstream cells to start their computations using the cell values as inputs. + +In a REPL we can evaluate these cells. That is where Lisp comes in. But as these cells have a transparent messaging mechanism we should be able to use them in any language. That means we have a choice of languages to develop the full framework. + +## ZeroMQ (0MQ) + +ZeroMQ provides a minimalistic message passing interface in most languages. It will allow us to create cells and wire up a propagator network. + +## Sharing data in RAM + +There are several methods for sharing data between threads and processes. First we can use Linux copy-on-write for forked processes. The Linux kernel also has shared memory support and, more interestingly, memory mapping support --- which is also used in lmdb. Lmdb is multi-platform and works on non-POSIX systems. It is very fast, but may need tweaking, for example by disabling read-ahead with MDB_NORDAHEAD. We'll get to that. + +## Build system + +The build system for development relies on Guix. See the header of [guix.scm](../guix.scm). +Guix handles dependencies well, so it is a great base to work from. + +Build systems have to target different platforms. pangemma will mostly be run on Linux on AMD64, but there are always people who want more, and that requires support from the build system. Also we may need targets for debug and architecture-optimized builds. Some of that may be handled through Guix. + +For the actual build system we may roll our own. +All build tools I know are too messy, including CMake, meson and make itself. So, as a real hacker, we +can reinvent the wheel. One starting point is Arun's [g-exp blog](https://systemreboot.net/post/g-expressions-makings-of-a-make-killer), though it would require a non-Guix way of building things.
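To make the 'Create propagators' and 'The runner' sections above a bit more concrete, here is a minimal C++ sketch of the idea: cells hold a file name as state, a propagator has input cells, an output cell and an idle/computing/done state, and a round-robin runner fires any propagator whose inputs are complete. The names (`Cell`, `Propagator`, `run_network`) are purely illustrative and not part of the current sources; the real plumbing may well end up in Ruby or Guile as discussed above.

```C++
// Sketch only: cells hold a file name, propagators connect them, and a
// round-robin runner fires propagators whose inputs are complete.
#include <functional>
#include <iostream>
#include <optional>
#include <string>
#include <vector>

struct Cell {                        // state is just a file name for now
  std::string name;
  std::optional<std::string> value;  // empty until some propagator fills it
};

enum class State { Idle, Computing, Done };

struct Propagator {
  std::vector<Cell *> inputs;        // cells this propagator depends on
  Cell *output;                      // cell it fills in
  std::function<std::string()> compute;
  State state = State::Idle;

  bool ready() const {
    for (auto c : inputs)
      if (!c->value) return false;
    return true;
  }
};

// Visit the propagators until nothing changes any more.
void run_network(std::vector<Propagator> &net) {
  bool progress = true;
  while (progress) {
    progress = false;
    for (auto &p : net) {
      if (p.state == State::Idle && p.ready()) {
        p.state = State::Computing;
        p.output->value = p.compute();  // e.g. run gemma up to a check point
        p.state = State::Done;
        progress = true;
      }
    }
  }
}

int main() {
  Cell genotypes{"genotypes", "example/mouse_hs1940.geno.txt.gz"};
  Cell kinship{"kinship-matrix", std::nullopt};
  std::vector<Propagator> net;
  net.push_back({{&genotypes}, &kinship, [] {
                   // here we would shell out to 'gemma -gk' and return the output file
                   return std::string("output/mouse_hs1940.cXX.txt");
                 }});
  run_network(net);
  std::cout << kinship.name << " -> " << *kinship.value << std::endl;
}
```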
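The message passing idea needs very little code to try out. As a rough illustration only (not part of the current sources), a component could sit on a 0MQ REP socket and echo requests, in the spirit of the my-ping test binding in gemma_api.cpp; the endpoint below is made up.

```C++
// Toy 0MQ 'cell': a REP socket that echoes whatever request it receives.
// Illustration only; build against libzmq with -lzmq.
#include <zmq.h>
#include <cstdio>

int main() {
  void *ctx = zmq_ctx_new();
  void *sock = zmq_socket(ctx, ZMQ_REP);             // reply socket
  zmq_bind(sock, "tcp://127.0.0.1:5555");            // hypothetical endpoint

  char buf[256];
  int n = zmq_recv(sock, buf, sizeof(buf) - 1, 0);   // wait for one request
  if (n >= 0) {
    if (n > (int)sizeof(buf) - 1) n = sizeof(buf) - 1;  // message may be truncated
    buf[n] = '\0';
    printf("got message: %s\n", buf);
    zmq_send(sock, buf, n, 0);                       // echo it back ("ping" -> "ping")
  }

  zmq_close(sock);
  zmq_ctx_destroy(ctx);
  return 0;
}
```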
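Similarly, the 'Sharing data in RAM' notes above (lmdb, memory mapping, MDB_NORDAHEAD) translate to something like the following read-only sketch using the plain lmdb C API, which premake5.lua already links against. The database path and key are hypothetical and error handling is kept minimal.

```C++
// Sketch: open an lmdb environment read-only with read-ahead disabled and
// fetch one value. Path and key are made up for illustration.
#include <lmdb.h>
#include <cstdio>
#include <cstring>

int main() {
  MDB_env *env;
  mdb_env_create(&env);
  mdb_env_set_mapsize(env, 1UL << 30);               // 1 GiB map
  int rc = mdb_env_open(env, "output/mouse_hs1940.geno.mdb",
                        MDB_RDONLY | MDB_NORDAHEAD | MDB_NOSUBDIR, 0664);
  if (rc) { fprintf(stderr, "mdb_env_open: %s\n", mdb_strerror(rc)); return 1; }

  MDB_txn *txn;
  MDB_dbi dbi;
  mdb_txn_begin(env, nullptr, MDB_RDONLY, &txn);
  mdb_dbi_open(txn, nullptr, 0, &dbi);               // default (unnamed) database

  MDB_val key, data;
  key.mv_data = (void *)"rs3683945";                 // hypothetical SNP id as key
  key.mv_size = strlen("rs3683945");
  rc = mdb_get(txn, dbi, &key, &data);
  if (rc == 0)
    printf("value of %zu bytes mapped straight from disk\n", data.mv_size);
  else
    printf("lookup: %s\n", mdb_strerror(rc));

  mdb_txn_abort(txn);                                // read-only txn: just abort
  mdb_env_close(env);
  return 0;
}
```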
diff --git a/guix.scm b/guix.scm deleted file mode 100644 index c9c0410..0000000 --- a/guix.scm +++ /dev/null @@ -1,87 +0,0 @@ -;; To use this file to build HEAD of gemma: -;; -;; guix build -f guix.scm -;; -;; To get a development container (e.g., run in emacs shell). -;; -;; guix shell -C -D -f guix.scm -;; - -(use-modules - ((guix licenses) #:prefix license:) - (guix gexp) - (guix packages) - (guix git-download) - (guix build-system gnu) - (gnu packages algebra) - (gnu packages base) - (gnu packages compression) - ;; (gnu packages bioinformatics) - (gnu packages build-tools) - (gnu packages check) - ;; (gnu packages curl) - (gnu packages gdb) - (gnu packages guile) - ;; (gnu packages llvm) - (gnu packages maths) - ;; (gnu packages ninja) - ;; (gnu packages parallel) - (gnu packages perl) - ;; (gnu packages perl6) - ;; (gnu packages ruby) - (gnu packages pkg-config) - ;; (pjotr packages openblas) ;; we use this for the static builds - ;; (gnu packages shell) ;; for shunit2 - (srfi srfi-1) - (ice-9 popen) - (ice-9 rdelim)) - -(define %source-dir (dirname (current-filename))) - -(define %git-commit - (read-string (open-pipe "git show HEAD | head -1 | cut -d ' ' -f 2" OPEN_READ))) - -(define %gemma-version - (read-string (open-pipe "cat VERSION" OPEN_READ))) - -(define-public gemma-git - (package - (name "gemma-git") - (version (git-version "0.98.5" "HEAD" %git-commit)) - (source (local-file %source-dir #:recursive? #t)) - (build-system gnu-build-system) - (inputs - (list catch2 - gdb - gsl - openblas - zlib)) - ;; ("gsl-static" ,gsl-static) - ;; ("zlib:static" ,zlib "static") - (propagated-inputs - (list guile-3.0)) - (native-inputs ; for running tests - (list perl which)) - (arguments - `(#:phases - (modify-phases %standard-phases - (delete 'configure) - (delete 'validate-runpath) - (add-before 'build 'bin-mkdir - (lambda _ - (mkdir-p "bin") - )) - (replace 'install - (lambda* (#:key outputs #:allow-other-keys) - (let ((out (assoc-ref outputs "out"))) - (install-file "bin/gemma" (string-append out "/bin")))))) - #:tests? #t - #:parallel-tests? #f)) - (home-page "https://github.com/genetics-statistics") - (synopsis "Tool for genome-wide efficient mixed model association") - (description "Genome-wide Efficient Mixed Model Association (GEMMA) -provides a standard linear mixed model resolver with application in -genome-wide association studies (GWAS).") - (license license:gpl3))) - -gemma-git diff --git a/guix/guix.scm b/guix/guix.scm new file mode 100644 index 0000000..d8d593b --- /dev/null +++ b/guix/guix.scm @@ -0,0 +1,256 @@ +;; To use this file to build HEAD of gemma: +;; +;; guix build -f guix/guix.scm # default builds pangemma-git +;; +;; To get a development container (e.g., run in emacs shell). +;; +;; guix shell -C -D -F -f guix/guix.scm # pangemma-shell-git +;; +;; see premake5.lua for build/test instructions +;; +;; optimized for arch: +;; +;; guix shell --tune=native -C -D -F # pangemma-shell-git +;; +;; see premake5.lua header for examples. +;; +;; To optimize use guix --tune=march-type (e.g. 
--tune=native) + +(define-module (guix) + #:use-module ((guix licenses) #:prefix license:) + #:use-module (guix gexp) + #:use-module (guix packages) + #:use-module (guix git-download) + #:use-module (guix build-system gnu) + #:use-module (guix utils) + + #:use-module (gnu packages algebra) + #:use-module (gnu packages base) + #:use-module (gnu packages build-tools) + #:use-module (gnu packages compression) + #:use-module (gnu packages commencement) + #:use-module (gnu packages check) + #:use-module (gnu packages cpp) + #:use-module (gnu packages databases) + #:use-module (gnu packages gcc) + #:use-module (gnu packages gdb) + #:use-module (gnu packages guile) + #:use-module (gnu packages guile-xyz) + #:use-module (gnu packages maths) + #:use-module (gnu packages ninja) + #:use-module (gnu packages perl) + #:use-module (gnu packages pkg-config) + #:use-module (gnu packages ruby) + #:use-module (gnu packages time) + #:use-module (srfi srfi-1) + #:use-module (ice-9 popen) + #:use-module (ice-9 rdelim)) + +(define %source-dir (dirname (current-filename))) + +(define %git-commit + (read-string (open-pipe "git describe --always --tags --long|tr -d $'\n'" OPEN_READ))) + +(define %pangemma-version + (read-string (open-pipe "cat VERSION|tr -d $'\n'" OPEN_READ))) + +(define-public openblas-pangemma +;; we are fixating on an older openblas, for now + (package + (name "openblas-pangemma") + (version "0.3.21") + (source + (origin + (method git-fetch) + (uri (git-reference + (url "https://github.com/xianyi/OpenBLAS") + (commit (string-append "v" version)))) + (file-name (git-file-name name version)) + (sha256 + (base32 + "0yx1axiki12y0xz0d5s76vvl7ds36k0npv1sww08k2qslhz1g9qp")))) + (build-system gnu-build-system) + (properties `((tunable? . #t))) + (arguments + (list + #:tests? #f ;; skip tests + #:test-target "test" + ;; No default baseline is supplied for powerpc-linux. + #:substitutable? (not (target-ppc32?)) + #:make-flags + #~(list (string-append "PREFIX=" #$output) + (string-append "CFLAGS=-O3 -g -Wno-incompatible-pointer-types -Wno-error=implicit-function-declaration") + "COPT=" + "COMMON_OPT=" + "DYNAMIC_ARCH=" + "SHELL=bash" + "MAKE_NB_JOBS=0" ;use jobserver for submakes + + ;; This is the maximum number of threads OpenBLAS will ever use (that + ;; is, if $OPENBLAS_NUM_THREADS is greater than that, then NUM_THREADS + ;; is used.) If we don't set it, the makefile sets it to the number + ;; of cores of the build machine, which is obviously wrong. + "NUM_THREADS=128" + + ;; DYNAMIC_ARCH is only supported on some architectures. + ;; DYNAMIC_ARCH combined with TARGET=GENERIC provides a library + ;; which uses the optimizations for the detected CPU. This can + ;; be overridden at runtime with the environment variable + ;; OPENBLAS_CORETYPE=<type>, where "type" is a supported CPU + ;; type. On other architectures we target only the baseline CPU + ;; supported by Guix. + #$@(cond + ((or (target-x86-64?) + (target-x86-32?) + (target-ppc64le?) + (target-aarch64?)) + ;; Dynamic older enables a few extra CPU architectures + ;; on x86_64 that were released before 2010. + '("DYNAMIC_ARCH=1" "TARGET=GENERIC")) + ;; '("DYNAMIC_ARCH=" "TARGET_CORE=ZEN")) + ;; On some of these architectures the CPU type can't be detected. + ;; We list the oldest CPU core we want to have support for. + ;; On MIPS we force the "SICORTEX" TARGET, as for the other + ;; two available MIPS targets special extended instructions + ;; for Loongson cores are used. + ((target-mips64el?) + '("TARGET=SICORTEX")) + ((target-arm32?) 
+ '("TARGET=ARMV7")) + ((target-riscv64?) + '("TARGET=RISCV64_GENERIC")) + (else '()))) + ;; no configure script + #:phases + #~(modify-phases %standard-phases + (delete 'configure) + (add-before 'build 'set-extralib + (lambda* (#:key inputs #:allow-other-keys) + ;; Get libgfortran found when building in utest. + (setenv "FEXTRALIB" + (string-append + "-L" + (dirname + (search-input-file inputs "/lib/libgfortran.so"))))))))) + (inputs + (list `(,gfortran "lib"))) + (native-inputs + (list cunit gfortran perl)) + (home-page "https://www.openblas.net/") + (synopsis "Optimized BLAS library based on GotoBLAS") + (description + "OpenBLAS is a BLAS library forked from the GotoBLAS2-1.13 BSD version.") + (license license:bsd-3))) + +(define-public pangemma-base-git + "Pangemma base build package" + (package + (name "pangemma-git") + (version (git-version %pangemma-version "HEAD" %git-commit)) + (source (local-file %source-dir #:recursive? #t)) + (build-system gnu-build-system) + (inputs + (list gsl + openblas-pangemma + guile-3.0 + `(,guile-3.0 "debug") + ;; `(,guile-3.0 "dev") + guile-lmdb + lmdb + lmdbxx + pkg-config + ;; ninja + ;; ruby + time + zlib)) + ;; (propagated-inputs + ;; (list + ;; `("guile" ,guile-3.0-latest) + ;; `("guile-debug" ,guile-3.0-latest "debug") + ;; `("guile" ,guile-3.0-latest "dev"))) + + ;; ("gsl-static" ,gsl-static) + ;; ("zlib:static" ,zlib "static") + (arguments + `(#:phases + (modify-phases %standard-phases + (delete 'configure) + (delete 'validate-runpath) + (add-before 'build 'bin-mkdir + (lambda _ + (mkdir-p "bin") + )) + (replace 'install + (lambda* (#:key outputs #:allow-other-keys) + (let ((out (assoc-ref outputs "out"))) + (install-file "bin/gemma" (string-append out "/bin")))))) + #:tests? #t + #:parallel-tests? #f)) + (home-page "https://git.genenetwork.org/pangemma/") + (synopsis "Tool for genome-wide efficient mixed model association") + (description "New version of Genome-wide Efficient Mixed Model Association (PANGEMMA) +provides a standard linear mixed model resolver with application in +genome-wide association studies (GWAS).") + (license license:gpl3))) + +(define-public pangemma-shell-git + "Shell version for development" + (package + (inherit pangemma-base-git) + (name "pangemma-shell-git") + (build-system gnu-build-system) + (propagated-inputs + (modify-inputs (package-inputs pangemma-base-git) + (append which binutils coreutils gcc-toolchain premake5 gnu-make gdb gperftools ;; for the shell + ))) + (arguments + `(#:phases (modify-phases %standard-phases + (delete 'configure) + (delete 'build) + (delete 'package) + (delete 'check) + (delete 'install)))) + (description "Pangemma shell for development") + )) + +;; ---- legacy build ----------------------------------------------------------------- +(define-public gemma-git + "Original legacy gemma -- for as long as it compiles" + (package + (name "gemma-git") + (version (git-version %pangemma-version "HEAD" %git-commit)) + (source (local-file %source-dir #:recursive? 
#t)) + (build-system gnu-build-system) + (inputs + (list catch2 + gdb + gsl + openblas-pangemma + zlib)) + ;; ("gsl-static" ,gsl-static) + ;; ("zlib:static" ,zlib "static") + (native-inputs ; for running tests + (list perl which)) + (arguments + `(#:phases + (modify-phases %standard-phases + (delete 'configure) + (delete 'validate-runpath) + (add-before 'build 'bin-mkdir + (lambda _ + (mkdir-p "bin") + )) + (replace 'install + (lambda* (#:key outputs #:allow-other-keys) + (let ((out (assoc-ref outputs "out"))) + (install-file "bin/gemma" (string-append out "/bin")))))) + #:tests? #t + #:parallel-tests? #f)) + (home-page "https://github.com/genetics-statistics") + (synopsis "Tool for genome-wide efficient mixed model association") + (description "Genome-wide Efficient Mixed Model Association (GEMMA) +provides a standard linear mixed model resolver with application in +genome-wide association studies (GWAS).") + (license license:gpl3))) + +pangemma-shell-git diff --git a/premake5.lua b/premake5.lua new file mode 100644 index 0000000..9229800 --- /dev/null +++ b/premake5.lua @@ -0,0 +1,78 @@ +-- Build with +-- +-- make clean && rm build -rf +-- premake5 gmake && make verbose=1 gemmalib -j 8 +-- +-- Including bin +-- +-- premake5 gmake && make verbose=1 config=debug gemma -j 8 && time LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib ./test/runner +-- +-- Or +-- +-- premake5 gmake && make verbose=1 config=release -j 8 gemma && LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib ./build/bin/Release/gemma +-- +-- Run +-- +-- LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib ./build/bin/Debug/gemma +-- +-- Next we start using the API from guile with +-- +-- env LD_LIBRARY_PATH=./build/bin/Debug/:$GUIX_ENVIRONMENT/lib guile +-- (load-extension "libgemmalib" "init_module") + +local pkg_cpp_flags = os.outputof("pkg-config --cflags openblas guile-3.0 gsl zlib") +local pkg_linker_flags = os.outputof("pkg-config --libs openblas guile-3.0 gsl zlib") + +workspace "PanGemma" + configurations { "Debug", "Release" } + + project "gemmalib" -- library for interactive development + kind "SharedLib" + defines { "OPENBLAS" } + language "C++" + objdir "build/" + targetdir "build/bin/%{cfg.buildcfg}" + + files { "src/*.h src/*.c src/**.hpp", "src/**.cpp" } + removefiles { "src/main.cpp" } + includedirs { "src/" } + + filter "configurations:Debug" + defines { "DEBUG" } + buildoptions { pkg_cpp_flags } + linkoptions { pkg_linker_flags } + symbols "On" + + filter "configurations:Release" + defines { "NDEBUG", "HAVE_INLINE" } + buildoptions { pkg_cpp_flags } + linkoptions { pkg_linker_flags } + buildoptions { "-pthread", "-Wall" } + optimize "Speed" + +project "gemma" + kind "ConsoleApp" + defines { "OPENBLAS" } + language "C++" + buildoptions { "-Wfatal-errors" } + objdir "build/" + targetdir "build/bin/%{cfg.buildcfg}" + + files { "src/*.h src/*.c src/**.hpp", "src/**.cpp" } + removefiles { "src/gemma_api.cpp" } + includedirs { "src/" } + links { "lmdb" } + + filter "configurations:Debug" + defines { "DEBUG" } + buildoptions { pkg_cpp_flags } + linkoptions { pkg_linker_flags } + links { "profiler" } + symbols "On" + + filter "configurations:Release" + defines { "NDEBUG", "HAVE_INLINE" } + buildoptions { "-pthread", "-Wall" } + buildoptions { pkg_cpp_flags } + linkoptions { pkg_linker_flags } + optimize "Speed" diff --git a/scripts/gen_version_info.cmd b/scripts/gen_version_info.cmd deleted file mode 100644 index d824687..0000000 --- a/scripts/gen_version_info.cmd +++ /dev/null @@ -1,16 +0,0 @@ -@echo off -rem 
https://stackoverflow.com/questions/3472631/how-do-i-get-the-day-month-and-year-from-a-windows-cmd-exe-script -FOR /F "skip=1 tokens=1-6" %%A IN ('WMIC Path Win32_LocalTime Get Day^,Hour^,Minute^,Month^,Second^,Year /Format:table') DO ( - if "%%B" NEQ "" ( - SET /A FDATE=%%F*10000+%%D*100+%%A - ) -) -set year=%FDATE:~0,4% -set /p version=<VERSION - -echo // version.h generated by GEMMA -rem https://stackoverflow.com/questions/7105433/windows-batch-echo-without-new-line -echo|set /p="#define GEMMA_VERSION "" -echo %version%" -echo #define GEMMA_DATE "%FDATE:~0,8%" -echo #define GEMMA_YEAR "%year%" diff --git a/scripts/gen_version_info.sh b/scripts/gen_version_info.sh deleted file mode 100755 index 9be81c6..0000000 --- a/scripts/gen_version_info.sh +++ /dev/null @@ -1,14 +0,0 @@ -#! /bin/bash -# -# Script to generate the version info of GEMMA and its environment -# in ./src/version.h - -DATE=$(date "+%Y-%m-%d") -YEAR=$(date "+%Y") -PROFILE=$1 - -echo // version.h generated by GEMMA $0 -echo \#define GEMMA_VERSION \"$(cat ./VERSION)\" -echo \#define GEMMA_DATE \"$DATE\" -echo \#define GEMMA_YEAR \"$YEAR\" -echo \#define GEMMA_PROFILE \"$PROFILE\" diff --git a/src/checkpoint.h b/src/checkpoint.h index c2457d9..05fd836 100644 --- a/src/checkpoint.h +++ b/src/checkpoint.h @@ -1,7 +1,7 @@ /* Checkpoints for pangemma propagators - Copyright © 2015, Pjotr Prins + Copyright © 2025, Pjotr Prins This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -26,6 +26,14 @@ using namespace std; extern string checkpoint_name; +/* + Checkpoint functions display a message in debug mode when + reached. If the (unique) msg (tag) is equal to the global + checkpoint_name the program will stop. The idea is that we can + stop computation at any checkpoint. Later we can write the code to + restart at a checkpoint. 
+ */ + void checkpoint_run(string msg, string filename, string srcfilename, int line, string funcname); #define checkpoint(msg, fname) \ diff --git a/src/debug.cpp b/src/debug.cpp index 0aa4bfc..6cefcc7 100644 --- a/src/debug.cpp +++ b/src/debug.cpp @@ -139,6 +139,7 @@ inline int fedisableexcept(unsigned int excepts) #endif +/* void enable_segfpe() { if (!is_fpe_check_mode() || is_legacy_mode()) return; #ifdef __GNUC__ @@ -159,6 +160,9 @@ void disable_segfpe() { #endif #endif } +*/ + +#ifndef NDEBUG void write(const char *s, const char *msg) { if (!is_debug_data_mode() && !is_debug_dump_mode()) return; @@ -230,6 +234,8 @@ void write(const gsl_matrix *m, const char *msg) { cout << "}" << endl; } +#endif // NDEBUG + /* Helper function to make sure gsl allocations do their job because gsl_matrix_alloc does not initiatize values (behaviour that changed diff --git a/src/debug.h b/src/debug.h index 0489a81..a32bfd2 100644 --- a/src/debug.h +++ b/src/debug.h @@ -60,11 +60,22 @@ void disable_segfpe(); { auto x = m * n; \ enforce_msg(x / m == n, "multiply integer overflow"); } +#ifndef NDEBUG + void write(const double d, const char *msg = ""); void write(const char *s, const char *msg = ""); void write(const gsl_vector *v, const char *msg = ""); void write(const gsl_matrix *m, const char *msg = ""); +#else // NDEBUG + +inline void write(const double d, const char *msg = "") {}; +inline void write(const char *s, const char *msg = "") {}; +inline void write(const gsl_vector *v, const char *msg = "") {}; +inline void write(const gsl_matrix *m, const char *msg = "") {}; + +#endif // NDEBUG + gsl_matrix *gsl_matrix_safe_alloc(size_t rows,size_t cols); int gsl_matrix_safe_memcpy (gsl_matrix *dest, const gsl_matrix *src); void gsl_matrix_safe_free (gsl_matrix *v); diff --git a/src/gemma.cpp b/src/gemma.cpp index 6c2b3f4..a39d67a 100644 --- a/src/gemma.cpp +++ b/src/gemma.cpp @@ -62,6 +62,7 @@ extern "C" { #include "vc.h" #include "debug.h" #include "version.h" +#include <guile_api.h> using namespace std; @@ -81,10 +82,12 @@ void gemma_gsl_error_handler (const char * reason, #include <openblas_config.h> #endif +using namespace gemmaguile; + void GEMMA::PrintHeader(void) { cout << - "GEMMA forked executable --- part of PanGEMMA " << version << " (" << date << ") by Xiang Zhou, Pjotr Prins and team (C) 2012-" << year << endl; + "Pangemma " << version << " --- GEMMA 0.98.5 compatible executable (" << date << ") with" << OPENBLAS_VERSION << "and guile " << global_guile_version() << endl << "by Xiang Zhou, Pjotr Prins and team (C) 2012-" << year << endl ; return; } @@ -1653,7 +1656,7 @@ void GEMMA::BatchRun(PARAM &cPar) { clock_t time_begin, time_start; time_begin = clock(); - if (is_check_mode()) enable_segfpe(); // fast NaN checking by default + // if (is_check_mode()) enable_segfpe(); // fast NaN checking by default // Read Files. cout << "Reading Files ... " << endl; @@ -1908,32 +1911,52 @@ void GEMMA::BatchRun(PARAM &cPar) { } // Generate Kinship matrix (optionally using LOCO) - if (cPar.a_mode == M_KIN || cPar.a_mode == M_KIN2) { + // if (cPar.a_mode == M_KIN || cPar.a_mode == M_KIN2) { + if (is_kinship_compute(cPar.a_mode)) { cout << "Calculating Relatedness Matrix ... 
" << endl; - gsl_matrix *G = gsl_matrix_safe_alloc(cPar.ni_total, cPar.ni_total); - enforce_msg(G, "allocate G"); // just to be sure + if (cPar.is_mdb) { + time_start = clock(); + auto K = cPar.MdbCalcKin(); + cPar.time_G = (clock() - time_start) / (double(CLOCKS_PER_SEC) * 60.0); + if (cPar.a_mode == M_KIN) { + cPar.WriteMatrix(K, "cXX"); + } else { // M_KIN2 + cPar.WriteMatrix(K, "sXX"); + } + + gsl_matrix_safe_free(K); + return; + } + + enforce(cPar.ni_total > 0); + gsl_matrix *K = gsl_matrix_safe_alloc(cPar.ni_total, cPar.ni_total); + enforce_msg(K, "allocate G"); // just to be sure time_start = clock(); - cPar.CalcKin(G); + if (cPar.error == true) { + cout << "error! failed to prepare for calculating relatedness matrix. " << endl; + return; + } + cPar.CalcKin(K); cPar.time_G = (clock() - time_start) / (double(CLOCKS_PER_SEC) * 60.0); if (cPar.error == true) { - cout << "error! fail to calculate relatedness matrix. " << endl; + cout << "error! failed to calculate relatedness matrix. " << endl; return; } // Now we have the Kinship matrix test it - validate_K(G); + validate_K(K); if (cPar.a_mode == M_KIN) { - cPar.WriteMatrix(G, "cXX"); + cPar.WriteMatrix(K, "cXX"); } else { // M_KIN2 - cPar.WriteMatrix(G, "sXX"); + cPar.WriteMatrix(K, "sXX"); } - gsl_matrix_safe_free(G); + gsl_matrix_safe_free(K); } // Compute the LDSC weights (not implemented yet) @@ -1943,7 +1966,7 @@ void GEMMA::BatchRun(PARAM &cPar) { VARCOV cVarcov; cVarcov.CopyFromParam(cPar); - if (is_check_mode()) disable_segfpe(); // disable fast NaN checking for now + // if (is_check_mode()) disable_segfpe(); // disable fast NaN checking for now if (!cPar.file_bfile.empty()) { cVarcov.AnalyzePlink(); @@ -2058,7 +2081,7 @@ void GEMMA::BatchRun(PARAM &cPar) { VARCOV cVarcov; cVarcov.CopyFromParam(cPar); - if (is_check_mode()) disable_segfpe(); // fast NaN checking for now + // if (is_check_mode()) disable_segfpe(); // fast NaN checking for now if (!cPar.file_bfile.empty()) { cVarcov.AnalyzePlink(); @@ -2569,19 +2592,23 @@ void GEMMA::BatchRun(PARAM &cPar) { cPar.a_mode == M_LMM4 || cPar.a_mode == M_LMM5 || cPar.a_mode == M_LMM9 || cPar.a_mode == M_EIGEN) { // Fit LMM or mvLMM or eigen write(cPar.a_mode, "Mode"); - gsl_matrix *Y = gsl_matrix_safe_alloc(cPar.ni_test, cPar.n_ph); + auto n_ph = cPar.n_ph; // # of trait values + auto ni_test = cPar.ni_test; // # of individuals to test + enforce(n_ph > 0); + enforce(ni_test > 0); + gsl_matrix *Y = gsl_matrix_safe_alloc(ni_test, n_ph); // Y is ind x phenotypes enforce_msg(Y, "allocate Y"); // just to be sure - gsl_matrix *W = gsl_matrix_safe_alloc(Y->size1, cPar.n_cvt); - gsl_matrix *B = gsl_matrix_safe_alloc(Y->size2, W->size2); // B is a d by c - // matrix - gsl_matrix *se_B = gsl_matrix_safe_alloc(Y->size2, W->size2); - gsl_matrix *G = gsl_matrix_safe_alloc(Y->size1, Y->size1); - gsl_matrix *U = gsl_matrix_safe_alloc(Y->size1, Y->size1); - gsl_matrix *UtW = gsl_matrix_calloc(Y->size1, W->size2); - gsl_matrix *UtY = gsl_matrix_calloc(Y->size1, Y->size2); - gsl_vector *eval = gsl_vector_calloc(Y->size1); - gsl_vector *env = gsl_vector_safe_alloc(Y->size1); - gsl_vector *weight = gsl_vector_safe_alloc(Y->size1); + gsl_matrix *W = gsl_matrix_safe_alloc(ni_test, cPar.n_cvt); // W is ind x covariates + gsl_matrix *B = gsl_matrix_safe_alloc(n_ph, W->size2); // B is a d by c (effect size) + gsl_matrix *se_B = gsl_matrix_safe_alloc(n_ph, W->size2); // se + enforce(ni_test > 0); + gsl_matrix *K = gsl_matrix_safe_alloc(ni_test, ni_test); // Kinship aka as G/GRM in the equation ind x ind + 
gsl_matrix *U = gsl_matrix_safe_alloc(ni_test, ni_test); + gsl_matrix *UtW = gsl_matrix_calloc(ni_test, W->size2); // Transformed ind x covariates + gsl_matrix *UtY = gsl_matrix_calloc(ni_test, n_ph); // Transformed ind x phenotypes + gsl_vector *eval = gsl_vector_calloc(ni_test); // eigen values + gsl_vector *env = gsl_vector_safe_alloc(ni_test); // E environment + gsl_vector *weight = gsl_vector_safe_alloc(ni_test); // weights debug_msg("Started on LMM"); assert_issue(is_issue(26), UtY->data[0] == 0.0); @@ -2592,38 +2619,38 @@ void GEMMA::BatchRun(PARAM &cPar) { cPar.CopyGxe(env); } - // read relatedness matrix G + // read relatedness matrix K if (!(cPar.file_kin).empty()) { ReadFile_kin(cPar.file_kin, cPar.indicator_idv, cPar.mapID2num, - cPar.k_mode, cPar.error, G); + cPar.k_mode, cPar.error, K); debug_msg("Read K/GRM file"); if (cPar.error == true) { cout << "error! fail to read kinship/relatedness file. " << endl; return; } - // center matrix G - CenterMatrix(G); - validate_K(G); - write(G, "G"); + // center matrix K + CenterMatrix(K); + validate_K(K); + write(K, "K"); // is residual weights are provided, then if (!cPar.file_weight.empty()) { cPar.CopyWeight(weight); double d, wi, wj; - for (size_t i = 0; i < G->size1; i++) { + for (size_t i = 0; i < K->size1; i++) { wi = gsl_vector_get(weight, i); - for (size_t j = i; j < G->size2; j++) { + for (size_t j = i; j < K->size2; j++) { wj = gsl_vector_get(weight, j); - d = gsl_matrix_get(G, i, j); + d = gsl_matrix_get(K, i, j); if (wi <= 0 || wj <= 0) { d = 0; } else { d /= safe_sqrt(wi * wj); } - gsl_matrix_set(G, i, j, d); + gsl_matrix_set(K, i, j, d); if (j != i) { - gsl_matrix_set(G, j, i, d); + gsl_matrix_set(K, j, i, d); } } } @@ -2634,9 +2661,9 @@ void GEMMA::BatchRun(PARAM &cPar) { time_start = clock(); if (cPar.a_mode == M_EIGEN) { - cPar.trace_G = EigenDecomp_Zeroed(G, U, eval, 0); + cPar.trace_G = EigenDecomp_Zeroed(K, U, eval, 0); } else { - cPar.trace_G = EigenDecomp_Zeroed(G, U, eval, 0); + cPar.trace_G = EigenDecomp_Zeroed(K, U, eval, 0); } // write(eval,"eval"); @@ -2693,8 +2720,8 @@ void GEMMA::BatchRun(PARAM &cPar) { LMM cLmm; cLmm.CopyFromParam(cPar); - gsl_vector_view Y_col = gsl_matrix_column(Y, 0); - gsl_vector_view UtY_col = gsl_matrix_column(UtY, 0); + gsl_vector_view Y_col = gsl_matrix_column(Y, 0); // get a single phenotype column + gsl_vector_view UtY_col = gsl_matrix_column(UtY, 0); // and transformed column cLmm.AnalyzeGene(U, eval, UtW, &UtY_col.vector, W, &Y_col.vector); // y is the predictor, not the phenotype @@ -2811,34 +2838,40 @@ void GEMMA::BatchRun(PARAM &cPar) { // if (is_check_mode()) disable_segfpe(); // disable fast NaN checking for now - gsl_vector_view Y_col = gsl_matrix_column(Y, 0); - gsl_vector_view UtY_col = gsl_matrix_column(UtY, 0); + gsl_vector_view Y_col = gsl_matrix_column(Y, 0); // get pheno column + gsl_vector_view UtY_col = gsl_matrix_column(UtY, 0); // and its transformed write(&Y_col.vector, "Y_col"); - if (!cPar.file_bfile.empty()) { - // PLINK analysis - if (cPar.file_gxe.empty()) { - cLmm.AnalyzePlink(U, eval, UtW, &UtY_col.vector, W, - &Y_col.vector, cPar.setGWASnps); - } - else { - cLmm.AnalyzePlinkGXE(U, eval, UtW, &UtY_col.vector, W, - &Y_col.vector, env); - } + if (cPar.is_mdb) { + cLmm.mdb_calc_gwa(U, eval, UtW, &UtY_col.vector, W, &Y_col.vector, cPar.loco); + cLmm.CopyToParam(cPar); } else { - // BIMBAM analysis - - if (cPar.file_gxe.empty()) { - cLmm.AnalyzeBimbam(U, eval, UtW, &UtY_col.vector, W, - &Y_col.vector, cPar.setGWASnps); - } else { - cLmm.AnalyzeBimbamGXE(U, 
eval, UtW, &UtY_col.vector, W, - &Y_col.vector, env); + if (!cPar.file_bfile.empty()) { + // PLINK analysis + if (cPar.file_gxe.empty()) { + cLmm.AnalyzePlink(U, eval, UtW, &UtY_col.vector, W, + &Y_col.vector, cPar.setGWASnps); + } + else { + cLmm.AnalyzePlinkGXE(U, eval, UtW, &UtY_col.vector, W, + &Y_col.vector, env); + } + } + else { + // BIMBAM analysis + + if (cPar.file_gxe.empty()) { + cLmm.AnalyzeBimbam(U, eval, UtW, &UtY_col.vector, W, + &Y_col.vector, cPar.setGWASnps); + } else { + cLmm.AnalyzeBimbamGXE(U, eval, UtW, &UtY_col.vector, W, + &Y_col.vector, env); + } } + cLmm.WriteFiles(); // write output files + cLmm.CopyToParam(cPar); } - cLmm.WriteFiles(); - cLmm.CopyToParam(cPar); } else { debug_msg("fit mvLMM (multiple phenotypes)"); MVLMM cMvlmm; @@ -2873,7 +2906,7 @@ void GEMMA::BatchRun(PARAM &cPar) { gsl_matrix_safe_free(W); gsl_matrix_warn_free(B); // sometimes unused gsl_matrix_warn_free(se_B); - gsl_matrix_warn_free(G); + gsl_matrix_warn_free(K); gsl_matrix_safe_free(U); gsl_matrix_safe_free(UtW); gsl_matrix_safe_free(UtY); @@ -2898,7 +2931,7 @@ void GEMMA::BatchRun(PARAM &cPar) { // run bvsr if rho==1 if (cPar.rho_min == 1 && cPar.rho_max == 1) { // read genotypes X (not UtX) - cPar.ReadGenotypes(UtX, G, false); + cPar.ReadBIMBAMGenotypes(UtX, G, false); // perform BSLMM analysis BSLMM cBslmm; @@ -2916,7 +2949,7 @@ void GEMMA::BatchRun(PARAM &cPar) { // read relatedness matrix G if (!(cPar.file_kin).empty()) { - cPar.ReadGenotypes(UtX, G, false); + cPar.ReadBIMBAMGenotypes(UtX, G, false); // read relatedness matrix G ReadFile_kin(cPar.file_kin, cPar.indicator_idv, cPar.mapID2num, @@ -2930,7 +2963,7 @@ void GEMMA::BatchRun(PARAM &cPar) { CenterMatrix(G); validate_K(G); } else { - cPar.ReadGenotypes(UtX, G, true); + cPar.ReadBIMBAMGenotypes(UtX, G, true); } // eigen-decomposition and calculate trace_G @@ -3008,7 +3041,7 @@ void GEMMA::BatchRun(PARAM &cPar) { // run bvsr if rho==1 if (cPar.rho_min == 1 && cPar.rho_max == 1) { // read genotypes X (not UtX) - cPar.ReadGenotypes(UtX, G, false); + cPar.ReadBIMBAMGenotypes(UtX, G, false); // perform BSLMM analysis BSLMM cBslmm; @@ -3027,7 +3060,7 @@ void GEMMA::BatchRun(PARAM &cPar) { // read relatedness matrix G if (!(cPar.file_kin).empty()) { - cPar.ReadGenotypes(UtX, G, false); + cPar.ReadBIMBAMGenotypes(UtX, G, false); // read relatedness matrix G ReadFile_kin(cPar.file_kin, cPar.indicator_idv, cPar.mapID2num, @@ -3042,7 +3075,7 @@ void GEMMA::BatchRun(PARAM &cPar) { validate_K(G); } else { - cPar.ReadGenotypes(UtX, G, true); + cPar.ReadBIMBAMGenotypes(UtX, G, true); } // eigen-decomposition and calculate trace_G @@ -3168,8 +3201,8 @@ void GEMMA::WriteLog(int argc, char **argv, PARAM &cPar) { } outfile << "##" << endl; - outfile << "## GEMMA Version = " << version << " (" << date << ")" << endl; - outfile << "## Build profile = " << GEMMA_PROFILE << endl ; + outfile << "## PANGEMMA Version = " << version << " (" << date << ")" << endl; + // outfile << "## Build profile = " << GEMMA_PROFILE << endl ; #ifdef __GNUC__ outfile << "## GCC version = " << __GNUC__ << "." << __GNUC_MINOR__ << "." 
<< __GNUC_PATCHLEVEL__ << endl; #endif diff --git a/src/gemma.h b/src/gemma.h index b34a958..fa7c24d 100644 --- a/src/gemma.h +++ b/src/gemma.h @@ -64,6 +64,7 @@ public: void Assign(int argc, char **argv, PARAM &cPar); void BatchRun(PARAM &cPar); void WriteLog(int argc, char **argv, PARAM &cPar); + inline bool is_kinship_compute(uint a_mode) { return (a_mode == M_KIN || a_mode == M_KIN2); } }; #endif diff --git a/src/gemma_api.cpp b/src/gemma_api.cpp new file mode 100644 index 0000000..77c1d56 --- /dev/null +++ b/src/gemma_api.cpp @@ -0,0 +1,33 @@ +// Testing bindings, see README.md and +// https://www.gnu.org/savannah-checkouts/gnu/guile/docs/docs-2.0/guile-ref/Dynamic-Types.html + +#include <stdio.h> +#include <libguile.h> +// #include <libguile/boolean.h> +// #include <libguile/numbers.h> + +// extern SCM my_incrementing_function (SCM a, SCM flag); + +// extern "C" SCM my_ping(SCM i); + +static SCM my_ping(SCM i) { + return i; +} + + +SCM my_incrementing_function (SCM a, SCM flag) +{ + SCM result; + + if (scm_is_true (flag)) + result = scm_sum (a, scm_from_int (1)); + else + result = a; + + return result; +} + + +extern "C" void init_module() { + scm_c_define_gsubr("my-ping", 1, 0, 0, (scm_t_subr)my_ping); +} diff --git a/src/gemma_io.cpp b/src/gemma_io.cpp index f5a79a2..e630f64 100644 --- a/src/gemma_io.cpp +++ b/src/gemma_io.cpp @@ -2,7 +2,7 @@ Genome-wide Efficient Mixed Model Association (GEMMA) Copyright © 2011-2017, Xiang Zhou Copyright © 2017, Peter Carbonetto - Copyright © 2017-2020, Pjotr Prins + Copyright © 2017-2025, Pjotr Prins This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -41,6 +41,7 @@ #include "gsl/gsl_linalg.h" #include "gsl/gsl_matrix.h" #include "gsl/gsl_vector.h" +#include <lmdb++.h> #include "checkpoint.h" #include "debug.h" @@ -61,7 +62,7 @@ void ProgressBar(string str, double p, double total, double ratio) { const double progress = (100.0 * p / total); const uint barsize = (int)(progress / 2.0); // characters // cout << barsize << endl; - // cout << str << " "; + cout << str << " "; // cout << p << "/" << total << endl; assert(barsize < 101); // corrupted data somehow if (barsize > 0) { @@ -153,6 +154,7 @@ std::istream &safeGetline(std::istream &is, std::string &t) { // Read SNP file. A single column of SNP names. bool ReadFile_snps(const string file_snps, set<string> &setSnps) { debug_msg("enter ReadFile_snps"); + checkpoint("read-file-snps",file_snps); setSnps.clear(); igzstream infile(file_snps.c_str(), igzstream::in); @@ -182,6 +184,7 @@ bool ReadFile_snps(const string file_snps, set<string> &setSnps) { bool ReadFile_snps_header(const string &file_snps, set<string> &setSnps) { debug_msg("entered"); setSnps.clear(); + checkpoint("read-file-header",file_snps); igzstream infile(file_snps.c_str(), igzstream::in); if (!infile) { @@ -239,6 +242,7 @@ bool ReadFile_snps_header(const string &file_snps, set<string> &setSnps) { // Read log file. bool ReadFile_log(const string &file_log, double &pheno_mean) { debug_msg("ReadFile_log"); + checkpoint("read-file-log",file_log); ifstream infile(file_log.c_str(), ifstream::in); if (!infile) { cout << "error! 
fail to open log file: " << file_log << endl; @@ -282,6 +286,8 @@ bool ReadFile_anno(const string &file_anno, map<string, string> &mapRS2chr, map<string, long int> &mapRS2bp, map<string, double> &mapRS2cM) { debug_msg("ReadFile_anno"); + checkpoint("read-file-anno",file_anno); + mapRS2chr.clear(); mapRS2bp.clear(); @@ -345,6 +351,7 @@ bool ReadFile_anno(const string &file_anno, map<string, string> &mapRS2chr, bool ReadFile_column(const string &file_pheno, vector<int> &indicator_idv, vector<double> &pheno, const int &p_column) { debug_msg("entered"); + checkpoint("read-file-column",file_pheno); indicator_idv.clear(); pheno.clear(); @@ -389,6 +396,7 @@ bool ReadFile_pheno(const string &file_pheno, vector<vector<double>> &pheno, const vector<size_t> &p_column) { debug_msg("entered"); + checkpoint("read-file-pheno",file_pheno); indicator_pheno.clear(); pheno.clear(); @@ -398,35 +406,36 @@ bool ReadFile_pheno(const string &file_pheno, return false; } - string line; - char *ch_ptr; - - string id; + // string id; double p; vector<double> pheno_row; - vector<int> ind_pheno_row; + + string line; + char *ch_ptr; + + vector<int> indicator_pheno_row; size_t p_max = *max_element(p_column.begin(), p_column.end()); map<size_t, size_t> mapP2c; for (size_t i = 0; i < p_column.size(); i++) { mapP2c[p_column[i]] = i; pheno_row.push_back(-9); - ind_pheno_row.push_back(0); + indicator_pheno_row.push_back(0); } while (!safeGetline(infile, line).eof()) { ch_ptr = strtok((char *)line.c_str(), " ,\t"); size_t i = 0; while (i < p_max) { - enforce_msg(ch_ptr,"Number of phenotypes in pheno file do not match phenotypes in geno file"); + enforce_msg(ch_ptr,"Number of phenotypes in pheno file do not match selected columns"); if (mapP2c.count(i + 1) != 0) { if (strcmp(ch_ptr, "NA") == 0) { - ind_pheno_row[mapP2c[i + 1]] = 0; + indicator_pheno_row[mapP2c[i + 1]] = 0; // skip this trait row pheno_row[mapP2c[i + 1]] = -9; } else { p = atof(ch_ptr); - ind_pheno_row[mapP2c[i + 1]] = 1; + indicator_pheno_row[mapP2c[i + 1]] = 1; // use this trait row pheno_row[mapP2c[i + 1]] = p; } } @@ -434,7 +443,7 @@ bool ReadFile_pheno(const string &file_pheno, ch_ptr = strtok(NULL, " ,\t"); } - indicator_pheno.push_back(ind_pheno_row); + indicator_pheno.push_back(indicator_pheno_row); pheno.push_back(pheno_row); } @@ -448,6 +457,7 @@ bool ReadFile_pheno(const string &file_pheno, bool ReadFile_cvt(const string &file_cvt, vector<int> &indicator_cvt, vector<vector<double>> &cvt, size_t &n_cvt) { debug_msg("entered"); + checkpoint("read-file-cvt",file_cvt); indicator_cvt.clear(); ifstream infile(file_cvt.c_str(), ifstream::in); @@ -515,6 +525,7 @@ bool ReadFile_cvt(const string &file_cvt, vector<int> &indicator_cvt, // Read .bim file. bool ReadFile_bim(const string &file_bim, vector<SNPINFO> &snpInfo) { + checkpoint("read-file-bim",file_bim); debug_msg("entered"); snpInfo.clear(); @@ -563,6 +574,7 @@ bool ReadFile_bim(const string &file_bim, vector<SNPINFO> &snpInfo) { bool ReadFile_fam(const string &file_fam, vector<vector<int>> &indicator_pheno, vector<vector<double>> &pheno, map<string, int> &mapID2num, const vector<size_t> &p_column) { + checkpoint("read-file-fam",file_fam); debug_msg("entered"); indicator_pheno.clear(); pheno.clear(); @@ -639,16 +651,55 @@ bool ReadFile_fam(const string &file_fam, vector<vector<int>> &indicator_pheno, return true; } + // Read bimbam mean genotype file, the first time, to obtain #SNPs for // analysis (ns_test) and total #SNP (ns_total). 
-bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, - const gsl_matrix *W, vector<int> &indicator_idv, - vector<int> &indicator_snp, const double &maf_level, - const double &miss_level, const double &hwe_level, - const double &r2_level, map<string, string> &mapRS2chr, - map<string, long int> &mapRS2bp, - map<string, double> &mapRS2cM, vector<SNPINFO> &snpInfo, - size_t &ns_test) { +/* +## Modified (Mutated) Parameters: + +1. **`indicator_idv`** (vector<int>&) + - **Not modified** - only read from to determine which individuals to include + +2. **`indicator_snp`** (vector<int>&) + - **Modified**: Cleared at the start (`indicator_snp.clear()`) + - Populated with 0s and 1s to indicate whether each SNP passes filters + - One entry per SNP in the genotype file + +3. **`mapRS2chr`** (map<string, string>&) + - **Not modified** - only read from to get chromosome info + +4. **`mapRS2bp`** (map<string, long int>&) + - **Not modified** - only read from to get base pair positions + +5. **`mapRS2cM`** (map<string, double>&) + - **Not modified** - only read from to get centimorgan positions + +6. **`snpInfo`** (vector<SNPINFO>&) + - **Modified**: Cleared at the start (`snpInfo.clear()`) + - Populated with SNPINFO structures for each SNP containing metadata (chr, rs, position, alleles, missingness, MAF, etc.) + +7. **`ns_test`** (size_t&) + - **Modified**: Set to 0 initially + - Incremented for each SNP that passes all filters + - Returns the count of SNPs that will be used for testing + +## Summary of Mutated Outputs: +- **`indicator_snp`**: Binary indicators for which SNPs passed filtering +- **`snpInfo`**: Complete metadata for all SNPs in the file +- **`ns_test`**: Count of SNPs passing filters + +The function returns a `bool` indicating success/failure, but the real outputs are these three modified reference parameters. +*/ + +bool ReadFile_bimbam_geno(const string &file_geno, const set<string> &setSnps, + const gsl_matrix *W, const vector<int> &indicator_idv, + vector<int> &indicator_snp, const double &maf_level, + const double &miss_level, const double &hwe_level, + const double &r2_level, map<string, string> &mapRS2chr, + map<string, long int> &mapRS2bp, + map<string, double> &mapRS2cM, vector<SNPINFO> &snpInfo, + size_t &ns_test) { + checkpoint("read-file-geno",file_geno); debug_msg("entered"); indicator_snp.clear(); snpInfo.clear(); @@ -661,7 +712,7 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, gsl_vector *genotype = gsl_vector_safe_alloc(W->size1); gsl_vector *genotype_miss = gsl_vector_safe_alloc(W->size1); - gsl_matrix *WtW = gsl_matrix_safe_alloc(W->size2, W->size2); + gsl_matrix *WtW = gsl_matrix_safe_alloc(W->size2, W->size2); // Covariates gsl_matrix *WtWi = gsl_matrix_safe_alloc(W->size2, W->size2); gsl_vector *Wtx = gsl_vector_safe_alloc(W->size2); gsl_vector *WtWiWtx = gsl_vector_safe_alloc(W->size2); @@ -687,9 +738,8 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, double cM; size_t file_pos; - double maf, geno, geno_old; + double geno, geno_old; size_t n_miss; - size_t n_0, n_1, n_2; double min_g = std::numeric_limits<float>::max(); double max_g = std::numeric_limits<float>::min(); @@ -700,6 +750,7 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, for (int i = 0; i < ni_total; ++i) { ni_test += indicator_idv[i]; } + // assert(ni_test == ni_total); // we use indicator_idv? 
ns_test = 0; file_pos = 0; @@ -709,13 +760,14 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, // cerr << key << endl; // } while (!safe_get_line(infile, line).eof()) { - // cout << line; + // fetch SNP annotation ch_ptr = strtok_safe2((char *)line.c_str(), " ,\t",infilen); rs = ch_ptr; ch_ptr = strtok_safe2(NULL, " ,\t",infilen); minor = ch_ptr; ch_ptr = strtok_safe2(NULL, " ,\t",infilen); major = ch_ptr; + // genotypes get read below if (setSnps.size() != 0 && setSnps.count(rs) == 0) { // if SNP in geno but not in -snps we add an missing value @@ -745,21 +797,21 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, } // Start on a new marker/SNP - maf = 0; n_miss = 0; flag_poly = 0; geno_old = -9; - n_0 = 0; - n_1 = 0; - n_2 = 0; c_idv = 0; gsl_vector_set_zero(genotype_miss); + double maf = 0.0; + size_t n_0=0,n_1=0,n_2=0; + auto infilen = file_geno.c_str(); - for (int i = 0; i < ni_total; ++i) { + for (int i = 0; i < ni_total; ++i) { // read genotypes ch_ptr = strtok_safe2(NULL, " ,\t",(string("To many individuals/strains/genometypes in pheno file for ")+infilen).c_str()); if (indicator_idv[i] == 0) continue; + // annotate missing genotypes enforce_msg(ch_ptr,"Problem reading geno file (not enough genotypes in line)"); if (strcmp(ch_ptr, "NA") == 0) { gsl_vector_set(genotype_miss, c_idv, 1); @@ -769,19 +821,12 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, } geno = atof(ch_ptr); - if (geno >= 0 && geno <= 0.5) { - n_0++; - } - if (geno > 0.5 && geno < 1.5) { - n_1++; - } - if (geno >= 1.5 && geno <= 2.0) { - n_2++; - } gsl_vector_set(genotype, c_idv, geno); - if (geno < min_g) min_g = geno; - if (geno > max_g) max_g = geno; + if (geno < min_g) + min_g = geno; + if (geno > max_g) + max_g = geno; // going through genotypes with 0.0 < geno < 2.0 if (flag_poly == 0) { // first init in marker @@ -792,10 +837,20 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, flag_poly = 1; // genotypes differ } + // compute ratios + if (hwe_level != 0 && maf_level != -1) { + if (geno >= 0 && geno <= 0.5) + n_0++; + if (geno > 0.5 && geno < 1.5) + n_1++; + if (geno >= 1.5 && geno <= 2.0) + n_2++; + } maf += geno; c_idv++; } + maf /= 2.0 * (double)(ni_test - n_miss); SNPINFO sInfo = {chr, rs, @@ -832,8 +887,6 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, continue; } } - - // -r2 flag for (size_t i = 0; i < genotype->size; ++i) { if (gsl_vector_get(genotype_miss, i) == 1) { @@ -841,7 +894,6 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, gsl_vector_set(genotype, i, geno); } } - gsl_blas_dgemv(CblasTrans, 1.0, W, genotype, 0.0, Wtx); gsl_blas_dgemv(CblasNoTrans, 1.0, WtWi, Wtx, 0.0, WtWiWtx); gsl_blas_ddot(genotype, genotype, &v_x); @@ -863,7 +915,7 @@ bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, if (max_g != 2.0) warning_msg("The maximum genotype value is not 2.0 - this is not the BIMBAM standard and will skew l_lme and effect sizes"); - gsl_vector_free(genotype); + gsl_vector_free(genotype); // we don't hang on to the genotypes gsl_vector_free(genotype_miss); gsl_matrix_free(WtW); gsl_matrix_free(WtWi); @@ -885,6 +937,7 @@ bool ReadFile_bed(const string &file_bed, const set<string> &setSnps, const double &maf_level, const double &miss_level, const double &hwe_level, const double &r2_level, size_t &ns_test) { + checkpoint("read-file-bed",file_bed); debug_msg("entered"); indicator_snp.clear(); size_t ns_total = snpInfo.size(); @@ 
-1193,6 +1246,7 @@ void Plink_ReadOneSNP(const int pos, const vector<int> &indicator_idv, void ReadFile_kin(const string &file_kin, vector<int> &indicator_idv, map<string, int> &mapID2num, const size_t k_mode, bool &error, gsl_matrix *G) { + checkpoint("read-file-kin",file_kin); debug_msg("entered"); igzstream infile(file_kin.c_str(), igzstream::in); if (!infile) { @@ -1296,7 +1350,7 @@ void ReadFile_kin(const string &file_kin, vector<int> &indicator_idv, infile.close(); infile.clear(); - checkpoint("read-kinship-file",file_kin); + checkpoint("end-read-file-kin",file_kin); return; } @@ -1422,6 +1476,197 @@ void ReadFile_eigenD(const string &file_kd, bool &error, gsl_vector *eval) { return; } + +// Read bimbam mean genotype file and calculate kinship matrix. +gsl_matrix *mdb_calc_kin(const string file_geno, const int k_mode, const string loco) { + checkpoint("mdb-kin",file_geno); + bool is_loco = !loco.empty(); + + // Convert loco string to what we use in the chrpos index + uint8_t loco_chr = 0; + if (is_loco) { + if (loco == "X") { + loco_chr = CHR_X; + } else if (loco == "Y") { + loco_chr = CHR_Y; + } else if (loco == "M") { + loco_chr = CHR_M; + } else { + try { + loco_chr = static_cast<uint8_t>(stoi(loco)); + } catch (...) { + loco_chr = 0; + } + } + } + + /* Create and open the LMDB environment: */ + auto env = lmdb::env::create(); + + env.set_mapsize(1UL * 1024UL * 1024UL * 1024UL * 1024UL); /* 10 GiB */ + env.set_max_dbs(10); + env.open(file_geno.c_str(), MDB_RDONLY | MDB_NOSUBDIR, 0664); + string format; + + size_t ni_total = 0, num_markers = 0; + { + auto rtxn = lmdb::txn::begin(env, nullptr, MDB_RDONLY); + auto info = lmdb::dbi::open(rtxn, "info"); + + string_view meta; + if (info.get(rtxn, "meta", meta)) { + cout << meta << endl; + } else { + cout << "meta not found!" 
<< endl; + } + std::string_view view_int; + info.get(rtxn, "numsamples", view_int); + ni_total = lmdb::from_sv<size_t>(view_int); + string_view info_key,info_value; + info.get(rtxn,"format",info_value); + format = string(info_value); + + MDB_stat stat; + mdb_stat(rtxn, info, &stat); + assert(stat.ms_entries == 5); + } + + auto rtxn = lmdb::txn::begin(env, nullptr, MDB_RDONLY); + + auto geno_mdb = lmdb::dbi::open(rtxn, "geno"); + + MDB_stat stat; + mdb_stat(rtxn, geno_mdb, &stat); + // cout << "Number of records: " << stat.ms_entries << endl; + num_markers = stat.ms_entries; + + gsl_matrix *matrix_kin = gsl_matrix_safe_alloc(ni_total,ni_total); + gsl_matrix_set_zero(matrix_kin); + gsl_vector *geno = gsl_vector_safe_alloc(ni_total); + gsl_vector *geno_miss = gsl_vector_safe_alloc(ni_total); + + // Xlarge contains inds x markers + gsl_matrix *Xlarge = gsl_matrix_safe_alloc(ni_total, K_BATCH_SIZE); + enforce_msg(Xlarge, "allocate Xlarge ni_total x K_BATCH_SIZE"); + if (is_check_mode()) + gsl_matrix_set_zero(Xlarge); + + // [ ] check XLarge is zeroed properly + // [ ] handle missing data + + // For every marker read the genotype x inds + size_t ns_test = 0; + size_t display_pace = 1000; + + cout << "## number of total individuals = " << ni_total << endl; + cout << "## number of analyzed individuals = " << ni_total << endl; + cout << "## number of analyzed SNPs/var = " << num_markers << endl; + + auto cursor = lmdb::cursor::open(rtxn, geno_mdb); + + auto mdb_fetch = MDB_FIRST; + for (size_t t = 0; t < num_markers; ++t) { + string line; + if (t % display_pace == 0) { + ProgressBar("Reading SNPs", t, num_markers - 1); + } + string_view key, value; + cursor.get(key, value, mdb_fetch); + mdb_fetch = MDB_NEXT; + + const uint8_t* data = reinterpret_cast<const uint8_t*>(key.data()); + auto chr = static_cast<uint8_t>(data[1]); + + if (is_loco && loco_chr == chr) + continue; + + // ---- Depending on the format we get different buffers - currently float and byte are supported: + vector<double> gs; + if (format == "Gb") { + size_t num_bytes = value.size() / sizeof(uint8_t); + assert(num_bytes == ni_total); + auto size = num_bytes; + const uint8_t* gs_bbuf = reinterpret_cast<const uint8_t*>(value.data()); + gs.reserve(size); + for (size_t i = 0; i < size; ++i) { + double g = static_cast<double>(gs_bbuf[i])/127.0; + gs.push_back(g); + } + } else { + size_t num_floats = value.size() / sizeof(float); + assert(num_floats == ni_total); + auto size = num_floats; + const float* gs_fbuf = reinterpret_cast<const float*>(value.data()); + gs.reserve(size); + for (size_t i = 0; i < size; ++i) { + gs.push_back(static_cast<double>(gs_fbuf[i])); + } + } + + size_t n_miss = 0; + double geno_mean = 0.0; + double geno_var = 0.0; + gsl_vector_set_all(geno_miss, 0); + + for (size_t i = 0; i < ni_total; ++i) { + double d = gs[i]; + gsl_vector_set(geno, i, d); + gsl_vector_set(geno_miss, i, 1); + geno_mean += d; + geno_var += d * d; + } + + geno_mean /= (double)(ni_total - n_miss); + geno_var += geno_mean * geno_mean * (double)n_miss; + geno_var /= (double)ni_total; + geno_var -= geno_mean * geno_mean; + + // impute missing values by plugging in the mean + for (size_t i = 0; i < ni_total; ++i) { + if (gsl_vector_get(geno_miss, i) == 0) { + gsl_vector_set(geno, i, geno_mean); + } + } + + // subtract the mean (centering genotype values) + gsl_vector_add_constant(geno, -1.0 * geno_mean); + + // scale the genotypes + if (k_mode == 2 && geno_var != 0) { // some confusion here, -gk 2 + // flag does this + gsl_vector_scale(geno, 1.0 / 
sqrt(geno_var)); + } + + // set the SNP column ns_test to copy geno into the compute matrix Xlarge + gsl_vector_view Xlarge_col = gsl_matrix_column(Xlarge, ns_test % K_BATCH_SIZE); + enforce_gsl(gsl_vector_memcpy(&Xlarge_col.vector, geno)); + + ns_test++; + + // Every K_BATCH_SIZE rows batch compute kinship matrix and return + // by adding to matrix_kin + if (ns_test % K_BATCH_SIZE == 0) { + fast_eigen_dgemm("N", "T", 1.0, Xlarge, Xlarge, 1.0, matrix_kin); + gsl_matrix_set_zero(Xlarge); + write(matrix_kin,"K updated"); + } + } + if (ns_test % K_BATCH_SIZE != 0) { // compute last batch + fast_eigen_dgemm("N", "T", 1.0, Xlarge, Xlarge, 1.0, matrix_kin); + write(matrix_kin,"K updated"); + } + ProgressBar("Reading SNPs", num_markers, num_markers); + cout << endl; + + enforce_gsl(gsl_matrix_scale(matrix_kin, 1.0 / (double)ns_test)); + + gsl_vector_free(geno); + gsl_vector_free(geno_miss); + gsl_matrix_free(Xlarge); + + return matrix_kin; +} + // Read bimbam mean genotype file and calculate kinship matrix. bool BimbamKin(const string file_geno, const set<string> ksnps, vector<int> &indicator_snp, const int k_mode, @@ -1431,6 +1676,7 @@ bool BimbamKin(const string file_geno, const set<string> ksnps, auto infilen = file_geno.c_str(); igzstream infile(infilen, igzstream::in); enforce_msg(infilen, "error reading genotype file"); + checkpoint("bimbam-kin",file_geno); size_t n_miss; double geno_mean, geno_var; @@ -1752,12 +1998,14 @@ bool PlinkKin(const string &file_bed, vector<int> &indicator_snp, bool ReadFile_geno(const string file_geno, vector<int> &indicator_idv, vector<int> &indicator_snp, gsl_matrix *UtX, gsl_matrix *K, const bool calc_K) { + checkpoint("read-file-geno",file_geno); debug_msg("entered"); igzstream infile(file_geno.c_str(), igzstream::in); if (!infile) { cout << "error reading genotype file:" << file_geno << endl; return false; } + checkpoint("read-geno-file",file_geno); string line; char *ch_ptr; @@ -1852,121 +2100,12 @@ bool ReadFile_geno(const string file_geno, vector<int> &indicator_idv, return true; } -// Compact version of the above function, using uchar instead of -// gsl_matrix. 
-bool ReadFile_geno(const string &file_geno, vector<int> &indicator_idv, - vector<int> &indicator_snp, - vector<vector<unsigned char>> &Xt, gsl_matrix *K, - const bool calc_K, const size_t ni_test, - const size_t ns_test) { - debug_msg("entered"); - igzstream infile(file_geno.c_str(), igzstream::in); - if (!infile) { - cout << "error reading genotype file:" << file_geno << endl; - return false; - } - - Xt.clear(); - vector<unsigned char> Xt_row; - for (size_t i = 0; i < ni_test; i++) { - Xt_row.push_back(0); - } - - string line; - char *ch_ptr; - - if (calc_K == true) { - gsl_matrix_set_zero(K); - } - - gsl_vector *genotype = gsl_vector_safe_alloc(ni_test); - gsl_vector *genotype_miss = gsl_vector_safe_alloc(ni_test); - double geno, geno_mean; - size_t n_miss; - - size_t ni_total = indicator_idv.size(); - size_t ns_total = indicator_snp.size(); - - size_t c_idv = 0, c_snp = 0; - - auto infilen = file_geno.c_str(); - for (size_t i = 0; i < ns_total; ++i) { - safeGetline(infile, line).eof(); - if (indicator_snp[i] == 0) { - continue; - } - - ch_ptr = strtok_safe2((char *)line.c_str(), " ,\t",infilen); - ch_ptr = strtok_safe2(NULL, " ,\t",infilen); - ch_ptr = strtok_safe2(NULL, " ,\t",infilen); - - c_idv = 0; - geno_mean = 0; - n_miss = 0; - gsl_vector_set_zero(genotype_miss); - for (uint j = 0; j < ni_total; ++j) { - ch_ptr = strtok_safe2(NULL, " ,\t",infilen); - if (indicator_idv[j] == 0) { - continue; - } - - if (strcmp(ch_ptr, "NA") == 0) { - gsl_vector_set(genotype_miss, c_idv, 1); - n_miss++; - } else { - geno = atof(ch_ptr); - gsl_vector_set(genotype, c_idv, geno); - geno_mean += geno; - } - c_idv++; - } - - geno_mean /= (double)(ni_test - n_miss); - - for (size_t j = 0; j < genotype->size; ++j) { - if (gsl_vector_get(genotype_miss, j) == 1) { - geno = geno_mean; - } else { - geno = gsl_vector_get(genotype, j); - } - - Xt_row[j] = Double02ToUchar(geno); - gsl_vector_set(genotype, j, (geno - geno_mean)); - } - Xt.push_back(Xt_row); - - if (calc_K == true) { - gsl_blas_dsyr(CblasUpper, 1.0, genotype, K); - } - - c_snp++; - } - - if (calc_K == true) { - gsl_matrix_scale(K, 1.0 / (double)ns_test); - - for (size_t i = 0; i < genotype->size; ++i) { - for (size_t j = 0; j < i; ++j) { - geno = gsl_matrix_get(K, j, i); - gsl_matrix_set(K, i, j, geno); - } - } - } - - gsl_vector_free(genotype); - gsl_vector_free(genotype_miss); - - infile.clear(); - infile.close(); - - return true; -} - // Read bimbam mean genotype file, the second time, recode "mean" // genotype and calculate K. 
bool ReadFile_bed(const string &file_bed, vector<int> &indicator_idv, vector<int> &indicator_snp, gsl_matrix *UtX, gsl_matrix *K, const bool calc_K) { + checkpoint("read-file-bed",file_bed); debug_msg("entered"); ifstream infile(file_bed.c_str(), ios::binary); if (!infile) { @@ -2100,6 +2239,7 @@ bool ReadFile_bed(const string &file_bed, vector<int> &indicator_idv, vector<int> &indicator_snp, vector<vector<unsigned char>> &Xt, gsl_matrix *K, const bool calc_K, const size_t ni_test, const size_t ns_test) { + checkpoint("read-file-bed",file_bed); debug_msg("entered"); ifstream infile(file_bed.c_str(), ios::binary); if (!infile) { @@ -2236,6 +2376,7 @@ bool ReadFile_bed(const string &file_bed, vector<int> &indicator_idv, bool ReadFile_est(const string &file_est, const vector<size_t> &est_column, map<string, double> &mapRS2est) { + checkpoint("read-file-est",file_est); debug_msg("entered"); mapRS2est.clear(); @@ -2303,7 +2444,8 @@ bool ReadFile_est(const string &file_est, const vector<size_t> &est_column, } bool CountFileLines(const string &file_input, size_t &n_lines) { - debug_msg("entered"); + checkpoint("count-lines",file_input); + igzstream infile(file_input.c_str(), igzstream::in); if (!infile) { cout << "error! fail to open file: " << file_input << endl; @@ -2320,6 +2462,7 @@ bool CountFileLines(const string &file_input, size_t &n_lines) { // Read gene expression file. bool ReadFile_gene(const string &file_gene, vector<double> &vec_read, vector<SNPINFO> &snpInfo, size_t &ng_total) { + checkpoint("read-file-gene",file_gene); debug_msg("entered"); vec_read.clear(); ng_total = 0; diff --git a/src/gemma_io.h b/src/gemma_io.h index dd1d5c0..06cd0ee 100644 --- a/src/gemma_io.h +++ b/src/gemma_io.h @@ -26,12 +26,17 @@ #include <algorithm> #include <map> #include <vector> +#include <cstdint> #include "gzstream.h" #include "param.h" #define tab(col) ( col ? 
"\t" : "") +constexpr uint8_t CHR_X = 'X'; +constexpr uint8_t CHR_Y = 'Y'; +constexpr uint8_t CHR_M = 'M'; + using namespace std; void ProgressBar(string str, double p, double total, double ratio = -1.0); @@ -59,8 +64,8 @@ bool ReadFile_pheno(const string &file_pheno, bool ReadFile_column(const string &file_pheno, vector<int> &indicator_idv, vector<double> &pheno, const int &p_column); -bool ReadFile_geno(const string &file_geno, const set<string> &setSnps, - const gsl_matrix *W, vector<int> &indicator_idv, +bool ReadFile_bimbam_geno(const string &file_geno, const set<string> &setSnps, + const gsl_matrix *W, const vector<int> &indicator_idv, vector<int> &indicator_snp, const double &maf_level, const double &miss_level, const double &hwe_level, const double &r2_level, map<string, string> &mapRS2chr, @@ -87,6 +92,8 @@ void ReadFile_mk(const string &file_mk, vector<int> &indicator_idv, void ReadFile_eigenU(const string &file_u, bool &error, gsl_matrix *U); void ReadFile_eigenD(const string &file_d, bool &error, gsl_vector *eval); +gsl_matrix *mdb_calc_kin(const string file_geno, const int k_mode, const string loco); + bool BimbamKin(const string file_geno, const set<string> ksnps, vector<int> &indicator_snp, const int k_mode, const int display_pace, gsl_matrix *matrix_kin, @@ -100,11 +107,6 @@ bool ReadFile_geno(const string file_geno, vector<int> &indicator_idv, bool ReadFile_bed(const string &file_bed, vector<int> &indicator_idv, vector<int> &indicator_snp, gsl_matrix *UtX, gsl_matrix *K, const bool calc_K); -bool ReadFile_geno(const string &file_geno, vector<int> &indicator_idv, - vector<int> &indicator_snp, - vector<vector<unsigned char>> &Xt, gsl_matrix *K, - const bool calc_K, const size_t ni_test, - const size_t ns_test); bool ReadFile_bed(const string &file_bed, vector<int> &indicator_idv, vector<int> &indicator_snp, vector<vector<unsigned char>> &Xt, gsl_matrix *K, const bool calc_K, const size_t ni_test, diff --git a/src/guile/examples.scm b/src/guile/examples.scm new file mode 100644 index 0000000..8855a34 --- /dev/null +++ b/src/guile/examples.scm @@ -0,0 +1,6 @@ +(define-module (example) + #:use-module (srfi srfi-4) + #:export (make-floats)) + +(define (make-floats) + #f64( 1.0 2.2 3.3 4.0)) diff --git a/src/guile_api.cpp b/src/guile_api.cpp new file mode 100644 index 0000000..e3ef482 --- /dev/null +++ b/src/guile_api.cpp @@ -0,0 +1,44 @@ +#include <guile_api.h> + +namespace gemmaguile { + + void global_start_guile() { + scm_init_guile(); + } + + string global_guile_version() { + SCM version_scm = scm_version(); + char* c_str = scm_to_utf8_string(version_scm); + string version_str(c_str); + free(c_str); // Must free the allocated memory + + return version_str; + } + + SCM make_floats() + { + return scm_call_0(scm_c_public_ref("example", "make-floats")); + } + + void use_floats() + { + SCM vec = make_floats(); + + scm_t_array_handle handle; + size_t len; + ssize_t inc; + const SCM *elt; + + const double *ny = scm_f64vector_elements (vec,&handle,&len,&inc); + for (size_t i = 0; i < len; i++) { + printf("elem %zu = %f\n", i, ny[i]); + } + scm_array_handle_release (&handle); + } + + void start_test() { + scm_c_primitive_load("src/guile/examples.scm"); + use_floats(); + } + +} diff --git a/src/guile_api.h b/src/guile_api.h new file mode 100644 index 0000000..b413342 --- /dev/null +++ b/src/guile_api.h @@ -0,0 +1,14 @@ +#pragma once + +#include <string> +#include <libguile.h> + +using namespace std; + +namespace gemmaguile { + + void global_start_guile(); + string 
global_guile_version(); + void start_test(); + +}; diff --git a/src/lapack.cpp b/src/lapack.cpp index e1a63e1..2d10b18 100644 --- a/src/lapack.cpp +++ b/src/lapack.cpp @@ -196,7 +196,7 @@ void lapack_eigen_symmv(gsl_matrix *A, gsl_vector *eval, gsl_matrix *evec, // disable fast NaN checking for now - dsyevr throws NaN errors, // but fixes them (apparently) - if (is_check_mode()) disable_segfpe(); + // if (is_check_mode()) disable_segfpe(); // DSYEVR - computes selected eigenvalues and, optionally, // eigenvectors of a real symmetric matrix @@ -223,7 +223,7 @@ void lapack_eigen_symmv(gsl_matrix *A, gsl_vector *eval, gsl_matrix *evec, if (INFO != 0) cerr << "ERROR: value of INFO is " << INFO; enforce_msg(INFO == 0, "lapack_eigen_symmv failed"); - if (is_check_mode()) enable_segfpe(); // reinstate fast NaN checking + // if (is_check_mode()) enable_segfpe(); // reinstate fast NaN checking gsl_matrix_transpose(evec); diff --git a/src/lmm.cpp b/src/lmm.cpp index 85e92fe..2b730f3 100644 --- a/src/lmm.cpp +++ b/src/lmm.cpp @@ -2,7 +2,7 @@ Genome-wide Efficient Mixed Model Association (GEMMA) Copyright © 2011-2017, Xiang Zhou Copyright © 2017, Peter Carbonetto - Copyright © 2017-2022 Pjotr Prins + Copyright © 2017-2025 Pjotr Prins This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -40,10 +40,14 @@ #include "gsl/gsl_min.h" #include "gsl/gsl_roots.h" #include "gsl/gsl_vector.h" +#include <lmdb++.h> +// #include <lmdb.h> +#include <sys/mman.h> #include "gzstream.h" #include "gemma.h" #include "gemma_io.h" +#include "checkpoint.h" #include "fastblas.h" #include "lapack.h" #include "lmm.h" @@ -54,6 +58,7 @@ using namespace std; void LMM::CopyFromParam(PARAM &cPar) { + checkpoint_nofile("lmm-copy-from-param"); a_mode = cPar.a_mode; d_pace = cPar.d_pace; @@ -90,10 +95,12 @@ void LMM::CopyFromParam(PARAM &cPar) { } void LMM::CopyToParam(PARAM &cPar) { + checkpoint_nofile("lmm-copy-to-param"); cPar.time_UtX = time_UtX; cPar.time_opt = time_opt; - - cPar.ng_test = ng_test; + cPar.ns_total = ns_total; + cPar.ns_test = ns_test; + cPar.ng_test = ng_test; // number of markers tested return; } @@ -104,10 +111,11 @@ void LMM::WriteFiles() { file_str = path_out + "/" + file_out; file_str += ".assoc.txt"; + checkpoint("lmm-write-files",file_str); ofstream outfile(file_str.c_str(), ofstream::out); if (!outfile) { - cout << "error writing file: " << file_str.c_str() << endl; + cout << "error writing file: " << file_str << endl; return; } @@ -147,7 +155,7 @@ void LMM::WriteFiles() { outfile << scientific << setprecision(6); if (a_mode != M_LMM2) { - outfile << st.beta << "\t"; + outfile << scientific << st.beta << "\t"; outfile << st.se << "\t"; } @@ -201,21 +209,29 @@ void LMM::WriteFiles() { common_header(); - size_t t = 0; - for (size_t i = 0; i < snpInfo.size(); ++i) { - if (indicator_snp[i] == 0) - continue; - auto snp = snpInfo[i].rs_number; - if (process_gwasnps && setGWASnps.count(snp) == 0) - continue; - // cout << t << endl; - outfile << snpInfo[i].chr << "\t" << snpInfo[i].rs_number << "\t" - << snpInfo[i].base_position << "\t" << snpInfo[i].n_miss << "\t" - << snpInfo[i].a_minor << "\t" << snpInfo[i].a_major << "\t" - << fixed << setprecision(3) << snpInfo[i].maf << "\t"; - - sumstats(sumStat[t]); - t++; + if (snpInfo.size()) { + size_t t = 0; + for (size_t i = 0; i < snpInfo.size(); ++i) { + if (indicator_snp[i] == 0) + continue; + auto snp = snpInfo[i].rs_number; + if (process_gwasnps && setGWASnps.count(snp) == 0) + 
continue; + // cout << t << endl; + outfile << snpInfo[i].chr << "\t" << snpInfo[i].rs_number << "\t" + << snpInfo[i].base_position << "\t" << snpInfo[i].n_miss << "\t" + << snpInfo[i].a_minor << "\t" << snpInfo[i].a_major << "\t" + << fixed << setprecision(3) << snpInfo[i].maf << "\t"; + + sumstats(sumStat[t]); + t++; + } + } + else + { + for (auto &s : sumStat) { + sumstats(s); + } } } @@ -1210,6 +1226,142 @@ void LMM::CalcRLScore(const double l, const FUNC_PARAM ¶ms, double &beta, gsl_vector_safe_free(v_temp); } +/* + ## What `CalcUab` does: + +`CalcUab` **computes element-wise products of transformed covariates and phenotypes** to create sufficient statistics needed for linear mixed model calculations. + +### Purpose: + +Creates a matrix `Uab` containing all pairwise element-wise products of: +- Transformed covariates: columns of `UtW` (U^T × W) +- Transformed phenotype: vector `Uty` (U^T × y) + +These products are used later to efficiently compute variance components and test statistics without repeatedly accessing the original data. + +### Function Signature: +```cpp +void CalcUab(const gsl_matrix *UtW, // U^T × W (transformed covariates) + const gsl_vector *Uty, // U^T × y (transformed phenotype) + gsl_matrix *Uab) // OUTPUT: matrix of products +``` + +### Matrix Structure: + +Given: +- **n_cvt** covariates (columns in W) +- 1 phenotype (y) + +The function creates products for indices `a` and `b` where: +- `a, b ∈ {1, 2, ..., n_cvt, n_cvt+2}` (note: n_cvt+1 is **skipped**) +- Only computes for `b ≤ a` (lower triangular, symmetric) + +**Index mapping:** +- `1` to `n_cvt`: Covariate columns from UtW +- `n_cvt+1`: **SKIPPED** (reserved for genotype, added later) +- `n_cvt+2`: Phenotype (Uty) + +### Algorithm Walkthrough: + +```cpp +for (size_t a = 1; a <= n_cvt + 2; ++a) { + // Get column 'a' + if (a == n_cvt + 2) { + u_a = Uty; // Phenotype + } else { + u_a = UtW[:, a-1]; // Covariate a + } + + for (size_t b = a; b >= 1; --b) { + // Get column 'b' + if (b == n_cvt + 2) { + u_b = Uty; // Phenotype + } else { + u_b = UtW[:, b-1]; // Covariate b + } + + // Store element-wise product: u_a .* u_b + index_ab = GetabIndex(a, b, n_cvt); + Uab[:, index_ab] = u_a .* u_b; + } +} +``` + +### Example: + +With **n_cvt = 2** (2 covariates): + +**Columns stored in Uab:** +``` +GetabIndex(1, 1, 2) → UtW₁ .* UtW₁ (covariate 1 × covariate 1) +GetabIndex(2, 1, 2) → UtW₂ .* UtW₁ (covariate 2 × covariate 1) +GetabIndex(2, 2, 2) → UtW₂ .* UtW₂ (covariate 2 × covariate 2) +GetabIndex(4, 1, 2) → Uty .* UtW₁ (phenotype × covariate 1) +GetabIndex(4, 2, 2) → Uty .* UtW₂ (phenotype × covariate 2) +GetabIndex(4, 4, 2) → Uty .* Uty (phenotype × phenotype) + +Note: Index 3 (n_cvt+1) is skipped - reserved for genotype +``` + +### Why Skip `n_cvt+1`? + +```cpp +if (a == n_cvt + 1) continue; +if (b == n_cvt + 1) continue; +``` + +The slot `n_cvt+1` is **reserved for the genotype (SNP)** being tested, which is added by the overloaded version: + +```cpp +void CalcUab(const gsl_matrix *UtW, const gsl_vector *Uty, + const gsl_vector *Utx, // ← GENOTYPE + gsl_matrix *Uab); +``` + +This design allows the base covariate products to be computed once, then genotype-specific products added for each SNP. 
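A minimal, self-contained sketch of the same "pairwise element-wise products" idea, using plain `std::vector` instead of GEMMA's `gsl_matrix`/`GetabIndex` machinery and without the reserved genotype slot (names and numbers are made up for illustration):

```cpp
#include <cstdio>
#include <vector>

// Element-wise product of two equal-length columns.
static std::vector<double> colprod(const std::vector<double> &a,
                                   const std::vector<double> &b) {
  std::vector<double> out(a.size());
  for (size_t i = 0; i < a.size(); ++i)
    out[i] = a[i] * b[i];
  return out;
}

int main() {
  // Two "transformed covariates" (think UtW columns) and one
  // "transformed phenotype" (think Uty), each of length 3.
  std::vector<std::vector<double>> cols = {
      {1.0, 1.0, 1.0},   // covariate 1 (intercept)
      {0.5, 2.0, 1.5},   // covariate 2
      {0.1, 0.4, 0.9}};  // phenotype

  // Store every product u_a .* u_b with b <= a (lower triangle incl. diagonal).
  std::vector<std::vector<double>> Uab;
  for (size_t a = 0; a < cols.size(); ++a)
    for (size_t b = 0; b <= a; ++b)
      Uab.push_back(colprod(cols[a], cols[b]));

  printf("stored %zu product columns\n", Uab.size()); // 3*(3+1)/2 = 6
  return 0;
}
```

The real `CalcUab` instead writes each product into a `gsl_matrix` column selected by `GetabIndex(a, b, n_cvt)` and deliberately leaves the `n_cvt+1` slot empty for the per-SNP genotype products.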
+ +### Index Function: + +```cpp +size_t GetabIndex(const size_t a, const size_t b, const size_t n_cvt) +``` + +Maps `(a, b)` pairs to column indices in `Uab`, ensuring: +- Only lower triangular entries (b ≤ a) +- Skips the genotype position (n_cvt+1) + +### Matrix Dimensions: + +```cpp +size_t n_index = (n_cvt + 2 + 1) * (n_cvt + 2) / 2; +``` + +Number of unique products in the symmetric matrix (including diagonal), accounting for the skipped genotype position. + +### In Context (from `batch_compute`): + +```cpp +// Pre-compute covariate products once +CalcUab(UtW, Uty, Uab); + +// For each SNP: +for (size_t i = 0; i < batch_size; i++) { + gsl_vector_view UtXlarge_col = gsl_matrix_column(UtXlarge, i); + gsl_vector_safe_memcpy(Utx, &UtXlarge_col.vector); + + // Add genotype-specific products + CalcUab(UtW, Uty, Utx, Uab); // Overloaded version + + // Use Uab for likelihood calculations + CalcLambda(..., Uab, ...); +} +``` + +### Summary: + +**`CalcUab` pre-computes all pairwise element-wise products** of transformed covariates and phenotype (UtW columns and Uty). These products are sufficient statistics used repeatedly in likelihood calculations for different SNPs, avoiding redundant computation. The design reserves space for genotype products to be added later per-SNP while reusing the fixed covariate products. +*/ + void CalcUab(const gsl_matrix *UtW, const gsl_vector *Uty, gsl_matrix *Uab) { size_t index_ab; size_t n_cvt = UtW->size2; @@ -1470,6 +1622,53 @@ void LMM::AnalyzeGene(const gsl_matrix *U, const gsl_vector *eval, return; } +/* +Looking at the LMM::Analyze function, here are the parameters that get modified: +Modified (Mutated) Parameters: + +None of the function parameters are directly modified. + +All parameters are either: + + const gsl_matrix * - pointer to const (read-only) + const gsl_vector * - pointer to const (read-only) + const set<string> - const set passed by value (copy) + std::function<...>& - reference to a function, which is called but not modified itself + +What Actually Gets Modified: + +The function modifies member variables of the LMM class object: + + sumStat (member variable) + Modified: sumStat.push_back(SNPs) is called in the batch_compute lambda + Accumulates summary statistics (beta, se, lambda values, p-values) for each SNP that passes filters + This is the primary output of the analysis + Timing member variables (implied by member access): + time_UtX: time_UtX += (clock() - time_start) / ... + time_opt: time_opt += (clock() - time_start) / ... + These accumulate timing information for performance profiling + Member variables read but not modified: + indicator_snp - read to determine which SNPs to process + indicator_idv - read to determine which individuals to include + ni_total - number of total individuals + ni_test - number of individuals in test + n_cvt - number of covariates + l_mle_null, l_min, l_max, n_region, logl_mle_H0 - parameters for likelihood calculations + a_mode - analysis mode (M_LMM1, M_LMM2, etc.) + d_pace - for progress display + +Summary: + +From the outside caller's perspective: None of the parameters passed to Analyze() are mutated. All inputs are const. 
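A hedged sketch of that calling pattern — read-only inputs, results accumulated only in a member container — with made-up names (`Analyzer`, `Stat`), not GEMMA's actual types:

```cpp
#include <cstdio>
#include <vector>

struct Stat { double beta; };

class Analyzer {
public:
  explicit Analyzer(std::vector<double> pheno) : y(std::move(pheno)) {}

  // The genotype is read-only; the only side effect is appending to `results`
  // (the same way LMM::Analyze only grows its sumStat member).
  void analyze_one(const std::vector<double> &x) {
    double xy = 0.0, xx = 0.0;
    for (size_t i = 0; i < x.size(); ++i) { xy += x[i] * y[i]; xx += x[i] * x[i]; }
    results.push_back(Stat{xx > 0 ? xy / xx : 0.0}); // naive marginal effect
  }

  std::vector<Stat> results; // plays the role of LMM::sumStat
private:
  std::vector<double> y;
};

int main() {
  Analyzer a({0.1, 0.4, 0.9});
  a.analyze_one({0.0, 1.0, 2.0});
  printf("beta = %f, results = %zu\n", a.results[0].beta, a.results.size());
  return 0;
}
```

In GEMMA the equivalent member is `sumStat`, which `WriteFiles()` later serializes to the `.assoc.txt` output.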
+ +The actual output is stored in the member variable sumStat, which accumulates one SUMSTAT structure per analyzed SNP containing: + + beta - effect size + se - standard error + lambda_remle, lambda_mle - variance component estimates + p_wald, p_lrt, p_score - p-values from different tests + logl_H1 - log-likelihood under alternative hypothesis +*/ void LMM::Analyze(std::function< SnpNameValues(size_t) >& fetch_snp, const gsl_matrix *U, const gsl_vector *eval, @@ -1478,12 +1677,9 @@ void LMM::Analyze(std::function< SnpNameValues(size_t) >& fetch_snp, const set<string> gwasnps) { clock_t time_start = clock(); - write(W, "W"); - write(y, "y"); + checkpoint_nofile("start-lmm-analyze"); // Subset/LOCO support bool process_gwasnps = gwasnps.size(); - if (process_gwasnps) - debug_msg("Analyze subset of SNPs (LOCO)"); // Calculate basic quantities. size_t n_index = (n_cvt + 2 + 1) * (n_cvt + 2) / 2; @@ -1510,24 +1706,36 @@ void LMM::Analyze(std::function< SnpNameValues(size_t) >& fetch_snp, // start reading genotypes and analyze size_t c = 0; + /* + batch_compute(l) takes l SNPs that have been loaded into Xlarge, + transforms them all at once using the eigenvector matrix U, then + loops through each transformed SNP to compute association + statistics (beta, standard errors, p-values) and stores results in + sumStat. + */ auto batch_compute = [&](size_t l) { // using a C++ closure // Compute SNPs in batch, note the computations are independent per SNP // debug_msg("enter batch_compute"); + checkpoint_nofile("\nstart-batch_compute"); gsl_matrix_view Xlarge_sub = gsl_matrix_submatrix(Xlarge, 0, 0, inds, l); gsl_matrix_view UtXlarge_sub = gsl_matrix_submatrix(UtXlarge, 0, 0, inds, l); time_start = clock(); + // Transforms all l SNPs in the batch at once: UtXlarge = U^T × Xlarge + // This is much faster than doing l separate matrix-vector products + // U is the eigenvector matrix from the spectral decomposition of the kinship matrix fast_dgemm("T", "N", 1.0, U, &Xlarge_sub.matrix, 0.0, &UtXlarge_sub.matrix); time_UtX += (clock() - time_start) / (double(CLOCKS_PER_SEC) * 60.0); gsl_matrix_set_zero(Xlarge); for (size_t i = 0; i < l; i++) { - // for every batch... + // for each snp batch item extract transformed genotype: gsl_vector_view UtXlarge_col = gsl_matrix_column(UtXlarge, i); gsl_vector_safe_memcpy(Utx, &UtXlarge_col.vector); + // Calculate design matrix components and compute sufficient statistics for the regression model CalcUab(UtW, Uty, Utx, Uab); time_start = clock(); @@ -1537,17 +1745,24 @@ void LMM::Analyze(std::function< SnpNameValues(size_t) >& fetch_snp, double p_lrt = 0.0, p_score = 0.0; double logl_H1 = 0.0; + // Run statistical tests based on analysis mode // 3 is before 1. 
if (a_mode == M_LMM3 || a_mode == M_LMM4 || a_mode == M_LMM9 ) { CalcRLScore(l_mle_null, param1, beta, se, p_score); } + // Computes Wald statistic for testing association if (a_mode == M_LMM1 || a_mode == M_LMM4) { // for univariate a_mode is 1 + // Estimates variance component (lambda) via REML CalcLambda('R', param1, l_min, l_max, n_region, lambda_remle, logl_H1); CalcRLWald(lambda_remle, param1, beta, se, p_wald); } + // Estimates variance component (lambda) via REML + // Likelihood Ratio Test (modes 2, 4, 9): + // Estimates variance component via MLE + // Compares log-likelihood under alternative vs null hypothesis if (a_mode == M_LMM2 || a_mode == M_LMM4 || a_mode == M_LMM9) { CalcLambda('L', param1, l_min, l_max, n_region, lambda_mle, logl_H1); p_lrt = gsl_cdf_chisq_Q(2.0 * (logl_H1 - logl_mle_H0), 1); @@ -1561,6 +1776,7 @@ void LMM::Analyze(std::function< SnpNameValues(size_t) >& fetch_snp, sumStat.push_back(SNPs); } // debug_msg("exit batch_compute"); + checkpoint_nofile("end-batch_compute"); }; const auto num_snps = indicator_snp.size(); @@ -1642,9 +1858,10 @@ void LMM::Analyze(std::function< SnpNameValues(size_t) >& fetch_snp, } batch_compute(c % msize); - ProgressBar("Reading SNPs", num_snps - 1, num_snps - 1); + ProgressBar("Reading SNPS", num_snps - 1, num_snps - 1); // cout << "Counted SNPs " << c << " sumStat " << sumStat.size() << endl; cout << endl; + checkpoint_nofile("end-lmm-analyze"); gsl_vector_safe_free(x); gsl_vector_safe_free(x_miss); @@ -1657,12 +1874,474 @@ void LMM::Analyze(std::function< SnpNameValues(size_t) >& fetch_snp, } +/* + This is the mirror function of LMM::Analyze and AnalyzeBimbam, but uses mdb input instead. + */ + + + +void LMM::mdb_analyze(std::function< SnpNameValues2(size_t) >& fetch_snp, + const gsl_matrix *U, const gsl_vector *eval, + const gsl_matrix *UtW, const gsl_vector *Uty, + const gsl_matrix *W, const gsl_vector *y, + size_t num_markers) { + vector<SUMSTAT2> sumstat2; + clock_t time_start = clock(); + + checkpoint_nofile("start-lmm-mdb-analyze"); + + // Calculate basic quantities. + size_t n_index = (n_cvt + 2 + 1) * (n_cvt + 2) / 2; + + const size_t inds = U->size1; + enforce(inds == ni_test); + assert(inds > 0); + + gsl_vector *x = gsl_vector_safe_alloc(inds); // #inds + gsl_vector *x_miss = gsl_vector_safe_alloc(inds); + assert(ni_test == U->size2); + assert(ni_test > 0); + assert(ni_total > 0); + assert(n_index > 0); + gsl_vector *Utx = gsl_vector_safe_alloc(ni_test); + gsl_matrix *Uab = gsl_matrix_safe_alloc(ni_test, n_index); + gsl_vector *ab = gsl_vector_safe_alloc(n_index); + + const size_t msize = LMM_BATCH_SIZE; + gsl_matrix *Xlarge = gsl_matrix_safe_alloc(inds, msize); + gsl_matrix *UtXlarge = gsl_matrix_safe_alloc(inds, msize); + enforce_msg(Xlarge && UtXlarge, "Xlarge memory check"); // just to be sure + enforce(Xlarge->size1 == inds); + gsl_matrix_set_zero(Xlarge); + gsl_matrix_set_zero(Uab); + CalcUab(UtW, Uty, Uab); + + // start reading genotypes and analyze + size_t c = 0; + + /* + batch_compute(l) takes l x SNPs that have been loaded into Xlarge, + transforms them all at once using the eigenvector matrix U, then + loops through each transformed SNP to compute association + statistics (beta, standard errors, p-values) and stores results in + sumStat. 
+ */ + auto batch_compute = [&](size_t l, const Markers &markers) { // using a C++ closure + // Compute SNPs in batch, note the computations are independent per SNP + debug_msg("enter batch_compute"); + assert(l>0); + assert(inds>0); + gsl_matrix_view Xlarge_sub = gsl_matrix_submatrix(Xlarge, 0, 0, inds, l); + gsl_matrix_view UtXlarge_sub = + gsl_matrix_submatrix(UtXlarge, 0, 0, inds, l); + + time_start = clock(); + // Transforms all l SNPs in the batch at once: UtXlarge = U^T × Xlarge + // This is much faster than doing l separate matrix-vector products + // U is the eigenvector matrix from the spectral decomposition of the kinship matrix + fast_dgemm("T", "N", 1.0, U, &Xlarge_sub.matrix, 0.0, + &UtXlarge_sub.matrix); + time_UtX += (clock() - time_start) / (double(CLOCKS_PER_SEC) * 60.0); + + gsl_matrix_set_zero(Xlarge); + for (size_t i = 0; i < l; i++) { + // for each snp batch item extract transformed genotype: + gsl_vector_view UtXlarge_col = gsl_matrix_column(UtXlarge, i); + gsl_vector_safe_memcpy(Utx, &UtXlarge_col.vector); + + // Calculate design matrix components and compute sufficient statistics for the regression model + CalcUab(UtW, Uty, Utx, Uab); + + time_start = clock(); + FUNC_PARAM param1 = {false, ni_test, n_cvt, eval, Uab, ab, 0}; + + double lambda_mle = 0.0, lambda_remle = 0.0, beta = 0.0, se = 0.0, p_wald = 0.0; + double p_lrt = 0.0, p_score = 0.0; + double logl_H1 = 0.0; + + // Run statistical tests based on analysis mode + // 3 is before 1. + if (a_mode == M_LMM3 || a_mode == M_LMM4 || a_mode == M_LMM9 ) { + CalcRLScore(l_mle_null, param1, beta, se, p_score); + } + + // Computes Wald statistic for testing association + if (a_mode == M_LMM1 || a_mode == M_LMM4) { + // for univariate a_mode is 1 + // Estimates variance component (lambda) via REML + CalcLambda('R', param1, l_min, l_max, n_region, lambda_remle, logl_H1); + CalcRLWald(lambda_remle, param1, beta, se, p_wald); + } + + // Estimates variance component (lambda) via REML + // Likelihood Ratio Test (modes 2, 4, 9): + // Estimates variance component via MLE + // Compares log-likelihood under alternative vs null hypothesis + if (a_mode == M_LMM2 || a_mode == M_LMM4 || a_mode == M_LMM9) { + CalcLambda('L', param1, l_min, l_max, n_region, lambda_mle, logl_H1); + p_lrt = gsl_cdf_chisq_Q(2.0 * (logl_H1 - logl_mle_H0), 1); + } + + time_opt += (clock() - time_start) / (double(CLOCKS_PER_SEC) * 60.0); + + auto markerinfo = markers[i]; + // Store summary data. + SUMSTAT2 st = {markerinfo, beta, se, lambda_remle, lambda_mle, p_wald, p_lrt, p_score, logl_H1}; + sumstat2.push_back(st); + } + }; + + /* + const auto num_markers = indicator_snp.size(); + enforce_msg(num_markers > 0,"Zero SNPs to process - data corrupt?"); + if (num_markers < 50) { + cerr << num_markers << " SNPs" << endl; + warning_msg("very few SNPs processed"); + } + */ + const size_t progress_step = (num_markers/50>d_pace ? 
num_markers/50 : d_pace); + Markers markers; + + assert(num_markers > 0); + for (size_t t = 0; t < num_markers; ++t) { + // for (size_t t = 0; t < 2; ++t) { + if (t % progress_step == 0 || t == (num_markers - 1)) { + ProgressBar("Reading markers", t, num_markers - 1); + } + // if (indicator_snp[t] == 0) + // continue; + + auto tup = fetch_snp(t); // use the callback + auto state = get<0>(tup); + if (state == SKIP) + continue; + if (state == LAST) + break; // marker loop because of LOCO + + auto markerinfo = get<1>(tup); + auto gs = get<2>(tup); + + markers.push_back(markerinfo); + + // drop missing idv and plug mean values for missing geno + double x_total = 0.0; // sum genotype values to compute x_mean + uint vpos = 0; // position in target vector + uint n_miss = 0; // count NA genotypes + gsl_vector_set_zero(x_miss); + for (size_t i = 0; i < ni_total; ++i) { + // get the genotypes per individual and compute stats per SNP + if (indicator_idv[i] == 0) // skip individual + continue; + + double geno = gs[i]; + if (isnan(geno)) { + gsl_vector_set(x_miss, vpos, 1.0); + n_miss++; + } else { + gsl_vector_set(x, vpos, geno); + x_total += geno; + } + vpos++; + } + enforce(vpos == ni_test); + + const double x_mean = x_total/(double)(ni_test - n_miss); + + // plug x_mean back into missing values + for (size_t i = 0; i < ni_test; ++i) { + if (gsl_vector_get(x_miss, i) == 1.0) { + gsl_vector_set(x, i, x_mean); + } + } + + enforce(x->size == ni_test); + + // copy genotype values for SNP into Xlarge cache + gsl_vector_view Xlarge_col = gsl_matrix_column(Xlarge, c % msize); + gsl_vector_safe_memcpy(&Xlarge_col.vector, x); + c++; // count markers going in + + if (c % msize == 0) { + batch_compute(msize,markers); + markers.clear(); + markers.reserve(msize); + } + } + + if (c % msize) + batch_compute(c % msize,markers); + ProgressBar("Reading markers", num_markers - 1, num_markers - 1); + cout << endl; + cout << "Counted markers " << c << " sumStat " << sumstat2.size() << endl; + checkpoint_nofile("end-lmm-mdb-analyze"); + + string file_str; + debug_msg("LMM::WriteFiles"); + file_str = path_out + "/" + file_out; + file_str += ".assoc.txt"; + checkpoint("lmm-write-files",file_str); + + ofstream outfile(file_str.c_str(), ofstream::out); + if (!outfile) { + cout << "error writing file: " << file_str << endl; + return; + } + + auto sumstats = [&] (SUMSTAT2 st) { + outfile << scientific << setprecision(6); + auto m = st.markerinfo; + auto name = m.name; + auto chr = m.chr; + string chr_s; + if (chr == CHR_X) + chr_s = "X"; + else if (chr == CHR_Y) + chr_s = "Y"; + else if (chr == CHR_M) + chr_s = "M"; + else + chr_s = to_string(chr); + + outfile << chr_s << "\t"; + outfile << name << "\t"; + outfile << m.pos << "\t"; + outfile << m.n_miss << "\t-\t-\t"; // n_miss column + allele1 + allele0 + outfile << fixed << setprecision(3) << m.maf << "\t"; + outfile << scientific << setprecision(6); + if (a_mode != M_LMM2) { + outfile << st.beta << "\t"; + outfile << st.se << "\t"; + } + + if (a_mode != M_LMM3 && a_mode != M_LMM9) + outfile << st.logl_H1 << "\t"; + + switch(a_mode) { + case M_LMM1: + outfile << st.lambda_remle << "\t" + << st.p_wald << endl; + break; + case M_LMM2: + case M_LMM9: + outfile << st.lambda_mle << "\t" + << st.p_lrt << endl; + break; + case M_LMM3: + outfile << st.p_score << endl; + break; + case M_LMM4: + outfile << st.lambda_remle << "\t" + << st.lambda_mle << "\t" + << st.p_wald << "\t" + << st.p_lrt << "\t" + << st.p_score << endl; + break; + } + }; + + for (auto &s : sumstat2) { + 
sumstats(s); + } + + gsl_vector_safe_free(x); + gsl_vector_safe_free(x_miss); + gsl_vector_safe_free(Utx); + gsl_matrix_safe_free(Uab); + gsl_vector_free(ab); // unused + + gsl_matrix_safe_free(Xlarge); + gsl_matrix_safe_free(UtXlarge); + +} + + +void LMM::mdb_calc_gwa(const gsl_matrix *U, const gsl_vector *eval, + const gsl_matrix *UtW, const gsl_vector *Uty, + const gsl_matrix *W, const gsl_vector *y, const string loco) { + checkpoint("mdb-calc-gwa",file_geno); + bool is_loco = !loco.empty(); + + // Convert loco string to what we use in the chrpos index + uint8_t loco_chr; + if (is_loco) { + if (loco == "X") { + loco_chr = CHR_X; + } else if (loco == "Y") { + loco_chr = CHR_Y; + } else if (loco == "M") { + loco_chr = CHR_M; + } else { + try { + loco_chr = static_cast<uint8_t>(stoi(loco)); + } catch (...) { + loco_chr = 0; + } + } + } + + auto env = lmdb::env::create(); + env.set_mapsize(1UL * 1024UL * 1024UL * 1024UL * 1024UL); + env.set_max_dbs(10); + env.open(file_geno.c_str(), MDB_RDONLY | MDB_NOSUBDIR, 0664); + // Get mmap info using lmdb++ wrapper + MDB_envinfo info; + mdb_env_info(env.handle(), &info); + // Linux kernel aggressive readahead hints + madvise(info.me_mapaddr, info.me_mapsize, MADV_SEQUENTIAL); + madvise(info.me_mapaddr, info.me_mapsize, MADV_WILLNEED); + + std::cout << "## LMDB opened with optimized readahead; map size = " << (info.me_mapsize / 1024 / 1024) << " MB" << std::endl; + + auto rtxn = lmdb::txn::begin(env, nullptr, MDB_RDONLY); + auto geno_mdb = lmdb::dbi::open(rtxn, "geno"); + + auto marker_mdb = lmdb::dbi::open(rtxn, "marker"); + auto info_mdb = lmdb::dbi::open(rtxn, "info"); + string_view info_key,info_value; + info_mdb.get(rtxn,"format",info_value); + auto format = string(info_value); + + MDB_stat stat; + mdb_stat(rtxn, geno_mdb, &stat); + auto num_markers = stat.ms_entries; + + auto mdb_fetch = MDB_FIRST; + + auto cursor = lmdb::cursor::open(rtxn, geno_mdb); + cout << "## number of total individuals = " << ni_total << endl; + cout << "## number of analyzed individuals = " << ni_test << endl; + cout << "## number of analyzed SNPs/var = " << num_markers << endl; + + std::function<SnpNameValues2(size_t)> fetch_snp = [&](size_t num) { + + string_view key,value; + + auto mdb_success = cursor.get(key, value, mdb_fetch); + mdb_fetch = MDB_NEXT; + + // uint8_t chr; + vector<double> gs; + MarkerInfo markerinfo; + + if (mdb_success) { + size_t size = 0; + // ---- Depending on the format we get different buffers - currently float and byte are supported: + if (format == "Gb") { + size_t num_bytes = value.size() / sizeof(uint8_t); + assert(num_bytes == ni_total); + size = num_bytes; + const uint8_t* gs_bbuf = reinterpret_cast<const uint8_t*>(value.data()); + gs.reserve(size); + for (size_t i = 0; i < size; ++i) { + double g = static_cast<double>(gs_bbuf[i])/127.0; + gs.push_back(g); + } + } else { + size_t num_floats = value.size() / sizeof(float); + assert(num_floats == ni_total); + size = num_floats; + const float* gs_fbuf = reinterpret_cast<const float*>(value.data()); + gs.reserve(size); + for (size_t i = 0; i < size; ++i) { + gs.push_back(static_cast<double>(gs_fbuf[i])); + } + } + // Start unpacking key chr+pos + if (key.size() != 10) { + cerr << "key.size=" << key.size() << endl; + throw std::runtime_error("Invalid key size"); + } + // "S>L>L>" + const uint8_t* data = reinterpret_cast<const uint8_t*>(key.data()); + auto chr = static_cast<uint8_t>(data[1]); + // Extract big-endian uint32 + // uint32_t rest = static_cast<uint32_t>(data[2]); + uint32_t pos = 
(data[2] << 24) | (data[3] << 16) | + (data[4] << 8) | data[5]; + + uint32_t num = (data[6] << 24) | (data[7] << 16) | + (data[8] << 8) | data[9]; + + // printf("%#02x %#02x\n", chr, loco_chr); + + if (is_loco && loco_chr != chr) { + if (chr > loco_chr) + return make_tuple(LAST, MarkerInfo { .name="", .chr=chr, .pos=pos } , gs); + else + return make_tuple(SKIP, MarkerInfo { .name="", .chr=chr, .pos=pos } , gs); + } + + string_view value2; + marker_mdb.get(rtxn,key,value2); + auto marker = string(value2); + // 1 rs13476251 174792257 + // cout << static_cast<int>(chr) << ":" << pos2 << " line " << rest2 << ":" << marker << endl ; + + // compute maf and n_miss (NAs) + size_t n_miss = 0; // count NAs: FIXME + double maf = compute_maf(ni_total, ni_test, n_miss, gs.data(), indicator_idv); + + markerinfo = MarkerInfo { .name=marker,.chr=chr,.pos=pos,.line_no=num,.n_miss=n_miss,.maf=maf }; + + // cout << "!!!!" << size << marker << ": af" << maf << " " << gs[0] << "," << gs[1] << "," << gs[2] << "," << gs[3] << endl; + } + return make_tuple(COMPUTE, markerinfo, gs); + }; + LMM::mdb_analyze(fetch_snp,U,eval,UtW,Uty,W,y,num_markers); + + ns_total = ns_test = num_markers; // track global number of snps in original gemma (goes to cPar) +} + + +/* + +Looking at the `LMM::AnalyzeBimbam` function, here are the parameters that get modified: + +## Modified (Mutated) Parameters: + +**None of the parameters are modified.** + +All parameters are passed as: +- `const gsl_matrix *` - pointer to const matrix (read-only) +- `const gsl_vector *` - pointer to const vector (read-only) +- `const set<string>` - const set passed by value (copy) + +## What Actually Gets Modified: + +The function modifies **internal state** and **local variables** only: + +1. **Local variables modified**: + - `prev_line` - tracks current line position in file + - `gs` - vector that gets resized and reassigned for each SNP + - `infile` - file stream that gets opened, read from, and closed + +2. **Member variables** (implied by `this->`): + - `file_geno` - used to open the file (read-only here) + - `ni_total` - number of individuals (read-only here) + +3. **The lambda function `fetch_snp`** captures local variables by reference (`[&]`) and modifies: + - `prev_line` - incremented as lines are read + - `gs` - reassigned with new genotype values for each SNP + - `infile` - advanced through the file + +## Summary: + +**From the outside caller's perspective**: Nothing passed into `AnalyzeBimbam` gets mutated. All inputs are treated as read-only (`const`). + +The function's work happens through: +- Reading from the genotype file +- Calling `LMM::Analyze(fetch_snp, ...)` with a callback that reads SNP data +- Any mutations likely happen inside the `LMM::Analyze` method (which would need to be examined separately to see what it modifies) + +*/ + void LMM::AnalyzeBimbam(const gsl_matrix *U, const gsl_vector *eval, const gsl_matrix *UtW, const gsl_vector *Uty, const gsl_matrix *W, const gsl_vector *y, const set<string> gwasnps) { debug_msg(file_geno); auto infilen = file_geno.c_str(); + checkpoint("start-read-geno-file",file_geno); igzstream infile(infilen, igzstream::in); enforce_msg(infile, "error reading genotype file"); @@ -1942,6 +2621,153 @@ void MatrixCalcLR(const gsl_matrix *U, const gsl_matrix *UtX, return; } +/* + +## What `CalcLambda` does: + +`CalcLambda` **estimates the variance component ratio (lambda)** in a linear mixed model by finding the value that maximizes either the log-likelihood or log-restricted likelihood function. 
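As a hedged, toy illustration of the bracketing strategy described under *Algorithm Overview* below — log-spaced intervals, a sign change of the first derivative marks an interior maximum, then the root is refined — using a made-up concave objective instead of GEMMA's `LogRL_*` functions, and plain bisection standing in for the Brent/Newton refinement:

```cpp
#include <cmath>
#include <cstdio>

// Toy concave objective in log(lambda); its derivative has a single root at lambda = e.
static double f(double l)  { return -(std::log(l) - 1.0) * (std::log(l) - 1.0); }
static double df(double l) { return -2.0 * (std::log(l) - 1.0) / l; }

int main() {
  const double l_min = 1e-5, l_max = 1e5;
  const int n_region = 10;
  const double step = std::log(l_max / l_min) / n_region;

  double best_l = l_min, best_f = f(l_min);
  for (int i = 0; i < n_region; ++i) {
    double lo = l_min * std::exp(step * i);
    double hi = l_min * std::exp(step * (i + 1));
    if (df(lo) * df(hi) > 0) continue;  // no sign change: no interior maximum here
    for (int it = 0; it < 60; ++it) {   // bisection stands in for Brent/Newton
      double mid = std::sqrt(lo * hi);  // bisect in log space
      if (df(lo) * df(mid) <= 0) hi = mid; else lo = mid;
    }
    if (f(lo) > best_f) { best_f = f(lo); best_l = lo; }
  }
  if (f(l_max) > best_f) best_l = l_max; // boundary check
  printf("lambda ~= %g (expected e ~= 2.71828)\n", best_l);
  return 0;
}
```

GEMMA's actual implementation uses `gsl_root_fsolver_brent` followed by `gsl_root_fdfsolver_newton` on the derivative, as described below.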
+
+### Purpose:
+
+In LMM, the variance structure is:
+- **Var(y) = λτ⁻¹K + τ⁻¹I**
+
+Where:
+- **λ (lambda)**: The ratio of genetic variance to environmental variance (Vg/Ve)
+- **τ⁻¹**: The residual (environmental) variance
+- **K**: Kinship/relatedness matrix
+- **I**: Identity matrix
+
+Lambda determines how much of the phenotypic variance is explained by genetic relatedness vs random environmental effects.
+
+### Function Signature:
+```cpp
+void CalcLambda(const char func_name,   // 'R' for REML, 'L' for MLE
+                FUNC_PARAM &params,     // Model parameters
+                const double l_min,     // Min lambda to search
+                const double l_max,     // Max lambda to search
+                const size_t n_region,  // # intervals to divide search
+                double &lambda,         // OUTPUT: estimated lambda
+                double &logf)           // OUTPUT: log-likelihood at lambda
+```
+
+### Algorithm Overview:
+
+#### **Phase 1: Identify Candidate Intervals**
+```cpp
+for (size_t i = 0; i < n_region; ++i) {
+  lambda_l = l_min * exp(lambda_interval * i);
+  lambda_h = l_min * exp(lambda_interval * (i + 1.0));
+
+  dev1_l = LogRL_dev1(lambda_l, &params);  // First derivative at lower bound
+  dev1_h = LogRL_dev1(lambda_h, &params);  // First derivative at upper bound
+
+  if (dev1_l * dev1_h <= 0) {
+    lambda_lh.push_back(make_pair(lambda_l, lambda_h));  // Sign change = root
+  }
+}
+```
+- Divides the search range [l_min, l_max] into `n_region` intervals (log-spaced)
+- Evaluates the **first derivative** at interval boundaries
+- If the derivative changes sign, there's a local maximum in that interval
+
+#### **Phase 2: Find Maximum**
+
+**Case A: No sign changes found**
+```cpp
+if (lambda_lh.empty()) {
+  // Maximum is at a boundary
+  if (logf_l >= logf_h) {
+    lambda = l_min;
+  } else {
+    lambda = l_max;
+  }
+}
+```
+
+**Case B: Sign changes found (interior maximum)**
+
+For each candidate interval:
+
+1. **Brent's Method** (robust root-finding):
+   ```cpp
+   gsl_root_fsolver_brent
+   ```
+   - Finds where the first derivative = 0 (critical point)
+   - Doesn't require computing second derivatives
+   - Robust but slower
+
+2. **Newton-Raphson Method** (refinement):
+   ```cpp
+   gsl_root_fdfsolver_newton
+   ```
+   - Uses both first and second derivatives
+   - Refines the estimate from Brent's method
+   - Faster convergence near the root
+
+3. **Compare across intervals**:
+   ```cpp
+   if (logf < logf_l) {
+     logf = logf_l;
+     lambda = l;
+   }
+   ```
+   - Keeps track of the lambda with the highest log-likelihood
+   - Handles multiple local maxima
+
+#### **Phase 3: Check Boundaries**
+```cpp
+if (logf_l > logf) {
+  lambda = l_min;
+}
+if (logf_h > logf) {
+  lambda = l_max;
+}
+```
+- Ensures the global maximum isn't at a boundary
+
+### Function Types:
+
+- **`func_name = 'R'` or `'r'`**: REML (Restricted Maximum Likelihood)
+  - Uses `LogRL_f`, `LogRL_dev1`, `LogRL_dev2`
+  - Accounts for loss of degrees of freedom from fixed effects
+  - Preferred for variance component estimation
+
+- **`func_name = 'L'` or `'l'`**: MLE (Maximum Likelihood)
+  - Uses `LogL_f`, `LogL_dev1`, `LogL_dev2`
+  - Direct likelihood maximization
+  - Used for likelihood ratio tests
+
+### Error Handling:
+
+```cpp
+if (status == GSL_CONTINUE || status != GSL_SUCCESS) {
+  logf = nan("NAN");
+  lambda = nan("NAN");
+  return;
+}
+```
+- If optimization fails, returns NaN values
+- Warnings issued for non-convergence
+
+### In Context (from `batch_compute`):
+
+```cpp
+// REML for Wald test
+CalcLambda('R', param1, l_min, l_max, n_region, lambda_remle, logl_H1);
+CalcRLWald(lambda_remle, param1, beta, se, p_wald);
+
+// MLE for LRT test
+CalcLambda('L', param1, l_min, l_max, n_region, lambda_mle, logl_H1);
+p_lrt = gsl_cdf_chisq_Q(2.0 * (logl_H1 - logl_mle_H0), 1);
+```
+
+### Summary:
+
+**`CalcLambda` finds the optimal variance component ratio (λ)** that maximizes the (restricted) log-likelihood for a given SNP's genotype data in a linear mixed model. This λ value quantifies the relative contribution of genetic relatedness vs environmental noise to phenotypic variation, and is essential for computing association test statistics (Wald, LRT) while properly accounting for population structure and relatedness.
+
+*/
+
+
 void CalcLambda(const char func_name, FUNC_PARAM &params, const double l_min,
                 const double l_max, const size_t n_region, double &lambda,
                 double &logf) {
@@ -2291,6 +3117,7 @@ void LMM::AnalyzeBimbamGXE(const gsl_matrix *U, const gsl_vector *eval,
     cout << "error reading genotype file:" << file_geno << endl;
     return;
   }
+  checkpoint("start-read-geno-file",file_geno);
 
   clock_t time_start = clock();
 
diff --git a/src/lmm.h b/src/lmm.h
index 736c679..295602a 100644
--- a/src/lmm.h
+++ b/src/lmm.h
@@ -2,7 +2,7 @@
     Genome-wide Efficient Mixed Model Association (GEMMA)
     Copyright © 2011-2017, Xiang Zhou
     Copyright © 2017, Peter Carbonetto
-    Copyright © 2017, Pjotr Prins
+    Copyright © 2017-2025, Pjotr Prins
 
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -30,7 +30,7 @@
 
 using namespace std;
 
-#define LMM_BATCH_SIZE 20000 // used for batch processing
+#define LMM_BATCH_SIZE 5000 // used for batch processing
 
 class FUNC_PARAM {
 
@@ -44,7 +44,42 @@ public:
   size_t e_mode;
 };
 
-typedef std::tuple<string,std::vector<double> > SnpNameValues;
+
+// typedef tuple< string, uint16_t, uint32_t, uint32_t > MarkerInfo; // name, chr, pos, line
+struct MarkerInfo {
+  string name;
+  uint8_t chr;
+  size_t pos, line_no, n_miss;
+  double maf;
+} ;
+
+typedef vector<MarkerInfo> Markers;
+typedef tuple< string,vector<double> > SnpNameValues;
+
+enum MarkerState {
+  FAIL,
+  COMPUTE,
+  SKIP,
+  LAST
+};
+
+typedef tuple< MarkerState,MarkerInfo,vector<double> > SnpNameValues2; // success, markerinfo (maf and n_miss are computed)
+// Results for LMM.
+class SUMSTAT2 { +public: + MarkerInfo markerinfo; + double beta; // REML estimator for beta. + double se; // SE for beta. + double lambda_remle; // REML estimator for lambda. + double lambda_mle; // MLE estimator for lambda. + double p_wald; // p value from a Wald test. + double p_lrt; // p value from a likelihood ratio test. + double p_score; // p value from a score test. + double logl_H1; // log likelihood under the alternative + // hypothesis as a measure of goodness of fit, + // see https://github.com/genetics-statistics/GEMMA/issues/81 +}; + class LMM { @@ -100,6 +135,14 @@ public: const gsl_matrix *UtW, const gsl_vector *Uty, const gsl_matrix *W, const gsl_vector *y, const set<string> gwasnps); + void mdb_analyze(std::function< SnpNameValues2(size_t) >& fetch_snp, + const gsl_matrix *U, const gsl_vector *eval, + const gsl_matrix *UtW, const gsl_vector *Uty, + const gsl_matrix *W, const gsl_vector *y, size_t num_markers); + void mdb_calc_gwa(const gsl_matrix *U, const gsl_vector *eval, + const gsl_matrix *UtW, const gsl_vector *Uty, + const gsl_matrix *W, const gsl_vector *y, + const string loco); void AnalyzeBimbam(const gsl_matrix *U, const gsl_vector *eval, const gsl_matrix *UtW, const gsl_vector *Uty, const gsl_matrix *W, const gsl_vector *y, diff --git a/src/main.cpp b/src/main.cpp index deadc63..ee6d597 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -22,6 +22,7 @@ #include <sstream> #include <sys/stat.h> #include <sys/types.h> +#include <guile_api.h> using namespace std; @@ -31,6 +32,9 @@ int main(int argc, char *argv[]) { gsl_set_error_handler (&gemma_gsl_error_handler); + gemmaguile::global_start_guile(); + gemmaguile::start_test(); + if (argc <= 1) { cGemma.PrintHeader(); cGemma.PrintHelp(0); diff --git a/src/mathfunc.cpp b/src/mathfunc.cpp index 68d17a7..c82872e 100644 --- a/src/mathfunc.cpp +++ b/src/mathfunc.cpp @@ -494,6 +494,20 @@ void StandardizeVector(gsl_vector *y) { } // Calculate UtX (U gets transposed) +/* +CalcUtX transforms a genotype matrix by the eigenvector matrix from the kinship matrix decomposition. +Purpose: + +Computes UtX = U^T × X, where: + + U: Eigenvector matrix from the spectral decomposition of the kinship matrix K + X: Genotype matrix (individuals × SNPs) + UtX: Transformed genotype matrix + +This transformation rotates the genotype data into the eigenspace where the kinship-induced correlations become diagonal, simplifying likelihood calculations. + +CalcUtX transforms a genotype matrix into the eigenspace by computing U^T × X, where U contains the eigenvectors from the kinship matrix decomposition. This transformation diagonalizes the correlation structure induced by relatedness, making subsequent likelihood calculations tractable and efficient. The function overwrites its input with the transformed result, using a temporary copy to avoid data corruption. 
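A minimal stand-alone sketch of the same rotation, using plain GSL BLAS rather than GEMMA's own wrappers (editor's illustration; `rotate_into_eigenspace` is a hypothetical helper, and unlike `CalcUtX` it writes to a separate output matrix instead of overwriting its input):

```cpp
// Editor's sketch of the rotation UtX = U^T * X with plain GSL BLAS.
#include <gsl/gsl_blas.h>
#include <gsl/gsl_matrix.h>

// Rotate X (n x p) into the eigenspace of K = U D U^T, where U is n x n.
void rotate_into_eigenspace(const gsl_matrix *U, const gsl_matrix *X,
                            gsl_matrix *UtX) {
  // UtX = 1.0 * U^T * X + 0.0 * UtX
  gsl_blas_dgemm(CblasTrans, CblasNoTrans, 1.0, U, X, 0.0, UtX);
}

int main() {
  const size_t n = 3, p = 2;
  gsl_matrix *U   = gsl_matrix_calloc(n, n);
  gsl_matrix *X   = gsl_matrix_calloc(n, p);
  gsl_matrix *UtX = gsl_matrix_calloc(n, p);
  gsl_matrix_set_identity(U);        // trivial eigenvectors for the demo
  gsl_matrix_set_all(X, 1.0);        // dummy genotypes
  rotate_into_eigenspace(U, X, UtX); // with U = I this simply copies X
  gsl_matrix_free(U);
  gsl_matrix_free(X);
  gsl_matrix_free(UtX);
  return 0;
}
```

In the real code `U` comes from the eigen-decomposition of the kinship matrix `K`, and `CalcUtX` below overwrites `UtX` in place after taking a temporary copy, as described above.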
+*/ void CalcUtX(const gsl_matrix *U, gsl_matrix *UtX) { gsl_matrix *X = gsl_matrix_safe_alloc(UtX->size1, UtX->size2); gsl_matrix_safe_memcpy(X, UtX); @@ -631,3 +645,36 @@ double UcharToDouble02(const unsigned char c) { return (double)c * 0.01; } unsigned char Double02ToUchar(const double dosage) { return (int)(dosage * 100); } + + +std::tuple<size_t, size_t, size_t> compute_ratio(size_t ni_total, const gsl_vector *gs) { + size_t n_0 = 0; + size_t n_1 = 0; + size_t n_2 = 0; + for (size_t i = 0; i < ni_total; ++i) { // read genotypes + double geno = gsl_vector_get(gs, i); + if (geno >= 0 && geno <= 0.5) { + n_0++; + } + if (geno > 0.5 && geno < 1.5) { + n_1++; + } + if (geno >= 1.5 && geno <= 2.0) { + n_2++; + } + } + return {n_0,n_1,n_2}; +} + +double compute_maf(size_t ni_total, size_t ni_test, size_t n_miss, const double *gs, const vector<int> &indicator) { + double maf = 0.0; + + for (size_t i = 0; i < ni_total; ++i) { // read genotypes + if (indicator[i]) { + double geno = gs[i]; + maf += geno; // 0..2 range + } + } + maf /= 2.0 * (double)(ni_test - n_miss); // Assumption is that geno value is between 0 and 2. FIXME + return maf; +} diff --git a/src/mathfunc.h b/src/mathfunc.h index 641d0a3..f7c9e6e 100644 --- a/src/mathfunc.h +++ b/src/mathfunc.h @@ -24,6 +24,7 @@ // #include "Eigen/Dense" #include "gsl/gsl_matrix.h" #include "gsl/gsl_vector.h" +#include <tuple> #define CONDITIONED_MAXRATIO 2e+6 // based on http://mathworld.wolfram.com/ConditionNumber.html #define EIGEN_MINVALUE 1e-10 @@ -77,4 +78,8 @@ unsigned char Double02ToUchar(const double dosage); // void uchar_matrix_get_row(const vector<vector<unsigned char>> &X, // const size_t i_row, Eigen::VectorXd &x_row); +std::tuple<size_t, size_t, size_t> compute_ratio(size_t ni_total, const gsl_vector *gs); +double compute_maf(size_t ni_total, size_t ni_test, size_t n_miss, const double *gs, const vector<int> &indicator); + + #endif diff --git a/src/mvlmm.cpp b/src/mvlmm.cpp index eb23900..1f64c3a 100644 --- a/src/mvlmm.cpp +++ b/src/mvlmm.cpp @@ -210,6 +210,31 @@ void MVLMM::WriteFiles() { } // Below are functions for EM algorithm. + +/* + src/mvlmm.cpp +> describe `EigenProc` + +The EigenProc function performs the following steps: + + 1 Eigen Decomposition of V_e: It computes the eigen decomposition of the matrix V_e to obtain the eigenvectors + and eigenvalues. This is used to transform V_e into a diagonal form. + 2 Calculate V_e_h and V_e_hi: It calculates the matrices V_e_h and V_e_hi, which are used to transform the + original matrices into a space where the eigenvalues are more manageable. V_e_h is a scaled version of the + eigenvectors, and V_e_hi is the inverse of V_e_h. + 3 Calculate Lambda: It computes the matrix Lambda as V_ehi * V_g * V_ehi, which is a transformed version of + V_g in the space defined by the eigenvectors of V_e. + 4 Eigen Decomposition of Lambda: It performs an eigen decomposition on Lambda to obtain the eigenvectors and + eigenvalues, which are used to further transform the matrices. + 5 Calculate UltVeh and UltVehi: It calculates UltVeh and UltVehi, which are used in subsequent calculations to + transform vectors and matrices into the space defined by the eigenvectors of Lambda. + 6 Return logdet_Ve: The function returns the logarithm of the determinant of V_e, which is used in likelihood + calculations. + +This function is crucial for transforming the problem into a space where the matrices are diagonal, simplifying +many calculations in the context of mixed models. 
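To summarize the six steps in matrix form (an editor's note: the names Q, Λ_e, U_ℓ are chosen here for the eigenvectors/eigenvalues of V_e and of Λ, and the exact definitions of `UltVeh` and `UltVehi` are inferred from their names and the description above, so treat this as a sketch rather than a transcript of the code; D_ℓ corresponds to the `D_l` vector in the signature):

```latex
V_e = Q \Lambda_e Q^\top , \qquad
V_e^{\pm 1/2} = Q \Lambda_e^{\pm 1/2} Q^\top , \qquad
\Lambda = V_e^{-1/2} V_g V_e^{-1/2} = U_\ell D_\ell U_\ell^\top ,

\text{UltVeh} = U_\ell^\top V_e^{1/2} , \qquad
\text{UltVehi} = U_\ell^\top V_e^{-1/2} , \qquad
\text{logdet\_Ve} = \log \det V_e .
```

With these definitions, UltVehi · V_g · UltVehiᵀ = D_ℓ and UltVehi · V_e · UltVehiᵀ = I, i.e. V_g and V_e are diagonalized simultaneously, which is the simplification the text above refers to.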
+ + */ double EigenProc(const gsl_matrix *V_g, const gsl_matrix *V_e, gsl_vector *D_l, gsl_matrix *UltVeh, gsl_matrix *UltVehi) { size_t d_size = V_g->size1; diff --git a/src/param.cpp b/src/param.cpp index f96e9c3..96a9cd2 100644 --- a/src/param.cpp +++ b/src/param.cpp @@ -2,7 +2,7 @@ Genome-wide Efficient Mixed Model Association (GEMMA) Copyright © 2011-2017, Xiang Zhou Copyright © 2017, Peter Carbonetto - Copyright © 2017, Pjotr Prins + Copyright © 2017-2025, Pjotr Prins This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -40,6 +40,7 @@ #include "mathfunc.h" #include "param.h" #include "fastblas.h" +#include "checkpoint.h" using namespace std; @@ -104,7 +105,9 @@ PARAM::PARAM(void) randseed(-1), window_cm(0), window_bp(0), window_ns(0), n_block(200), error(false), ni_subsample(0), n_cvt(1), n_cat(0), n_vc(1), time_total(0.0), time_G(0.0), time_eigen(0.0), time_UtX(0.0), - time_UtZ(0.0), time_opt(0.0), time_Omega(0.0) {} + time_UtZ(0.0), time_opt(0.0), time_Omega(0.0) { + is_mdb = false; + } PARAM::~PARAM() { if (gsl_r) @@ -113,6 +116,7 @@ PARAM::~PARAM() { // Read files: obtain ns_total, ng_total, ns_test, ni_test. void PARAM::ReadFiles(void) { + checkpoint_nofile("PARAM::ReadFiles"); string file_str; // Read cat file. @@ -145,22 +149,24 @@ void PARAM::ReadFiles(void) { } } - // Read SNP set. - if (!file_snps.empty()) { - if (ReadFile_snps(file_snps, setSnps) == false) { - error = true; + if (!is_mdb) { + // Read SNP set into setSnps (without filtering) + if (!file_snps.empty()) { + if (ReadFile_snps(file_snps, setSnps) == false) { + error = true; + } + } else { + setSnps.clear(); } - } else { - setSnps.clear(); - } - // Read KSNP set. - if (!file_ksnps.empty()) { - if (ReadFile_snps(file_ksnps, setKSnps) == false) { - error = true; + // Also read KSNP set. Reads all Snps later to be used by GRM. + if (!file_ksnps.empty()) { + if (ReadFile_snps(file_ksnps, setKSnps) == false) { + error = true; + } + } else { + setKSnps.clear(); } - } else { - setKSnps.clear(); } // For prediction. @@ -180,13 +186,15 @@ void PARAM::ReadFiles(void) { } } + if (!file_geno.empty()) { - if (ReadFile_pheno(file_pheno, indicator_pheno, pheno, p_column) == - false) { + // --- Read (multi-)column phenotypes into pheno and set the NA indicator + if (ReadFile_pheno(file_pheno, indicator_pheno, pheno, p_column) == false) { error = true; } - if (CountFileLines(file_geno, ns_total) == false) { + // --- compute ns_total by readin geno file + if (!is_mdb && is_check_mode() && CountFileLines(file_geno, ns_total) == false) { error = true; } } @@ -215,8 +223,6 @@ void PARAM::ReadFiles(void) { indicator_idv.push_back(k); } - ns_test = 0; - return; } @@ -273,7 +279,7 @@ void PARAM::ReadFiles(void) { // Post-process covariates and phenotypes, obtain // ni_test, save all useful covariates. - ProcessCvtPhen(); + ProcessInclusionIndicators(); // Obtain covariate matrix. auto W1 = gsl_matrix_safe_alloc(ni_test, n_cvt); @@ -290,7 +296,7 @@ void PARAM::ReadFiles(void) { } // Read genotype and phenotype file for BIMBAM format. - if (!file_geno.empty()) { + if (!is_mdb && !file_geno.empty()) { // Annotation file before genotype file. if (!file_anno.empty()) { @@ -299,14 +305,14 @@ void PARAM::ReadFiles(void) { } } - // Phenotype file before genotype file. + // Phenotype file before genotype file. Already done this(?!) 
if (ReadFile_pheno(file_pheno, indicator_pheno, pheno, p_column) == false) { error = true; } // Post-process covariates and phenotypes, obtain // ni_test, save all useful covariates. - ProcessCvtPhen(); + ProcessInclusionIndicators(); // Obtain covariate matrix. auto W2 = gsl_matrix_safe_alloc(ni_test, n_cvt); @@ -314,9 +320,10 @@ void PARAM::ReadFiles(void) { trim_individuals(indicator_idv, ni_max); trim_individuals(indicator_cvt, ni_max); - if (ReadFile_geno(file_geno, setSnps, W2, indicator_idv, indicator_snp, - maf_level, miss_level, hwe_level, r2_level, mapRS2chr, - mapRS2bp, mapRS2cM, snpInfo, ns_test) == false) { + // The following reads the geno file to get the SNPs - only for BIMBAM + if (ReadFile_bimbam_geno(file_geno, setSnps, W2, indicator_idv, indicator_snp, + maf_level, miss_level, hwe_level, r2_level, mapRS2chr, + mapRS2bp, mapRS2cM, snpInfo, ns_test) == false) { error = true; return; } @@ -324,6 +331,17 @@ void PARAM::ReadFiles(void) { ns_total = indicator_snp.size(); } + // lmdb-type genotype file: + if (is_mdb && !file_geno.empty()) { + if (!file_kin.empty()) { // GWA + // Phenotype file before genotype file. Already done this(?!) + if (ReadFile_pheno(file_pheno, indicator_pheno, pheno, p_column) == false) { + error = true; + } + ProcessInclusionIndicators(); + ns_total = indicator_snp.size(); + } + } // Read genotype file for multiple PLINK files. if (!file_mbfile.empty()) { @@ -360,7 +378,7 @@ void PARAM::ReadFiles(void) { // Post-process covariates and phenotypes, obtain // ni_test, save all useful covariates. - ProcessCvtPhen(); + ProcessInclusionIndicators(); // Obtain covariate matrix. W3 = gsl_matrix_safe_alloc(ni_test, n_cvt); @@ -387,8 +405,8 @@ void PARAM::ReadFiles(void) { infile.clear(); } - // Read genotype and phenotype file for multiple BIMBAM files. - if (!file_mgeno.empty()) { + // Read genotype and phenotype file for multiple BIMBAM files. Note this goes untested. + if (!is_mdb && !file_mgeno.empty()) { // Annotation file before genotype file. if (!file_anno.empty()) { @@ -404,7 +422,7 @@ void PARAM::ReadFiles(void) { // Post-process covariates and phenotypes, obtain ni_test, // save all useful covariates. - ProcessCvtPhen(); + ProcessInclusionIndicators(); // Obtain covariate matrix. gsl_matrix *W4 = gsl_matrix_safe_alloc(ni_test, n_cvt); @@ -420,7 +438,7 @@ void PARAM::ReadFiles(void) { string file_name; size_t ns_test_tmp; while (!safeGetline(infile, file_name).eof()) { - if (ReadFile_geno(file_name, setSnps, W4, indicator_idv, indicator_snp, + if (ReadFile_bimbam_geno(file_name, setSnps, W4, indicator_idv, indicator_snp, maf_level, miss_level, hwe_level, r2_level, mapRS2chr, mapRS2bp, mapRS2cM, snpInfo, ns_test_tmp) == false) { error = true; @@ -457,7 +475,7 @@ void PARAM::ReadFiles(void) { // Post-process covariates and phenotypes, obtain // ni_test, save all useful covariates. - ProcessCvtPhen(); + ProcessInclusionIndicators(); // Obtain covariate matrix. // gsl_matrix *W5 = gsl_matrix_alloc(ni_test, n_cvt); @@ -480,7 +498,8 @@ void PARAM::ReadFiles(void) { ni_test += indicator_idv[i]; } - enforce_msg(ni_test,"number of analyzed individuals equals 0."); + if (!is_mdb) + enforce_msg(ni_test,"number of analyzed individuals equals 0."); } // For ridge prediction, read phenotype only. @@ -491,7 +510,7 @@ void PARAM::ReadFiles(void) { // Post-process covariates and phenotypes, obtain // ni_test, save all useful covariates. 
- ProcessCvtPhen(); + ProcessInclusionIndicators(); } // Compute setKSnps when -loco is passed in @@ -920,13 +939,18 @@ void PARAM::CheckParam(void) { enforce_fexists(file_gwasnps, "open file"); enforce_fexists(file_anno, "open file"); + if (file_geno.find(".mdb") != string::npos) { + is_mdb = true; + } + if (!loco.empty()) { enforce_msg((a_mode >= 1 && a_mode <= 4) || a_mode == 9 || a_mode == 21 || a_mode == 22, "LOCO only works with LMM and K"); // enforce_msg(file_bfile.empty(), "LOCO does not work with PLink (yet)"); enforce_msg(file_gxe.empty(), "LOCO does not support GXE (yet)"); - enforce_msg(!file_anno.empty(), - "LOCO requires annotation file (-a switch)"); + if (!is_mdb) + enforce_msg(!file_anno.empty(), "Without mdb LOCO requires annotation file (-a switch)"); + enforce_msg(file_ksnps.empty(), "LOCO does not allow -ksnps switch"); enforce_msg(file_gwasnps.empty(), "LOCO does not allow -gwasnps switch"); } @@ -1078,13 +1102,15 @@ void PARAM::CheckData(void) { } } - enforce_msg(ni_test,"number of analyzed individuals equals 0."); + if (!is_mdb) { + enforce_msg(ni_test,"number of analyzed individuals equals 0."); - if (ni_test == 0 && file_cor.empty() && file_mstudy.empty() && - file_study.empty() && file_beta.empty() && file_bf.empty()) { - error = true; - cout << "error! number of analyzed individuals equals 0. " << endl; - return; + if (ni_test == 0 && file_cor.empty() && file_mstudy.empty() && + file_study.empty() && file_beta.empty() && file_bf.empty()) { + error = true; + cout << "error! number of analyzed individuals equals 0. " << endl; + return; + } } if (a_mode == 43) { @@ -1103,7 +1129,7 @@ void PARAM::CheckData(void) { } // Output some information. - if (file_cor.empty() && file_mstudy.empty() && file_study.empty() && + if (!is_mdb && file_cor.empty() && file_mstudy.empty() && file_study.empty() && a_mode != 15 && a_mode != 27 && a_mode != 28) { cout << "## number of total individuals = " << ni_total << endl; if (a_mode == 43) { @@ -1251,14 +1277,14 @@ void PARAM::CheckData(void) { void PARAM::PrintSummary() { if (n_ph == 1) { - cout << "pve estimate =" << pve_null << endl; - cout << "se(pve) =" << pve_se_null << endl; + cout << "## pve estimate = " << pve_null << endl; + cout << "## se(pve) = " << pve_se_null << endl; } else { } return; } -void PARAM::ReadGenotypes(gsl_matrix *UtX, gsl_matrix *K, const bool calc_K) { +void PARAM::ReadBIMBAMGenotypes(gsl_matrix *UtX, gsl_matrix *K, const bool calc_K) { string file_str; if (!file_bfile.empty()) { @@ -1268,8 +1294,7 @@ void PARAM::ReadGenotypes(gsl_matrix *UtX, gsl_matrix *K, const bool calc_K) { error = true; } } else { - if (ReadFile_geno(file_geno, indicator_idv, indicator_snp, UtX, K, - calc_K) == false) { + if (ReadFile_geno(file_geno, indicator_idv, indicator_snp, UtX, K, calc_K) == false) { error = true; } } @@ -1277,44 +1302,25 @@ void PARAM::ReadGenotypes(gsl_matrix *UtX, gsl_matrix *K, const bool calc_K) { return; } -void PARAM::ReadGenotypes(vector<vector<unsigned char>> &Xt, gsl_matrix *K, - const bool calc_K) { - string file_str; - - if (!file_bfile.empty()) { - file_str = file_bfile + ".bed"; - if (ReadFile_bed(file_str, indicator_idv, indicator_snp, Xt, K, calc_K, - ni_test, ns_test) == false) { - error = true; - } - } else { - if (ReadFile_geno(file_geno, indicator_idv, indicator_snp, Xt, K, calc_K, - ni_test, ns_test) == false) { - error = true; - } - } - - return; +gsl_matrix *PARAM::MdbCalcKin() { + return mdb_calc_kin(file_geno, a_mode - 20, loco); } void PARAM::CalcKin(gsl_matrix *matrix_kin) { + 
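  // Editor's note (not part of the patch): CalcKin dispatches on the input
  // type -- PLINK .bed files go through PlinkKin, anything else (BIMBAM text
  // genotypes) through BimbamKin. a_mode - 20 selects the kinship flavour
  // (-gk 1 centered, -gk 2 standardized), d_pace controls progress display,
  // and setKSnps restricts the SNPs used for K (e.g. with -loco).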
checkpoint_nofile("PARAM::CalcKin");
   string file_str;
 
-  gsl_matrix_set_zero(matrix_kin);
 
   if (!file_bfile.empty()) {
     file_str = file_bfile + ".bed";
-    // enforce_msg(loco.empty(), "FIXME: LOCO nyi");
     if (PlinkKin(file_str, indicator_snp, a_mode - 20, d_pace, matrix_kin) ==
         false) {
       error = true;
     }
   } else {
-    file_str = file_geno;
-    if (BimbamKin(file_str, setKSnps, indicator_snp, a_mode - 20, d_pace,
-                  matrix_kin, ni_max == 0) == false) {
-      error = true;
-    }
+    // indicator_snp is an array of bool
+    error = !BimbamKin(file_geno, setKSnps, indicator_snp, a_mode - 20, d_pace,
+                       matrix_kin, ni_max == 0);
   }
 
   return;
@@ -1990,7 +1996,7 @@ void PARAM::CheckCvt() {
 }
 
 // Post-process phenotypes and covariates.
-void PARAM::ProcessCvtPhen() {
+void PARAM::ProcessInclusionIndicators() {
 
   // Convert indicator_pheno to indicator_idv.
   int k = 1;
@@ -2028,11 +2034,8 @@ void PARAM::ProcessCvtPhen() {
 
   // Obtain ni_test.
   ni_test = 0;
-  for (vector<int>::size_type i = 0; i < (indicator_idv).size(); ++i) {
-    if (indicator_idv[i] == 0) {
-      continue;
-    }
-    ni_test++;
+  for (auto i : indicator_idv) {
+    ni_test += i;
   }
 
   // If subsample number is set, perform a random sub-sampling
@@ -2071,13 +2074,14 @@
 }
 
   // Check ni_test.
-  if (a_mode != M_BSLMM5)
-    enforce_msg(ni_test,"number of analyzed individuals equals 0.");
-  if (ni_test == 0 && a_mode != M_BSLMM5) {
-    error = true;
-    cout << "error! number of analyzed individuals equals 0. " << endl;
+  if (!is_mdb) {
+    if (a_mode != M_BSLMM5)
+      enforce_msg(ni_test,"number of analyzed individuals equals 0.");
+    if (ni_test == 0 && a_mode != M_BSLMM5) {
+      error = true;
+      cout << "error! number of analyzed individuals equals 0. " << endl;
+    }
   }
-
   // Check covariates to see if they are correlated with each
   // other, and to see if the intercept term is included.
   // After getting ni_test.
diff --git a/src/param.h b/src/param.h
index d3ce686..046f543 100644
--- a/src/param.h
+++ b/src/param.h
@@ -160,6 +160,8 @@ public:
   string file_ksnps;   // File SNPs for computing K
   string file_gwasnps; // File SNPs for computing GWAS
 
+  bool is_mdb;
+
   // QC-related parameters.
  double miss_level;
   double maf_level;
@@ -336,17 +338,16 @@ public:
   void CheckParam();
   void CheckData();
   void PrintSummary();
-  void ReadGenotypes(gsl_matrix *UtX, gsl_matrix *K, const bool calc_K);
-  void ReadGenotypes(vector<vector<unsigned char>> &Xt, gsl_matrix *K,
-                     const bool calc_K);
+  void ReadBIMBAMGenotypes(gsl_matrix *UtX, gsl_matrix *K, const bool calc_K);
   void CheckCvt();
   void CopyCvt(gsl_matrix *W);
   void CopyA(size_t flag, gsl_matrix *A);
   void CopyGxe(gsl_vector *gxe);
   void CopyWeight(gsl_vector *w);
-  void ProcessCvtPhen();
+  void ProcessInclusionIndicators();
   void CopyCvtPhen(gsl_matrix *W, gsl_vector *y, size_t flag);
   void CopyCvtPhen(gsl_matrix *W, gsl_matrix *Y, size_t flag);
+  gsl_matrix *MdbCalcKin();
   void CalcKin(gsl_matrix *matrix_kin);
   void CalcS(const map<string, double> &mapRS2wA,
              const map<string, double> &mapRS2wK, const gsl_matrix *W,
@@ -368,6 +369,9 @@ public:
              const map<string, double> &mapRS2z, gsl_vector *w, gsl_vector *z,
              vector<size_t> &vec_cat);
   void UpdateSNP(const map<string, double> &mapRS2wA);
+
+  inline bool is_loco() { return !loco.empty(); }
+
 };
 
 size_t GetabIndex(const size_t a, const size_t b, const size_t n_cvt);
diff --git a/src/vc.cpp b/src/vc.cpp
index 22aaea9..970c556 100644
--- a/src/vc.cpp
+++ b/src/vc.cpp
@@ -1021,7 +1021,7 @@ void ReadFile_cor(const string &file_cor, const vector<string> &vec_rs,
   string rs1, rs2;
   double d1, d2, d3, cor, var1, var2;
-  size_t n_nb, nsamp1, nsamp2, n12, bin_size = 10, bin;
+  size_t n_nb, nsamp1, nsamp2, n12, bin_size = 10, bin = 0;
   vector<vector<double>> mat_S, mat_Svar, mat_tmp;
   vector<double> vec_qvar, vec_tmp;
 
diff --git a/src/version.h b/src/version.h
index 3382003..02d4c09 100644
--- a/src/version.h
+++ b/src/version.h
@@ -1,5 +1,4 @@
 // version.h generated by GEMMA scripts/gen_version_info.sh
-#define GEMMA_VERSION "0.0.1"
-#define GEMMA_DATE "2025-01-04"
+#define GEMMA_VERSION "1.0.0"
+#define GEMMA_DATE "2025-11-22"
 #define GEMMA_YEAR "2025"
-#define GEMMA_PROFILE "/gnu/store/ln160n2kzn791jwgv36yrxlxygjwl9hh-profile"
diff --git a/test/performance/releases.org b/test/performance/releases.org
index b208e54..b9c451d 100644
--- a/test/performance/releases.org
+++ b/test/performance/releases.org
@@ -1,5 +1,108 @@
 * GEMMA performance stats
 
+** GEMMA 1.00-pre1
+
+
+Measurements taken on a recent AMD Ryzen 7 3700X 8-Core Processor @2.195GHz.
+
+Introducing the mdb genotype format led to a 30% speed increase on the small mouse set:
+
+#+begin_src sh
+real 0m6.403s
+user 0m11.529s
+sys 0m6.325s
+#+end_src
+
+That may not look like much, but we are only starting!
+
+** Picking up the pieces
+
+We are facing a time regression.
+
+#+begin_src sh
+premake5 gmake && make verbose=1 config=release -j 8 gemma && time LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib ./build/bin/Release/gemma -g ./example/mouse_hs1940.geno.txt.mdb -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm -no-check
+#+end_src
+
+With openblas 0.3.21 we go a bit faster. Still 10% behind though; there is room for tweaking. It may also come down to a new SSD. I want to run some bigger files first.
+
+#+begin_src sh
+Pangemma --- GEMMA 0.98.5 compatible executable 1.0.0 (2025-11-22) with guile 3.0.9 by Xiang Zhou, Pjotr Prins and team (C) 2012-2025
+Reading Files ...
+## number of total individuals = 1940
+## number of analyzed individuals = 1410
+## number of covariates = 1
+## number of phenotypes = 1
+## number of total SNPs/var = 12226
+## number of analyzed SNPs = 10768
+Start Eigen-Decomposition...
+pve estimate =0.608801
+se(pve) =0.032774
+================================================== 100%
+real 0m9.017s
+user 0m13.168s
+sys 0m5.919s
+#+end_src
+
+Before it was
+
+#+begin_src sh
+Pangemma --- GEMMA 0.98.5 compatible executable 1.0.0 (2025-11-22) with guile 3.0.9 by Xiang Zhou, Pjotr Prins and team (C) 2012-2025
+Reading Files ...
+## number of total individuals = 1940
+## number of analyzed individuals = 1410
+## number of covariates = 1
+## number of phenotypes = 1
+## number of total SNPs/var = 12226
+## number of analyzed SNPs = 10768
+Start Eigen-Decomposition...
+pve estimate =0.608801
+se(pve) =0.032774
+================================================== 100%
+real 0m16.772s
+user 0m25.443s
+sys 0m0.901s
+#+end_src
+
+The output looks the same. Good. So far the first difference is a much later openblas (0.3.30 over 0.3.9). In the source code we added checkpoints and more debugging, particularly write statements. I disabled the latter, but still no dice.
+
+When gemma is compiled with the profiler library and the run is prefixed with CPUPROFILE we get:
+
+#+begin_src sh
+premake5 gmake && make verbose=1 config=debug -j 8 gemma && time CPUPROFILE=gemma.prof LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib ./build/bin/Debug/gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt -k ./output/result.cXX.txt -lmm -no-check -debug
+CPUPROFILE=gemma.prof
+pprof --text build/bin/Debug/gemma gemma.prof
+
+    1007  49.2%  49.2%     1015  49.6% dot_compute
+      94   4.6%  53.8%       94   4.6% rpcc
+      74   3.6%  57.5%       74   3.6% gsl_vector_div
+      62   3.0%  60.5%       92   4.5% ____strtod_l_internal
+      42   2.1%  62.5%       42   2.1% dgemm_kernel_ZEN
+#+end_src
+
+This led me to try the newer openblas on the older gemma - and indeed, the regression comes from the openblas version. Even though it says 'OpenBLAS 0.3.30 DYNAMIC_ARCH NO_AFFINITY Zen MAX_THREADS=128', I suspect the dynamic arch is not really optimizing.
+
+Well, at least I found the problem. Time for a special openblas build like I used to do.
+
+
+*** Bigger run
+
+We translate this 10GB (gzip compressed) job from our pangenome precompute
+
+#+begin_src sh
+/bin/gemma -loco 3 -k /export2/data/wrk/services/gemma-wrapper/tmp/tmp/panlmm/93f6b39ec06c09fb9ba9ca628b5fb990921b6c60.3.cXX.txt.cXX.txt -o a3248cec40b3fe6b9e8672352b3ab2d7280c426c.3.assoc.txt -p pheno.json.txt -g pangenome-13M-genotypes.txt -a snps-matched.txt -lmm 9 -maf 0.1 -n 2 -outdir /export2/data/wrk/services/gemma-wrapper/tmp/tmp/panlmm/d20251126-4190721-c8bbo8
+#+end_src
+
+to
+
+#+begin_src sh
+time LD_LIBRARY_PATH=$GUIX_ENVIRONMENT/lib ./build/bin/Release/gemma -g tmp/pangenome-13M-genotypes.txt -p tmp/pheno.json.txt -n 1 -a tmp/snps-matched.txt -k tmp/93f6b39ec06c09fb9ba9ca628b5fb990921b6c60.3.cXX.txt.cXX.txt -lmm 9 -no-check
+real 20m4.687s
+user 23m42.508s
+sys 9m51.929s
+#+end_src
+
+On my AMD Ryzen 7 3700X it uses about 10GB of RAM. With the -debug switch it clapped out because of sqrt(NaN). There is a lot that can be gained with better IO and multi-core use.
+
 
 ** GEMMA 0.98.5-pre1
 
 Measurements taken on a recent AMD Ryzen 7 3700X 8-Core Processor @2.195GHz.
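A side note from the editor on the profiling session above: instead of sampling the whole run via the CPUPROFILE environment variable, gperftools can be scoped to one region programmatically. This is a sketch, not part of the patch; `run_lmm_analysis` is a hypothetical stand-in for the GWA loop, and the binary must be linked with -lprofiler.

```cpp
// Editor's sketch: profile only one region with gperftools' C API.
#include <cstdio>
#include <gperftools/profiler.h>

// Hypothetical stand-in for the hot LMM loop (dot_compute and friends).
static void run_lmm_analysis() {
  double sum = 0.0;
  for (long i = 1; i < 50000000; ++i) sum += 1.0 / (double)i;  // busy work
  std::printf("busy work result: %f\n", sum);
}

int main() {
  ProfilerStart("gemma-lmm.prof");  // samples are written to this file
  run_lmm_analysis();
  ProfilerStop();
  return 0;
}
```

The resulting profile is inspected the same way as above, e.g. `pprof --text ./gemma gemma-lmm.prof`.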
diff --git a/test/runner b/test/runner new file mode 100755 index 0000000..5002d80 --- /dev/null +++ b/test/runner @@ -0,0 +1,18 @@ +#!/bin/sh +# -*- mode: scheme; -*- +exec guile --debug -s "$0" "$@" +!# + +(define-module (test-runner) + #:use-module (ice-9 match) + #:use-module (srfi srfi-1) ; for last + #:use-module (srfi srfi-13) + #:use-module (srfi srfi-64) ; for tests + #:use-module (ice-9 rdelim) + ) + +(test-begin "all-tests") + +(load "test-uvlmm-integration.scm") + +(test-end "all-tests") diff --git a/test/test-mdb-integration.scm b/test/test-mdb-integration.scm new file mode 100755 index 0000000..006c241 --- /dev/null +++ b/test/test-mdb-integration.scm @@ -0,0 +1,51 @@ +#!/bin/sh +# -*- mode: scheme; -*- +exec guile --debug -s "$0" "$@" +!# + +(define-module (test-runner) + #:use-module (ice-9 match) + #:use-module (srfi srfi-1) ; for last + #:use-module (srfi srfi-13) + #:use-module (srfi srfi-64) ; for tests + #:use-module (ice-9 rdelim) + ) + +(define kinship-fn "./output/mouse_hs1940.cXX.txt") +(define gwa-fn "./output/mouse_hs1940.assoc.txt") + +(test-begin "uvlmm-mdb-kinship-run") + +(when (file-exists? kinship-fn) + (delete-file kinship-fn)) +(let [(err (system "./build/bin/Debug/gemma -g ./example/mouse_hs1940.geno.mdb -p ./example/mouse_hs1940.pheno.txt -gk -o mouse_hs1940 -debug"))] + (test-eqv 0 err)) + +(test-end "uvlmm-mdb-kinship-run") + +(test-begin "uvlmm-mdb-gwa-run") + +(when (file-exists? gwa-fn) + (delete-file gwa-fn)) +;; The following integration test runs gemma uvlmm and adds up the output column as a check. +;; It uses the kinship-run matrix from the earlier test +(let [(err (system (string-append "./build/bin/Debug/gemma -g ./example/mouse_hs1940.geno.mdb -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt -k " kinship-fn " -o mouse_hs1940 -lmm 9 -debug")))] + (test-eqv 0 err)) +(call-with-input-file gwa-fn + (lambda (port) + (read-line port) ; skip first line + (let* ((fields (string-split (read-line port) #\tab)) + (last-field (last fields))) + (test-eqv 208.0 (truncate (* 1000 (string->number last-field))))) + (test-eqv 5720672.0 + (let loop ((line (read-line port)) + (sum 208.0)) + (if (eof-object? line) + sum + (let* ((fields (string-split line #\tab)) + (last-field (last fields)) + (value (string->number last-field))) + (loop (read-line port) + (+ sum (truncate (* 1000 value)))))))))) + +(test-end "uvlmm-mdb-gwa-run") diff --git a/test/test-uvlmm-integration.scm b/test/test-uvlmm-integration.scm new file mode 100755 index 0000000..91eb14a --- /dev/null +++ b/test/test-uvlmm-integration.scm @@ -0,0 +1,52 @@ +#!/bin/sh +# -*- mode: scheme; -*- +exec guile --debug -s "$0" "$@" +!# + +(define-module (test-runner) + #:use-module (ice-9 match) + #:use-module (srfi srfi-1) ; for last + #:use-module (srfi srfi-13) + #:use-module (srfi srfi-64) ; for tests + #:use-module (ice-9 rdelim) + ) + +(define kinship-fn "./output/mouse_hs1940.cXX.txt") +(define gwa-fn "./output/mouse_hs1940.assoc.txt") + +(test-begin "uvlmm-bimbam-kinship-run") + +(when (file-exists? kinship-fn) + (delete-file kinship-fn)) +(let [(err (system "./build/bin/Debug/gemma -g ./example/mouse_hs1940.geno.txt.gz -gk -p ./example/mouse_hs1940.pheno.txt -o mouse_hs1940 -debug"))] + (test-eqv 0 err)) + +(test-end "uvlmm-bimbam-kinship-run") + + +(test-begin "uvlmm-bimbam-gwa-run") + +(when (file-exists? gwa-fn) + (delete-file gwa-fn)) +;; The following integration test runs gemma uvlmm and adds up the output column as a check. 
+;; It uses the kinship-run matrix from the earlier test +(let [(err (system (string-append "./build/bin/Debug/gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt -k " kinship-fn " -o mouse_hs1940 -lmm 9 -debug")))] + (test-eqv 0 err)) +(call-with-input-file gwa-fn + (lambda (port) + (read-line port) ; skip first line + (let* ((fields (string-split (read-line port) #\tab)) + (last-field (last fields))) + (test-eqv 208.0 (truncate (* 1000 (string->number last-field))))) + (test-eqv 5720672.0 + (let loop ((line (read-line port)) + (sum 208.0)) + (if (eof-object? line) + sum + (let* ((fields (string-split line #\tab)) + (last-field (last fields)) + (value (string->number last-field))) + (loop (read-line port) + (+ sum (truncate (* 1000 value)))))))))) + +(test-end "uvlmm-bimbam-gwa-run") |
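For readers who do not speak Guile: the checksum both integration tests compute is simple. Skip the header of the `.assoc.txt` output, take the last tab-separated field of every row, and accumulate trunc(1000 * value). An editor's sketch of the same check in C++ (file name and usage are illustrative only, not part of the patch):

```cpp
// Editor's sketch of the column checksum used by the Guile integration tests.
#include <cmath>
#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::ifstream in("output/mouse_hs1940.assoc.txt");
  if (!in) { std::cerr << "cannot open assoc file\n"; return 1; }

  std::string line;
  std::getline(in, line);                    // skip the header row
  double sum = 0.0;
  while (std::getline(in, line)) {
    auto pos = line.find_last_of('\t');
    std::string last = (pos == std::string::npos) ? line : line.substr(pos + 1);
    sum += std::trunc(1000.0 * std::stod(last));  // last column, scaled and truncated
  }
  std::cout << "checksum = " << sum << std::endl;  // compare against a stored value
  return 0;
}
```

A stable total such as the 5720672.0 used above then acts as a cheap regression check on the whole association column.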
