author Pjotr Prins 2024-04-27 19:49:56 -0500
committer Pjotr Prins 2024-04-27 19:50:10 -0500
commit e885823a02c863499f4e4a38011b78122ddff3d9
tree 9ca9b621512da1842ca651087cf31a12d15a4b20 /topics/database
parent 3cf3dec2bb8b1cf0a6d3c08d5da8b5ed462c486b
download gn-gemtext-e885823a02c863499f4e4a38011b78122ddff3d9.tar.gz
Genotypes: row order
Diffstat (limited to 'topics/database')
-rw-r--r-- topics/database/genotype-database.gmi 12
1 files changed, 12 insertions, 0 deletions
diff --git a/topics/database/genotype-database.gmi b/topics/database/genotype-database.gmi
index 7b8eefc..df371fe 100644
--- a/topics/database/genotype-database.gmi
+++ b/topics/database/genotype-database.gmi
@@ -35,6 +35,18 @@ To learn the basics of functional databases, see the following talks:
Being a functional database, genodb can store multiple versions of the genotype matrix. These versions are stored efficiently on disk optimizing for disk usage. Two additional copies of the most recent version of the matrix are stored in read-optimized form for fast retrieval.
+### Row order
+
+> Probably caused by a mismatch between cases in GN database and the
+> genotype file. It would be great to have code that would automatically
+> update geno files when we add a sample of known type.
+
+Currently we are juggling 4 genotype formats and soon a 5th. I agree
+that dynamic genotypes would be nice and we will probably have to look
+at versioning lmdb for that. If the markers are fixed and we store
+individuals as rows that could work.
+
+
### Encoding
LMDB maps octet vector keys to octet vector values. Any data we put into a LMDB database needs to be encoded to octets (effectively aka bytes). genodb supports the following three data types with their respective encodings.