From d9e47731f8a1616b06fdc1ef2dbd3cf50413706d Mon Sep 17 00:00:00 2001
From: Pjotr Prins
Date: Sun, 10 Dec 2023 08:12:19 -0600
Subject: Precompute notes on NAs

---
 .../mariadb/precompute-mapping-input-data.gmi      | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/topics/systems/mariadb/precompute-mapping-input-data.gmi b/topics/systems/mariadb/precompute-mapping-input-data.gmi
index 12c21da..89bd8be 100644
--- a/topics/systems/mariadb/precompute-mapping-input-data.gmi
+++ b/topics/systems/mariadb/precompute-mapping-input-data.gmi
@@ -979,6 +979,28 @@ Genotype state lives in 4 places. Time to create a 5th one with lmdb ;). At leas
 
 Using this information we created our first phenotype file and GEMMA run!
 
+# Notes
+
+## NAs in GN
+
+A note from Zach:
+
+On Sat, Dec 09, 2023 at 06:09:56PM -0600, Zachary Sloan wrote:
+>    (After typing the rest of this out, I realized that part of the
+>    confusion might be about how locations are stored. We don't actually
+>    database locations in the ProbeSetXRef table - we only database the
+>    peak Locus marker name. This is then cross-referenced against the Geno
+>    table, where the actual Location is stored. This is the main source of
+>    the problem. So I think the best short-term solution might be to just
+>    directly database the locations in the ProbeSetXRef table. Those
+>    locations might become out of date, but as you mention they'd still
+>    probably be in the same ballpark.)
+
+It is logical to store the location with the peak - if the location
+changes we should recompute. That also suggests we should track the
+version of the genotypes in that table.
+
+
 ## More complicated datasets
 
 A good dataset to take apart is
-- 
cgit v1.2.3