summaryrefslogtreecommitdiff
path: root/issues/gemma
diff options
context:
space:
mode:
Diffstat (limited to 'issues/gemma')
-rw-r--r--issues/gemma/gemma2-has-different-output-from-rqtl2.gmi27
1 files changed, 24 insertions, 3 deletions
diff --git a/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi
index f379ee5..0f33b8e 100644
--- a/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi
+++ b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi
@@ -19,6 +19,8 @@ So I confirm I am getting the same results as Dave in GN for GEMMA.
# Tasks
+## GeneNetwork
+
I run GEMMA for precompute on the command line and that I confirmed to
be the same as what we see in the browser. This suggests either data
or method is different with Dave's approach.
@@ -28,7 +30,8 @@ to see that running without LOCO has some impact, but not as bad as
the R/qtl2 difference. First we should check the genotype files to see
if they match. I checked that the phenotypes match.
-Our inputs are different if I count genotypes (first yours, the other on production):
+Our inputs are different if I count genotypes (first yours, the other
+on production):
```
1 2184941 B
@@ -43,13 +46,31 @@ The number of rows/markers is the same. So we probably added some
genometypes, but if we miss one that would matter. Dave you can find
the file in /home/wrk/BXD.geno on tux02 if you want to look.
-Dave, I notice that you don't use H in the R/qtl2 control file. That
+I notice that we don't use H in the R/qtl2 control file. That
might make a difference though it probably won't explain what we see
now. BTW I also correlated the LOD scores from GEMMA and R/qtl2 in
-your spreadsheet and at 0.7 that is too low. So it is probably not
+the spreadsheet and at 0.7 that is too low. So it is probably not
just a magnitude problem. The results differ a lot in your
spreadsheet.
Next step is that I need to run R/qtl2 using the script in your
dropbox and see what Karl's code does. The exercise does not hurt
because it will help us bring R/qtl2 to GN.
+
+## R/qtl2
+
+R/qtl2 is packaged in guix and can be run in a shell with
+
+```
+guix shell -C r r-qtl2
+> library(qtl2)
+> bxd <- read_cross2(file = "bxd_cancer_new_GN_July_2024.json")
+Warning messages:
+1: In recode_geno(sheet, genotypes) :
+ 630519 genotypes treated as missing: "H", "U"
+2: In matrix(as.numeric(unlist(pheno)), ncol = nc) :
+ NAs introduced by coercion
+3: In check_cross2(output) : Physical map out of order on chr 1, 2, 11, 19
+```
+
+The first warning matches above. If data is missing it may be filtered out. We'll have to check for that. The third warning I am not sure about. Probably a ranking of markers.