gn-gemtext - GeneNetwork gemtext knowledge base and issue tracker (mirror right now)

diff options


context:
space:
mode:

Diffstat (limited to 'issues/gemma/gemma2-has-different-output-from-rqtl2.gmi')

-rw-r--r--

issues/gemma/gemma2-has-different-output-from-rqtl2.gmi

1 files changed, 80 insertions, 0 deletions

diff --git a/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi
new file mode 100644
index 0000000..a0b2c5c
--- /dev/null
+++ b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi

@@ -0,0 +1,80 @@

+# GEMMA output differs from R/qtl2

+# Tags

+* assigned: pjotrp, davea

+* priority: high

+* type: bug, enhancement

+* status: closed

+* keywords: database, gemma, reaper, rqtl2

+# Description

+When running trait BXD_21526 results differ significantly.

+=> https://genenetwork.org/show_trait?trait_id=21526&dataset=BXDPublish

+=> https://genenetwork.org/show_trait?trait_id=21529&dataset=BXDPublish

+So I confirm I am getting the same results as Dave in GN for GEMMA (see Conclusion below).

+# Tasks

+## GeneNetwork

+I run GEMMA for precompute on the command line and that I confirmed to

+be the same as what we see in the browser. This suggests either data

+or method is different with Dave's approach.

+I confirmed that gemma in GN matches Dave's results. It is interesting

+to see that running without LOCO has some impact, but not as bad as

+the R/qtl2 difference. First we should check the genotype files to see

+if they match. I checked that the phenotypes match.

+Our inputs are different if I count genotypes (first yours, the other

+on production):

+```

+ 1 2184941 B

+ 2 2132744 D

+ 3 628980 H

+ 1 2195662 B

+ 2 2142959 D

+ 3 650168 H

+```

+The number of rows/markers is the same. So we probably added some

+genometypes, but if we miss one that would matter. Dave you can find

+the file in /home/wrk/BXD.geno on tux02 if you want to look.

+I notice that we don't use H in the R/qtl2 control file. That

+might make a difference though it probably won't explain what we see

+now. BTW I also correlated the LOD scores from GEMMA and R/qtl2 in

+the spreadsheet and at 0.7 that is too low. So it is probably not

+just a magnitude problem. The results differ a lot in your

+spreadsheet.

+Next step is that I need to run R/qtl2 using the script in your

+dropbox and see what Karl's code does. The exercise does not hurt

+because it will help us bring R/qtl2 to GN.

+## R/qtl2

+R/qtl2 is packaged in guix and can be run in a shell with

+```

+guix shell -C r r-qtl2

+> library(qtl2)

+> bxd <- read_cross2(file = "bxd_cancer_new_GN_July_2024.json")

+Warning messages:

+1: In recode_geno(sheet, genotypes) :

+ 630519 genotypes treated as missing: "H", "U"

+2: In matrix(as.numeric(unlist(pheno)), ncol = nc) :

+ NAs introduced by coercion

+3: In check_cross2(output) : Physical map out of order on chr 1, 2, 11, 19

+```

+The first warning matches above. If data is missing it may be filtered out. We'll have to check for that. The third warning I am not sure about. Probably a ranking of markers.

+# Conclusion

+It turned out that R/qtl was running HK - so it was a QTL mapping rather than an LMM.