diff options
Diffstat (limited to 'issues/gemma/gemma2-has-different-output-from-rqtl2.gmi')
-rw-r--r-- | issues/gemma/gemma2-has-different-output-from-rqtl2.gmi | 27 |
1 files changed, 24 insertions, 3 deletions
diff --git a/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi index f379ee5..0f33b8e 100644 --- a/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi +++ b/issues/gemma/gemma2-has-different-output-from-rqtl2.gmi @@ -19,6 +19,8 @@ So I confirm I am getting the same results as Dave in GN for GEMMA. # Tasks +## GeneNetwork + I run GEMMA for precompute on the command line and that I confirmed to be the same as what we see in the browser. This suggests either data or method is different with Dave's approach. @@ -28,7 +30,8 @@ to see that running without LOCO has some impact, but not as bad as the R/qtl2 difference. First we should check the genotype files to see if they match. I checked that the phenotypes match. -Our inputs are different if I count genotypes (first yours, the other on production): +Our inputs are different if I count genotypes (first yours, the other +on production): ``` 1 2184941 B @@ -43,13 +46,31 @@ The number of rows/markers is the same. So we probably added some genometypes, but if we miss one that would matter. Dave you can find the file in /home/wrk/BXD.geno on tux02 if you want to look. -Dave, I notice that you don't use H in the R/qtl2 control file. That +I notice that we don't use H in the R/qtl2 control file. That might make a difference though it probably won't explain what we see now. BTW I also correlated the LOD scores from GEMMA and R/qtl2 in -your spreadsheet and at 0.7 that is too low. So it is probably not +the spreadsheet and at 0.7 that is too low. So it is probably not just a magnitude problem. The results differ a lot in your spreadsheet. Next step is that I need to run R/qtl2 using the script in your dropbox and see what Karl's code does. The exercise does not hurt because it will help us bring R/qtl2 to GN. + +## R/qtl2 + +R/qtl2 is packaged in guix and can be run in a shell with + +``` +guix shell -C r r-qtl2 +> library(qtl2) +> bxd <- read_cross2(file = "bxd_cancer_new_GN_July_2024.json") +Warning messages: +1: In recode_geno(sheet, genotypes) : + 630519 genotypes treated as missing: "H", "U" +2: In matrix(as.numeric(unlist(pheno)), ncol = nc) : + NAs introduced by coercion +3: In check_cross2(output) : Physical map out of order on chr 1, 2, 11, 19 +``` + +The first warning matches above. If data is missing it may be filtered out. We'll have to check for that. The third warning I am not sure about. Probably a ranking of markers. |