author | Pjotr Prins | 2023-12-17 09:18:10 -0600 |
---|---|---|
committer | Pjotr Prins | 2023-12-17 09:18:10 -0600 |
commit | 8956139bbc8bd6d64b345bfb272c15f73d5b7137 (patch) | |
tree | 53d8363df9659ea71feaba0168c7b3b57994ac47 /topics/systems/mariadb | |
parent | 498da8b8b1efab85cbab94a3148a900928a0637c (diff) | |
precompute
Diffstat (limited to 'topics/systems/mariadb')
-rw-r--r-- | topics/systems/mariadb/precompute-mapping-input-data.gmi | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/topics/systems/mariadb/precompute-mapping-input-data.gmi b/topics/systems/mariadb/precompute-mapping-input-data.gmi
index 89bd8be..6962eb0 100644
--- a/topics/systems/mariadb/precompute-mapping-input-data.gmi
+++ b/topics/systems/mariadb/precompute-mapping-input-data.gmi
@@ -979,6 +979,16 @@ Genotype state lives in 4 places. Time to create a 5th one with lmdb ;). At leas
 
 Using this information we created our first phenotype file and GEMMA run!
 
+# Storing output
+
+To kick off precompute we added new nodes to the Octopus cluster, doubling its capacity. In the next step we have to compress the output of GEMMA so we can keep it forever. For this we want to have the peaks (obviously), but we also want to retain the 'shape' of the distribution - i.e., the QTL with sign. This shape we can use for correlations and potentially some AI-style mining. This is the way it is presented in AraQTL.
+
+For the sign we can use the SNP additive effect estimate. The se of Beta is a function of MAF of the SNP. So if you want to present Beta as the SNP additive effect for a standardized genotype, then you want to use Beta/se; otherwise, Beta is the SNP additive effect for the original, unstandardized genotype. The Beta is obtained by controlling for population structure. For effect sign we need to check the incoming genotypes because they may have been switched.
+
+Anyway, we can consider compressing the shape the way a CDROM is compressed.
+
+
+
 # Notes
 
 ## NAs in GN
@@ -1008,3 +1018,9 @@ A good dataset to take apart is
 => http://genenetwork.org/show_trait?trait_id=1436869_at&dataset=HC_M2_0606_P
 
 because it has 71 BXD samples and 32 other samples.
+
+## Variations
+
+This paper discusses a number of approaches that may be interesting:
+
+=> https://biodatamining.biomedcentral.com/articles/10.1186/s13040-023-00331-3 Automated quantitative trait locus analysis (AutoQTL)
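The "Storing output" note above proposes Beta/se as the signed effect for a standardized genotype. A minimal sketch of that calculation follows. It assumes the GEMMA association output is tab-separated with `rs`, `beta`, `se` and `p_wald` columns. The function name `signed_scores`, the `flipped` set of SNP ids whose alleles were switched, and the sample rows are hypothetical and only serve the illustration.

```python
import csv
import io
import math

# Made-up rows in the assumed GEMMA assoc layout (tab-separated).
SAMPLE = (
    "chr\trs\tps\tn_miss\tallele1\tallele0\taf\tbeta\tse\tlogl_H1\tl_remle\tp_wald\n"
    "1\trs1\t100\t0\tA\tG\t0.5\t0.4\t0.1\t-10.0\t1.0\t0.001\n"
    "1\trs2\t200\t0\tC\tT\t0.3\t-0.3\t0.15\t-11.0\t1.0\t0.1\n"
)

def signed_scores(assoc_text, flipped=frozenset()):
    """Return (rs, beta/se, signed -log10(p)) for every SNP row.

    `flipped` is a hypothetical set of SNP ids whose alleles were
    switched in the incoming genotypes; their sign is inverted.
    """
    out = []
    for row in csv.DictReader(io.StringIO(assoc_text), delimiter="\t"):
        beta = float(row["beta"])
        se = float(row["se"])
        # Clamp p away from zero so log10 is always defined.
        p = max(float(row["p_wald"]), 1e-300)
        flip = -1.0 if row["rs"] in flipped else 1.0
        z = flip * beta / se  # effect for a standardized genotype
        # Peak height from the p-value, direction from the effect sign.
        signed_logp = math.copysign(-math.log10(p), z)
        out.append((row["rs"], z, signed_logp))
    return out

scores = signed_scores(SAMPLE)
```

Keeping both beta/se and the signed -log10(p) per SNP would retain the peak height as well as the direction of the effect. Which of the two is the better basis for the stored 'shape' is left open here.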