From 2fc757f9730d07198ddcc09d2716212dc98af674 Mon Sep 17 00:00:00 2001
From: Frederick Muriuki Muriithi
Date: Mon, 28 Mar 2022 05:13:51 +0300
Subject: Add notes on traits to the gn-hacking-documentation issue

---
 topics/documentation/gn-hacking-documentation.gmi | 69 +++++++++++++++++------
 1 file changed, 51 insertions(+), 18 deletions(-)

diff --git a/topics/documentation/gn-hacking-documentation.gmi b/topics/documentation/gn-hacking-documentation.gmi
index 60b34a3..b744053 100644
--- a/topics/documentation/gn-hacking-documentation.gmi
+++ b/topics/documentation/gn-hacking-documentation.gmi
@@ -32,6 +32,12 @@ Datasets 'contain' or organise traits. The do not have much in terms of direct o
 
 They can be envisioned as a bag of traits.
 
+Common dataset traits are:
+
+* dataset_name : Name of the dataset
+* dataset_type : Type of dataset. Valid values are 'Temp', 'Publish', 'ProbeSet' and 'Genotype'
+* group_name : ??
+
 ### Traits
 
 A trait is a abstract concept - with the somewhat concrete forms being
@@ -41,21 +47,38 @@ A trait is a abstract concept - with the somewhat concrete forms being
 * Publish
 * Temp
 
-Here, my understanding is spotty.
-
-What are the differences between these?
-
-The genotype traits probably have something to do with actual genes.
-
-What is a ProbeSet Trait?
-
-What is a Publish Trait?
-
-What is a Temp Trait?
-
-The thing that seems common among all trait types is that they have:
-
-* samples/strains - some sort of name e.g. BXD12
+From the GeneNetwork2 repository, specifically the `wqflask.base.trait` module:
+
+```
+... a trait in webqtl, can be either Microarray, Published phenotype, genotype,
+or user input trait
+```
+
+From the `wqflask.base.trait.GeneralTrait` class, the common properties for all the trait types above are:
+
+* dataset : a pointer to the dataset that the trait is a member of
+* trait_name : the name of the trait
+* cellid : ?
+* identification: : ?
+* haveinfo : ?
+* sequence : ?
+* data : ? - In GN2, retrieval of this is indirect, via the dataset but it is a trait property.
+* view : ?
+* locus : ?
+* lrs : Lifetime reproductive success?
+* pValue : ?
+* mean : ?
+* additive : ?
+* num_overlap : ?
+* strand_probe : ?
+* symbol : ?
+* display_name : a name to use in the display of the trait on the UI
+* LRS_score_repr : ?
+
+
+The *data* property of a trait has items with at least the following important properties:
+
+* sample/strain name- some sort of name e.g. BXD12
 * value - a numerical value corresponding to the sample/strain
 * variance - a numerical value corresponding to the sample/strain
 * ndata - a numerical value
@@ -64,12 +87,22 @@ the trait properties above are the ones I have run into that seem to be used in
 
 There are other properties like:
 
-* mb (Megabases?)
-* chr (Chromosome?)
+* mb : Megabases?
+* chr : Chromosome?
+* location : ?
 
 that are used less often.
 
-Each of the different types of the traits then has other properties, that thus far, seem to be used for display purposes only, e.g. "pre_publication_description" in "Publish" traits.
+Some extra properties for 'ProbeSet' traits:
+
+* description : ?
+* probe_target_description : ?
+
+Some extra properties for 'Publish' traits:
+
+* confidential : ?
+* pre_publication_description : ?
+* post_publication_description : ?
 
 When doing computations, it is unnecessary to load the display-only properties of a trait, deferring this to when/if we need to display such to the user/client.
 
-- 
cgit v1.2.3
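
The closing point of the patched document, that display-only properties need not be loaded for computations, could be sketched roughly as below. This is only an illustration under stated assumptions: `SampleDatum`, `LazyTrait`, `display_loader` and `fake_loader` are hypothetical names, not part of GeneNetwork2 or of `wqflask.base.trait.GeneralTrait`, and the trait name and values are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class SampleDatum:
    """One item of a trait's data, using the fields named in the notes."""
    sample_name: str               # e.g. "BXD12"
    value: float
    variance: Optional[float] = None
    ndata: Optional[int] = None


@dataclass
class LazyTrait:
    """Hypothetical trait wrapper; NOT the GN2 GeneralTrait class."""
    trait_name: str
    dataset_type: str              # 'Temp', 'Publish', 'ProbeSet' or 'Genotype'
    data: Dict[str, SampleDatum] = field(default_factory=dict)
    # Hypothetical callback that fetches display-only properties on demand.
    display_loader: Optional[Callable[[str], Dict[str, str]]] = None
    _display: Optional[Dict[str, str]] = field(default=None, repr=False)

    def mean(self) -> float:
        """Computation that needs only the sample values."""
        values = [datum.value for datum in self.data.values()]
        return sum(values) / len(values)

    def display_info(self) -> Dict[str, str]:
        """Load display-only properties the first time they are requested."""
        if self._display is None:
            loader = self.display_loader
            self._display = loader(self.trait_name) if loader else {}
        return self._display


# Made-up example data; the loader records each call so the deferral is visible.
loader_calls = []


def fake_loader(trait_name: str) -> Dict[str, str]:
    loader_calls.append(trait_name)
    return {"pre_publication_description": "placeholder description"}


trait = LazyTrait(
    trait_name="10001",
    dataset_type="Publish",
    data={
        "BXD1": SampleDatum("BXD1", 14.0),
        "BXD12": SampleDatum("BXD12", 10.0),
    },
    display_loader=fake_loader,
)
```

Calling `trait.mean()` touches only the sample data, while the loader runs once, on the first call to `trait.display_info()`, and its result is cached afterwards.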