summaryrefslogtreecommitdiff
path: root/issues/add-mouse-data-from-klaus.gmi
blob: 0df02f89a35959eb803cfd1fe7bd921f4bb59ccb (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
# Add mouse data-set

## Tags

* assigned: bonfacem
* priority: high
* status: in progress

### Notes: Thu 30 Jun 2022 17:38:59 EAT

Klaus' recently shared with us some mouse data.
Here's a snip of how that looks like:

```
mouse_ID                   BW       day         strain	sex    inf_dose	animal.no.
241	CC001_m_1	100	perc_d00	CC001	m	10 FFU	1
242	CC001_m_1	98.56	perc_d03	CC001	m	10 FFU	1
243	CC001_m_1	NA	perc_d13	CC001	m	10 FFU	1
244	CC001_m_1	NA	perc_d12	CC001	m	10 FFU	1
245	CC001_m_1	NA	perc_d10	CC001	m	10 FFU	1
246	CC001_m_1	100.92	perc_d04	CC001	m	10 FFU	1
247	CC001_m_1	98.08	perc_d01	CC001	m	10 FFU	1
248	CC001_m_1	76.21	perc_d08	CC001	m	10 FFU	1
249	CC001_m_1	93.22	perc_d05	CC001	m	10 FFU	1
250	CC001_m_1	90.42	perc_d06	CC001	m	10 FFU	1
```

I've been working on adding the above to the GN2 database.
The current challenge I have is that this data is Time Series---for the same strain, we have values indexed by time.
Also, we tag data by "animal.no." and "sex".
So for a male version of "CC001" with animal number 1, we have "CC001_m_1".
This is a problem---storing TS data---that Rob/Suheeta have highlighted in the past.
How do we go about doing this?
Currently, in GN2 we store averages of the aforementioned data.
This doesn't work out well for us: we don't have, AFAIU, a concept for "animal.no."
I would suggest we use lmdb to store this data, and work out a way to integrate it with the rest of GN2---so that we display this info on the main page.

### Notes: Thu 30 Jun 2022 21:39:15 EAT

Here's how to extract the data from the provided data-set:

Just extract the data for d1, d2, d3 separately and use each day as a separate data set.

```
> unique(dat2$day)
[1] d0 d1 d2 d3
Levels: d0 d1 d2 d3

> table(dat2$day)
d0 d1 d2 d3 
44 44 44 44

dat10 <- subset(dat2,dat2$day=="d1")
dat10

> dat10
    mouse_ID        BW day
45  BXD 50_3  94.85000  d1
46  BXD 64_1  96.36000  d1
47  BXD 29_1  96.85000  d1
48  BXD 40_3  97.69000  d1
49  BXD 49_2  97.06000  d1
50   BXD 6_5  89.03000  d1
[...]
```
### Notes: Thu 30 Jun 2022 21:42:39 EAT
Some comments from Zach:

```
I think that Klaus is referring to what we store in GN as phenotype traits.
So you'd have a separate trait page for each time series "step".'

He's probably referring to these traits:
Day 1 - https://genenetwork.org/show_trait?trait_id=13005&dataset=BXDPublish
Day 2 - https://genenetwork.org/show_trait?trait_id=13006&dataset=BXDPublish
And continues from there - you can see them with the following search (with
a few other random traits mixed in; I first just searched for "Schughart"
in the global search) -
https://genenetwork.org/gsearch?type=phenotype&terms=H1N1

 You're correct about there not being a (good) way to deal with something
like animal number currently. The way we deal with something like that is
to create a new group, with the "strain list" being a list of individuals.
```