Diffstat (limited to 'api/questions-to-ask-GN.md')
-rw-r--r-- api/questions-to-ask-GN.md | 287
1 files changed, 287 insertions, 0 deletions
diff --git a/api/questions-to-ask-GN.md b/api/questions-to-ask-GN.md
new file mode 100644
index 0000000..3ad6c09
--- /dev/null
+++ b/api/questions-to-ask-GN.md
@@ -0,0 +1,287 @@
+# Questions to ask GN
+
+We are asking GN users to list here the questions that the GN APIs should be able to answer. For each question we will add information on the current and (proposed) API endpoints.
+
+The GN API is used by the gnapi R package:
+
+https://github.com/kbroman/gnapi
+
+
+
+## Is it live?
+
+```
+curl https://genenetwork.org/api/v_pre1/
+{"hello":"world"}
+```
+
+## Return species
+
+* Current:
+
+```
+curl -s https://genenetwork.org/api/v_pre1/species|jq '.[0:2]'
+[
+ {
+ "FullName": "Mus musculus",
+ "Id": 1,
+ "Name": "mouse",
+ "TaxonomyId": 10090
+ },
+ {
+ "FullName": "Rattus norvegicus",
+ "Id": 2,
+ "Name": "rat",
+ "TaxonomyId": 10116
+ }
+]
+```
+
+## Return available groups/populations
+
+* Current: https://genenetwork.org/api/v_pre1/groups/mouse
+* Proposed: https://genenetwork.org/api/v_pre1/mouse/groups (?)
+
+```sh
+curl -s https://genenetwork.org/api/v_pre1/groups/mouse | jq '.[8:10]' -M
+```
+
+```json
+[
+ {
+ "DisplayName": "B6D2F2 OHSU Striatum",
+ "FullName": "B6D2F2 OHSU Striatum",
+ "GeneticType": "intercross",
+ "Id": 12,
+ "MappingMethodId": "1",
+ "Name": "BDF2-2005",
+ "SpeciesId": 1,
+ "public": 2
+ },
+ {
+ "DisplayName": "Mouse Diversity Panel",
+ "FullName": "Mouse Diversity Panel",
+ "GeneticType": "None",
+ "Id": 15,
+ "MappingMethodId": "1",
+ "Name": "MDP",
+ "SpeciesId": 1,
+ "public": 2
+ }
+]
+```
+
+## Return cross info
+
+This endpoint currently has a bug. It is supposed to return something like
+
+```
+{"species_id":1,"species":"mouse","mapping_method_id":1,"group_id":1,"group":"BXD","genetic_type":"riset","chr_info":[["1",197195432],["2",181748087],["3",159599783],["4",155630120],["5",152537259],["6",149517037],["7",152524553],["8",131738871],["9",124076172],["10",129993255],["11",121843856],["12",121257530],["13",120284312],["14",125194864],["15",103494974],["16",98319150],["17",95272651],["18",90772031],["19",61342430],["X",166650296]]}
+```
+
+Instead, it currently fails with the following error:
+
+```
+curl -s https://genenetwork.org/api/v_pre1/group/BXD
+File "/home/gn2/production/gene/wqflask/wqflask/api/router.py", line 157, in get_group_info
+ group = results.fetchone()
+AttributeError: 'tuple' object has no attribute 'fetchone'
+```
+
+## Return Genotypes
+
+Return all genotypes for a specific population.
+
+* Current: https://genenetwork.org/api/v_pre1/genotypes/HSNIH-Palmer_true.geno
+
+Without _true in the file name we get a different, incomplete file.
+
+The API code is at
+https://github.com/genenetwork/genenetwork2/blob/8bfb79da9b8dc0591532939dca97e0fa9c06c5d2/wqflask/wqflask/api/router.py#L803
+
+As the code shows, the endpoint simply returns a file: there is a geno file named HSNIH-Palmer_true.geno, and that is what it returns.
+
+According to the code above, we can get .geno, .csv, .rqtl2, .bimbam, etc., as long as the file exists.
+
+Standardizing the genotype data format would be helpful. Alternatively, a query could tell the user which genotype formats are available.
+
+* Proposed: return the available files
+
+## Return trait metadata
+
+Return trait metadata such as probeset info or other "trait covariates" for the high-dimensional traits. This also covers information on how the data was collected or processed.
+
+* Current: through SPARQL
+* Proposed: https://genenetwork.org/api/v_pre1/mouse/bxd/HC_M2_0606_P/1436869_at
+
+## List datasets
+
+```
+curl -s https://genenetwork.org/api/v_pre1/datasets/BXD|jq ".[0:2]"
+[
+ {
+ "AvgID": 1,
+ "CreateTime": "Fri, 01 Aug 2003 00:00:00 GMT",
+ "DataScale": "log2",
+ "FullName": "Brain U74Av2 08/03 MAS5",
+ "Id": 1,
+ "Long_Abbreviation": "BXDMicroArray_ProbeSet_August03",
+ "ProbeFreezeId": 337,
+ "ShortName": "Brain U74Av2 08/03 MAS5",
+ "Short_Abbreviation": "Br_U_0803_M",
+ "confidentiality": 0,
+ "public": 0
+ },
+ {
+ "AvgID": 1,
+ "CreateTime": "Sun, 01 Jun 2003 00:00:00 GMT",
+ "DataScale": "log2",
+ "FullName": "UTHSC Brain mRNA U74Av2 (Jun03) MAS5",
+ "Id": 2,
+ "Long_Abbreviation": "BXDMicroArray_ProbeSet_June03",
+ "ProbeFreezeId": 10,
+ "ShortName": "Brain U74Av2 06/03 MAS5",
+ "Short_Abbreviation": "Br_U_0603_M",
+ "confidentiality": 0,
+ "public": 0
+ }
+]
+```
+
+## Get information on a dataset
+
+Use the dataset name:
+
+```
+curl -s https://genenetwork.org/api/v_pre1/dataset/KIN_YSM_HIP_0711.json|jq
+{
+ "confidential": 0,
+ "data_scale": "log2",
+ "dataset_type": "mRNA expression",
+ "full_name": "Human Hippocampus Affy Hu-Exon 1.0 ST (Jul11) Quantile",
+ "id": 337,
+ "name": "KIN_YSM_HIP_0711",
+ "public": 1,
+ "short_name": "KIN/YSM Human HIP Affy Hu-Exon 1.0 ST (Jul11) Quantile",
+ "tissue": "Hippocampus mRNA",
+ "tissue_id": 9
+}
+```
+
+Or use the ProbeFreezeId from the dataset listing above (is this correct?):
+
+```
+curl -s https://genenetwork.org/api/v_pre1/dataset/10.json|jq
+{
+ "confidential": 0,
+ "data_scale": "log2",
+ "dataset_type": "mRNA expression",
+ "full_name": "Eye M430v2 No Mutant/Mutant (Aug12) RMA",
+ "id": 10,
+ "name": "gn10",
+ "public": 1,
+ "short_name": "Eye M430v2 No Mutant/Mutant (Aug12) RMA",
+ "tissue": "Eye mRNA",
+ "tissue_id": 10
+}
+```
+
+## Return sample data
+
+Return all traits in a dataset.
+
+```
+curl -s https://genenetwork.org/api/v_pre1/traits/HC_U_0304_R.json|jq ".[0:2]"
+[
+ {
+ "Additive": 0.0803547619047631,
+ "Aliases": "T3g; Ctg3; Ctg-3",
+ "Chr": "9",
+ "Description": "CD3d antigen, gamma polypeptide",
+ "Id": 1,
+ "LRS": 12.2805314427567,
+ "Locus": "rsm10000021399",
+ "Mb": 44.970689,
+ "Mean": 8.14033666666667,
+ "Name": "100001_at",
+ "P-Value": 0.118,
+ "SE": 0.023595817125580502,
+ "Symbol": "Cd3g"
+ },
+ {
+ "Additive": 0.0317847222222219,
+ "Aliases": "Intin3; Itih-3; AW108094",
+ "Chr": "14",
+ "Description": "inter-alpha trypsin inhibitor, heavy chain 3",
+ "Id": 2,
+ "LRS": 8.37046436677732,
+ "Locus": "rsm10000013342",
+ "Mb": 30.908741,
+ "Mean": 7.82323333333333,
+ "Name": "100002_at",
+ "P-Value": 0.561,
+ "SE": 0.011720083297057399,
+ "Symbol": "Itih3"
+ }
+]
+```
+
+Return a single trait by probe name:
+
+```
+curl -s https://genenetwork.org/api/v_pre1/trait/HC_U_0304_R/104617_at.json|jq
+{
+ "additive": -0.0515941964285714,
+ "alias": "AI182092; 0610005C13Rik; 0610005C13Rik-205",
+ "chr": "7",
+ "description": "RIKEN cDNA 0610005C13 protein (high kidney and liver expression)_",
+ "id": 3690,
+ "locus": "rsm10000026692",
+ "lrs": 11.3682286632142,
+ "mb": 45.568173,
+ "mean": 8.165623333333329,
+ "name": "104617_at",
+ "p_value": 0.666,
+ "se": 0.0170213555407089,
+ "symbol": "0610005C13Rik"
+}
+```
+
+## Return QTL
+
+Return the QTL (one or more) for a trait.
+
+* Proposed: https://genenetwork.org/api/v_pre1/mouse/bxd/HC_M2_0606_P/1436869_at/qtl
+
+There is also the question of how to support more complex queries, such as queries with covariates.
+
+## Mapping
+
+Return mapping results through the API.
+
+```
+curl -s "https://genenetwork.org/api/v_pre1/mapping?trait_id=10015&db=BXDPublish&method=rqtl&limit_to=10"|jq ".[0:3]"
+[
+ {
+ "Mb": 3.010274,
+ "cM": 3.010274,
+ "chr": 1,
+ "lod_score": 0.116927114593807,
+ "name": "rs31443144"
+ },
+ {
+ "Mb": 3.492195,
+ "cM": 3.492195,
+ "chr": 1,
+ "lod_score": 0.117404479202946,
+ "name": "rs6269442"
+ },
+ {
+ "Mb": 3.511204,
+ "cM": 3.511204,
+ "chr": 1,
+ "lod_score": 0.11742354952122,
+ "name": "rs32285189"
+ }
+]
+```