# Questions to ask GN We are asking GN users to list questions here that should return information through the GN APIs. We will add information on (proposed) API endpoints. The GN API is used by the gnapi R package https://github.com/kbroman/gnapi ## Is it live? ``` curl https://genenetwork.org/api/v_pre1/ {"hello":"world"} ``` ## Return species * Current: ``` curl -s https://genenetwork.org/api/v_pre1/species|jq '.[0:2]' [ { "FullName": "Mus musculus", "Id": 1, "Name": "mouse", "TaxonomyId": 10090 }, { "FullName": "Rattus norvegicus", "Id": 2, "Name": "rat", "TaxonomyId": 10116 } ] ``` ## Return available groups/populations * Current: https://genenetwork.org/api/v_pre1/groups/mouse * Proposed: https://genenetwork.org/api/v_pre1/mouse/groups (?) ```sh curl -s http://genenetwork.org/api/v_pre1/groups/mouse |jq '.[8:10]' -M ``` ```js [ { "DisplayName": "B6D2F2 OHSU Striatum", "FullName": "B6D2F2 OHSU Striatum", "GeneticType": "intercross", "Id": 12, "MappingMethodId": "1", "Name": "BDF2-2005", "SpeciesId": 1, "public": 2 }, { "DisplayName": "Mouse Diversity Panel", "FullName": "Mouse Diversity Panel", "GeneticType": "None", "Id": 15, "MappingMethodId": "1", "Name": "MDP", "SpeciesId": 1, "public": 2 } ] ``` ## Return cross info There is a bug in this one. It is supposed to return something like ``` {"species_id":1,"species":"mouse","mapping_method_id":1,"group_id":1,"group":"BXD","genetic_type":"riset","chr_info":[["1",197195432],["2",181748087],["3",159599783],["4",155630120],["5",152537259],["6",149517037],["7",152524553],["8",131738871],["9",124076172],["10",129993255],["11",121843856],["12",121257530],["13",120284312],["14",125194864],["15",103494974],["16",98319150],["17",95272651],["18",90772031],["19",61342430],["X",166650296]]} ``` Errors ``` curl -s https://genenetwork.org/api/v_pre1/group/BXD File "/home/gn2/production/gene/wqflask/wqflask/api/router.py", line 157, in get_group_info group = results.fetchone() AttributeError: 'tuple' object has no attribute 'fetchone' ``` ## Return Genotypes Return all genotypes for a specific population. * Current https://genenetwork.org/api/v_pre1/genotypes/HSNIH-Palmer_true.geno without _true we get a different and incomplete file. The API code is at https://github.com/genenetwork/genenetwork2/blob/8bfb79da9b8dc0591532939dca97e0fa9c06c5d2/wqflask/wqflask/api/router.py#L803 You can see it simply returns a file - so we have a geno file named HSNIH-Palmer_true.geno and that is what it returns. According to above code we can get .geno, .csv, .rqtl2, .bimbam, etc. as long as the file exists. Standardization of genotype data format would be helpful. Alternatively, a query that tells the user what genotype formats are available. * Proposed: return available files ## Return trait metadata Return trait metadata such as probeset info or other "trait covariates" for the high-dimensional traits. There is also the information on how the data was collected or processed. * Current: through SPARQL * Proposed: https://genenetwork.org/api/v_pre1/mouse/bxd/HC_M2_0606_P/1436869_at ## List datasets ``` curl -s https://genenetwork.org/api/v_pre1/datasets/BXD|jq ".[0:2]" [ { "AvgID": 1, "CreateTime": "Fri, 01 Aug 2003 00:00:00 GMT", "DataScale": "log2", "FullName": "Brain U74Av2 08/03 MAS5", "Id": 1, "Long_Abbreviation": "BXDMicroArray_ProbeSet_August03", "ProbeFreezeId": 337, "ShortName": "Brain U74Av2 08/03 MAS5", "Short_Abbreviation": "Br_U_0803_M", "confidentiality": 0, "public": 0 }, { "AvgID": 1, "CreateTime": "Sun, 01 Jun 2003 00:00:00 GMT", "DataScale": "log2", "FullName": "UTHSC Brain mRNA U74Av2 (Jun03) MAS5", "Id": 2, "Long_Abbreviation": "BXDMicroArray_ProbeSet_June03", "ProbeFreezeId": 10, "ShortName": "Brain U74Av2 06/03 MAS5", "Short_Abbreviation": "Br_U_0603_M", "confidentiality": 0, "public": 0 } ] ``` ## Get information on a dataset Use the dataset name ``` curl -s https://genenetwork.org/api/v_pre1/dataset/KIN_YSM_HIP_0711.json|jq { "confidential": 0, "data_scale": "log2", "dataset_type": "mRNA expression", "full_name": "Human Hippocampus Affy Hu-Exon 1.0 ST (Jul11) Quantile", "id": 337, "name": "KIN_YSM_HIP_0711", "public": 1, "short_name": "KIN/YSM Human HIP Affy Hu-Exon 1.0 ST (Jul11) Quantile", "tissue": "Hippocampus mRNA", "tissue_id": 9 } ``` Use the ProbeFreezeId above (correct?) ``` curl -s https://genenetwork.org/api/v_pre1/dataset/10.json|jq { "confidential": 0, "data_scale": "log2", "dataset_type": "mRNA expression", "full_name": "Eye M430v2 No Mutant/Mutant (Aug12) RMA", "id": 10, "name": "gn10", "public": 1, "short_name": "Eye M430v2 No Mutant/Mutant (Aug12) RMA", "tissue": "Eye mRNA", "tissue_id": 10 } ``` ## Return sample data Return all traits in a dataset. ``` curl -s https://genenetwork.org/api/v_pre1/traits/HC_U_0304_R.json|jq ".[0:2]" [ { "Additive": 0.0803547619047631, "Aliases": "T3g; Ctg3; Ctg-3", "Chr": "9", "Description": "CD3d antigen, gamma polypeptide", "Id": 1, "LRS": 12.2805314427567, "Locus": "rsm10000021399", "Mb": 44.970689, "Mean": 8.14033666666667, "Name": "100001_at", "P-Value": 0.118, "SE": 0.023595817125580502, "Symbol": "Cd3g" }, { "Additive": 0.0317847222222219, "Aliases": "Intin3; Itih-3; AW108094", "Chr": "14", "Description": "inter-alpha trypsin inhibitor, heavy chain 3", "Id": 2, "LRS": 8.37046436677732, "Locus": "rsm10000013342", "Mb": 30.908741, "Mean": 7.82323333333333, "Name": "100002_at", "P-Value": 0.561, "SE": 0.011720083297057399, "Symbol": "Itih3" } ] ``` Return trait by probe ``` curl -s https://genenetwork.org/api/v_pre1/trait//HC_U_0304_R/104617_at.json|jq { "additive": -0.0515941964285714, "alias": "AI182092; 0610005C13Rik; 0610005C13Rik-205", "chr": "7", "description": "RIKEN cDNA 0610005C13 protein (high kidney and liver expression)_", "id": 3690, "locus": "rsm10000026692", "lrs": 11.3682286632142, "mb": 45.568173, "mean": 8.165623333333329, "name": "104617_at", "p_value": 0.666, "se": 0.0170213555407089, "symbol": "0610005C13Rik" } ``` ## Return QTL Return the QTL (one or more) for a trait. * Proposed: http://genenetwork.org/api/v_pre1/mouse/bxd/HC_M2_0606_P/1436869_at/qtl There is also the question of more complex queries, such as with covariates. ## Mapping Return mapping results through the API. ``` curl -s "https://genenetwork.org/api/v_pre1/mapping?trait_id=10015&db=BXDPublish&method=rqtl&limit_to=10"|jq ".[0:3]" [ { "Mb": 3.010274, "cM": 3.010274, "chr": 1, "lod_score": 0.116927114593807, "name": "rs31443144" }, { "Mb": 3.492195, "cM": 3.492195, "chr": 1, "lod_score": 0.117404479202946, "name": "rs6269442" }, { "Mb": 3.511204, "cM": 3.511204, "chr": 1, "lod_score": 0.11742354952122, "name": "rs32285189" } ] ```